I'm having trouble reading partitioned Parquet files, generated by Spark, in Hive. I'm able to create the external table in Hive, but when I try to select a few rows, Hive
I finally found the problem. When you create a table in Hive over partitioned data that already exists in S3 or HDFS, Hive does not pick up the partitions automatically: you need to run a command that updates the Hive Metastore with the table's partition structure. Take a look here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE)
The command is:
MSCK REPAIR TABLE table_name;
On Hive running in Amazon EMR, you can also use:
ALTER TABLE table_name RECOVER PARTITIONS;
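For context, here is a minimal sketch of the whole sequence. The table name `events`, its columns, the partition column `dt`, and the location `s3://my-bucket/events/` are all hypothetical; it assumes the data was written by Spark with `partitionBy`, so the location contains subdirectories named like `dt=2020-01-01/`.

```sql
-- Hypothetical table, columns, and location; adjust them to your data.
CREATE EXTERNAL TABLE events (
  id BIGINT,
  payload STRING
)
PARTITIONED BY (dt STRING)
STORED AS PARQUET
LOCATION 's3://my-bucket/events/';

-- Right after creation the metastore has no partitions registered,
-- so this list is empty and queries typically return no rows.
SHOW PARTITIONS events;

-- Scan the table location and register the dt=... directories found there.
MSCK REPAIR TABLE events;

-- The partitions should now be listed and queryable.
SHOW PARTITIONS events;
SELECT * FROM events LIMIT 10;
```

Running `SHOW PARTITIONS` before and after the repair is a quick way to confirm that the metastore was actually updated. The repair has to be run again whenever Spark writes new partition directories.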