I have saved a remote DB table in Hive using the saveAsTable
method. Now, when I try to access the Hive table data using the CLI command select * from table_name>
Another way to catch a possible discrepancy is to eyeball the difference between the schemas of the Parquet files produced by the two sources, say Hive and Spark. You can dump a file's schema with parquet-tools (brew install parquet-tools
on macOS):
λ $ parquet-tools schema /usr/local/Cellar/apache-drill/1.16.0/libexec/sample-data/nation.parquet
message root {
  required int64 N_NATIONKEY;
  required binary N_NAME (UTF8);
  required int64 N_REGIONKEY;
  required binary N_COMMENT (UTF8);
}
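
If eyeballing two dumps gets tedious, the comparison can be scripted. The following is only a minimal sketch, assuming you have already saved the text output of parquet-tools schema for a file from each source. The helper names parse_schema and diff_schemas are hypothetical, the two sample dumps are made up for illustration, and the parser only handles flat schemas with one primitive field per line, as in the output above.

```python
import re

# Matches one primitive field line of a `parquet-tools schema` dump, e.g.
#   required binary N_NAME (UTF8);
# Nested groups ("optional group foo {") are deliberately not handled.
FIELD_RE = re.compile(
    r"^\s*(required|optional|repeated)\s+(\S+)\s+(\w+)\s*(?:\((.*)\))?\s*;\s*$"
)


def parse_schema(dump: str) -> dict:
    """Map column name -> (repetition, physical type, annotation or None)."""
    fields = {}
    for line in dump.splitlines():
        match = FIELD_RE.match(line)
        if match:
            repetition, physical_type, name, annotation = match.groups()
            fields[name] = (repetition, physical_type, annotation)
    return fields


def diff_schemas(left: str, right: str) -> dict:
    """Return only the columns whose definitions differ.

    Each value is a (left_definition, right_definition) pair; a column
    that is missing on one side shows up as None on that side.
    """
    a, b = parse_schema(left), parse_schema(right)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Made-up dumps for illustration only, not real Hive or Spark output.
    hive_dump = """message hive_schema {
      optional int64 n_nationkey;
      optional binary n_name (UTF8);
    }"""
    spark_dump = """message spark_schema {
      optional int64 n_nationkey;
      optional binary n_name;
    }"""
    for column, (left, right) in diff_schemas(hive_dump, spark_dump).items():
        print(column, left, right)
```

Columns that are identical on both sides are omitted, so an empty result means the two dumps agree on every field the parser recognized.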