MapReduce error when selecting column from JSON file in Cosmos

生来就可爱ヽ(ⅴ<●) · submitted 2019-12-24 00:54:30

Question


The problem is the following:

After creating a table with Cygnus 0.2.1, I get a MapReduce error when trying to select a column from Hive. Looking at the files Cygnus creates in Hadoop, I can see that they are in JSON format. This problem did not appear in previous versions of Cygnus, which created the Hadoop files in CSV format.

In order to test it, I have left two tables, one created from each format. You can compare and reproduce the error with the following queries:

SELECT entitytype FROM fiware_ports_meteo; (it fails, created with 0.2.1 in JSON format)
SELECT entitytype FROM fiware_test_table; (it works, created with 0.2 in CSV format)

The paths to the HDFS files are, respectively:

/user/fiware/ports/meteo
/user/fiware/testTable/

I suspect the error comes from parsing the JSON file by the MapReduce job since the CSV format works as expected.

How can this issue be avoided?


Answer 1:


You simply have to add the JSON SerDe to the Hive classpath. As a non-privileged user, you can do that from the Hive CLI:

hive> ADD JAR /usr/local/hive-0.9.0-shark-0.8.0-bin/lib/json-serde-1.1.9.3-SNAPSHOT.jar;
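Note that `ADD JAR` is scoped to the current Hive session, so it must be re-run every time a new CLI session or client connection is opened. A sketch of a full session, assuming the jar path above:

```sql
-- Register the JSON SerDe for this session (per-session setting;
-- repeat it in every new CLI session or JDBC connection)
ADD JAR /usr/local/hive-0.9.0-shark-0.8.0-bin/lib/json-serde-1.1.9.3-SNAPSHOT.jar;

-- The previously failing query over the JSON-backed table should now run
SELECT entitytype FROM fiware_ports_meteo;
```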

If you have developed a remote Hive client, you can issue the same statement just like any other query. For example, in Java:

// "con" is an already-established JDBC connection to Hive
Statement stmt = con.createStatement();
stmt.executeQuery("ADD JAR /usr/local/hive-0.9.0-shark-0.8.0-bin/lib/json-serde-1.1.9.3-SNAPSHOT.jar");
stmt.close();


Source: https://stackoverflow.com/questions/25024342/mapreduce-error-when-selecting-column-from-json-file-in-cosmos
