Read a JSON file with 12 nested levels into Hive in Azure HDInsight

Question

I tried to create a schema for the JSON file manually and then create a Hive table from it, but I get the error "column type name length 10888 exceeds max allowed length 2000".

I am guessing I have to change a metastore setting, but I am not sure where that configuration is located in Azure HDInsight.

The other approach I tried was to take the schema from a Spark DataFrame and create the table from a temporary view, but I still get the same error.

These are the steps I tried in Spark:

// Read each file as one string so that multi-line JSON documents stay intact
val tne1 = sc.wholeTextFiles("wasb:path").map(x => x._2)
// A HiveContext is needed to create Hive tables; the duplicate SQLContext definition was dropped
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
// Infer the nested schema from the JSON strings
val tne2 = sqlContext.read.json(tne1)
tne2.createOrReplaceTempView("my_temp_table")
sqlContext.sql(
  """create table s
    |ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    |WITH SERDEPROPERTIES ('hive.serialization.extend.nesting.levels'='true')
    |as select * from my_temp_table""".stripMargin)

I get the error at this step:

org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: InvalidObjectException(message:Invalid column type name length 5448 exceeds max allowed length 2000, type struct

When I try to persist or create the RDD I can see the schema, but only in a formatted view. If I could get the full, unformatted schema, I might be able to extract it.
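For reference, a minimal sketch of how the full type strings could be inspected from Scala. It assumes Spark 2.x, where, to my knowledge, `DataType.catalogString` returns the complete one-line type string (unlike `printSchema()`, which prints a formatted tree). The helper `oversizedColumns` and the sample data are hypothetical and not part of the original post; the default limit of 2000 is taken from the error message.

```scala
// Hypothetical helper (not from the original post): given (column name, type string)
// pairs, return the columns whose type string is longer than the metastore limit,
// together with the length of that type string.
def oversizedColumns(columns: Seq[(String, String)], limit: Int = 2000): Seq[(String, Int)] =
  columns.collect {
    case (name, typeString) if typeString.length > limit => name -> typeString.length
  }

// In spark-shell, assuming the DataFrame tne2 from the steps above:
//   val cols = tne2.schema.fields.map(f => f.name -> f.dataType.catalogString).toSeq
//   oversizedColumns(cols).foreach { case (n, len) => println(s"$n: $len characters") }
//   println(tne2.schema.catalogString)  // whole schema as a single line

// Standalone demonstration with made-up type strings (no Spark required)
val sample = Seq(
  "id" -> "bigint",
  "payload" -> ("struct<" + "a:string," * 300 + "z:string>")
)
oversizedColumns(sample).foreach { case (n, len) => println(s"$n: $len characters") }
```

The reported lengths indicate how large the metastore limit would need to be for the table to be created.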

Answer 1:

I added the following property through Ambari > Hive > Configs > Advanced > Custom hive-site: hive.metastore.max.typename.length=14000. I am now able to create tables with column type names up to 14000 characters long.

Source: https://stackoverflow.com/questions/46193121/read-a-json-file-with-12-nested-level-into-hive-in-azure-hdinsights

Tags

json

Hive

apache-spark-sql

spark-dataframe

hdinsight