I'm working on Hive tables and I'm having the following problem. I have more than 1 billion XML files in my HDFS. What I want to do is: each XML file has 4 differe
You have several options:
Load the XML into a Hive table such as CREATE TABLE xmlfiles (id int, xmlfile string). Then use an XPath UDF to do work on the XML.
If you know the XPath of what you want (e.g. //section1), follow the instructions in the second half of this tutorial to ingest directly into Hive via XPath.
Which one to choose depends on your level of experience and comfort with these approaches.
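As a rough sketch of the XPath UDF route, Hive's built-in xpath_string and xpath functions could be applied to the xmlfiles table like this. It assumes every row holds one complete, well-formed XML document in the xmlfile column and that the documents contain a section1 element (taken from the //section1 example). The column aliases are made up for illustration.

```sql
-- Assumption: each row of xmlfiles stores one whole XML document in xmlfile.
SELECT
  id,
  -- xpath_string returns the text of the first node matching the expression
  xpath_string(xmlfile, '//section1') AS section1_text,
  -- xpath returns an array<string> with one entry per matching text node
  xpath(xmlfile, '//section1/text()') AS section1_nodes
FROM xmlfiles;
```

If a section can occur several times per document, the array returned by xpath can be turned into rows with LATERAL VIEW explode(...).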
Use this:
CREATE EXTERNAL TABLE test(name STRING) LOCATION '/user/sornalingam/zipped/output/Tagged/t1'
tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="1");
And then use the xpath function.
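For instance, one way to try this out is to test an expression on a literal first and then apply it to the table. This is only a sketch: the second query assumes each row of test holds a well-formed XML fragment in name, which may not be true for your files, and //section1 is just a placeholder expression.

```sql
-- Sanity check on a literal; a SELECT without FROM needs a reasonably
-- recent Hive version (assumed 0.13 or later).
-- Returns an array containing b1 and b2.
SELECT xpath('<a><b>b1</b><b>b2</b></a>', 'a/b/text()');

-- Placeholder expression; replace //section1 with the path you need.
-- Assumption: every value of name is well-formed XML on its own.
SELECT xpath_string(name, '//section1') AS section1_text
FROM test;
```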