I'm working on Hive tables and I'm having the following problem. I have more than 1 billion XML files in my HDFS. What I want to do is this: each XML file has 4 differe
You have several options:
Load the XML files into a Hive table with a string column, e.g.
CREATE TABLE xmlfiles (id int, xmlfile string)
Then use an XPath UDF to do work on the XML.

If you already know the XPath of what you want (e.g. //section1), follow the instructions in the second half of this tutorial to ingest directly into Hive via XPath.

Which option to choose depends on your level of experience and comfort with these approaches.
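
As a rough sketch of the table-plus-UDF option, the queries below use Hive's built-in XPath UDFs (xpath_string and xpath). They assume that each row of xmlfiles holds one complete, well-formed XML document in the xmlfile column, and that the documents contain a section1 element. The //section1 path is only the illustrative expression mentioned above, and the column aliases are made up for the example.

```sql
-- Table from the answer; assumed layout: one XML document per row.
CREATE TABLE IF NOT EXISTS xmlfiles (id int, xmlfile string);

-- xpath_string returns the text of the first node matching the expression.
SELECT id,
       xpath_string(xmlfile, '//section1') AS section1_text
FROM xmlfiles;

-- xpath returns an array<string> with the text of every matching node,
-- which is useful if section1 can occur more than once per document.
SELECT id,
       xpath(xmlfile, '//section1/text()') AS section1_values
FROM xmlfiles;
```

If the sections need to end up in separate tables, the same SELECT can feed an INSERT ... SELECT into a table per section. Whether that fits depends on the part of the question that is cut off above.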