I want to read a very large dataset of XML files (each XML file is about 1 TB) with Spark and run a parsing step on each file, so that in the end I get CSV files as tables. Note that files this large cannot be loaded whole into a single executor's memory, so the parsing has to be streaming and/or the files have to be split into records that Spark can distribute (for example with the spark-xml package, which splits on a row tag).
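A minimal sketch of the per-file parsing step, independent of the Spark wiring: it streams record elements with `xml.etree.ElementTree.iterparse` so memory stays bounded regardless of file size. The record tag (`row`) and field names (`id`, `name`) are assumptions for illustration; in a real job this function would run per file or per split, e.g. inside `mapPartitions`, or be replaced entirely by spark-xml's `rowTag` option.

```python
import csv
import io
import xml.etree.ElementTree as ET

def xml_records_to_csv(xml_stream, csv_stream, record_tag, fields):
    """Stream <record_tag> elements from xml_stream, writing one CSV row each.

    iterparse keeps memory bounded even for huge files: each record's
    subtree is cleared as soon as its row has been written.
    """
    writer = csv.writer(csv_stream)
    writer.writerow(fields)
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag == record_tag:
            writer.writerow([elem.findtext(f, default="") for f in fields])
            elem.clear()  # free the subtree we just consumed

# Tiny sample standing in for a 1 TB file; tag/field names are hypothetical.
sample = b"""<rows>
  <row><id>1</id><name>alice</name></row>
  <row><id>2</id><name>bob</name></row>
</rows>"""

out = io.StringIO()
xml_records_to_csv(io.BytesIO(sample), out, "row", ["id", "name"])
print(out.getvalue())
```

With spark-xml instead, the whole pipeline collapses to something like `spark.read.format("xml").option("rowTag", "row").load(path).write.csv(out_path)`, which also lets Spark split each large file across tasks.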