How can one list all csv files in an HDFS location within the Spark Scala shell?

温柔的废话 2021-01-05 06:27

The goal is to manipulate each data file and save a copy of it in a second HDFS location. I will be using

RddName.coalesce(1).saveAsTex
3 Answers
  •  醉梦人生
    2021-01-05 06:54

    sc.wholeTextFiles(path) should help. It returns an RDD of (file path, file content) pairs, so the keys give you the path of every file it read.
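
A minimal sketch of how this could look in spark-shell, where `sc` is already defined. The directories `hdfs:///data/in` and `hdfs:///data/out`, the helper `outputDirFor`, and the upper-casing step are all hypothetical placeholders, not part of the question. Besides `wholeTextFiles`, the sketch also lists the files through the Hadoop `FileSystem` API, which is a swapped-in alternative that avoids reading file contents just to get the names.

```scala
// Run inside spark-shell, where `sc` (the SparkContext) is predefined.
import org.apache.hadoop.fs.{FileStatus, Path}

// Hypothetical source and destination directories; replace with your own.
val srcDir = "hdfs:///data/in"
val dstDir = "hdfs:///data/out"

// Hypothetical helper: derive an output directory from an input file path,
// e.g. ".../in/a.csv" -> ".../out/a" (saveAsTextFile writes a directory).
def outputDirFor(srcFile: String, dstRoot: String): String = {
  val name = srcFile.substring(srcFile.lastIndexOf('/') + 1)
  val base = if (name.endsWith(".csv")) name.dropRight(4) else name
  dstRoot.stripSuffix("/") + "/" + base
}

// Option 1 (from the answer): the keys of wholeTextFiles are the file paths.
// Note that this runs a Spark job that also reads every file's content.
val viaWholeText: Array[String] =
  sc.wholeTextFiles(srcDir + "/*.csv").keys.collect()

// Option 2 (alternative): list matching files on the driver without reading them.
val fs = new Path(srcDir).getFileSystem(sc.hadoopConfiguration)
val csvFiles: Array[String] =
  Option(fs.globStatus(new Path(srcDir, "*.csv")))
    .getOrElse(Array.empty[FileStatus])
    .map(_.getPath.toString)

// Process each file separately and save one copy per file.
csvFiles.foreach { file =>
  val lines = sc.textFile(file)
  val transformed = lines.map(_.toUpperCase) // placeholder manipulation
  transformed.coalesce(1).saveAsTextFile(outputDirFor(file, dstDir))
}
```

Looping on the driver keeps one output directory per input file. `saveAsTextFile` fails if the target directory already exists, so each destination has to be new or cleaned up beforehand.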
