Spark: Saving RDD in an already existing path in HDFS

南笙 2021-01-24 16:34

I can save RDD output to HDFS with the saveAsTextFile method, but it throws an exception if the output path already exists.

I have a use case

1 answer
  • 2021-01-24 17:09

    One possible solution, available since Spark 1.6, is to use DataFrames with text format and append mode:

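    val outputPath: String = ???  // target HDFS directory; it may already exist

    // toDF needs the SQL implicits in scope, e.g. import sqlContext.implicits._ (1.6) or import spark.implicits._ (2.x)
    rdd.map(_.toString).toDF.write.mode("append").text(outputPath)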
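
    For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes Spark 2.x or later (it uses SparkSession, which the snippet above does not require), and the object name, the local master, the sample data, and taking the output path from the first command-line argument are all made up for illustration. Append mode adds new part files to the existing directory; it does not modify the files already there.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical driver object, for illustration only.
object AppendRddAsText {
  def main(args: Array[String]): Unit = {
    // Assumption: local master so the sketch runs standalone;
    // on a cluster the master would normally come from spark-submit.
    val spark = SparkSession.builder()
      .appName("append-rdd-as-text")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // brings toDF into scope for RDDs

    // Assumption: the target directory is passed as the first argument.
    val outputPath: String = args(0)

    // Sample data standing in for the real RDD.
    val rdd = spark.sparkContext.parallelize(Seq(1, 2, 3))

    // text() needs a single string column, hence map(_.toString).
    // The first write creates the directory if it is missing;
    // the second write succeeds even though the path now exists.
    rdd.map(_.toString).toDF("value").write.mode(SaveMode.Append).text(outputPath)
    rdd.map(_.toString).toDF("value").write.mode(SaveMode.Append).text(outputPath)

    // Read everything back to see how many lines the directory holds.
    val lineCount = spark.read.textFile(outputPath).count()
    println(s"lines under $outputPath: $lineCount")

    spark.stop()
  }
}
```

    SaveMode.Append is the typed equivalent of mode("append"). If the goal were to replace the old output instead of adding to it, SaveMode.Overwrite would be the mode to use.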
    