Writing an RDD to a CSV file

[愿得一人] 2021-01-18 01:40

I have an RDD of the following type:

org.apache.spark.rdd.RDD[(String, Array[String])]

I want to write this into a csv file. Please suggest m

2 answers
  • 2021-01-18 02:31

    You can try:

    // outputPath is a placeholder for the destination directory
    myrdd.map(a => a._1 + "," + a._2.mkString(",")).saveAsTextFile(outputPath)
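
    To see what each output line looks like, here is a minimal sketch that applies the same expression to a plain local collection instead of an RDD. The sample records and the names `records`, `toLine` and `lines` are made up for illustration.

    ```scala
    // Hypothetical sample records with the same shape as the RDD elements.
    val records: Seq[(String, Array[String])] = Seq(
      ("k1", Array("a", "b")),
      ("k2", Array("x,y", "z")) // the first value contains a comma
    )

    // Same expression as inside the map call above.
    val toLine = (a: (String, Array[String])) => a._1 + "," + a._2.mkString(",")

    val lines = records.map(toLine)
    // lines(0) is "k1,a,b"   (3 fields)
    // lines(1) is "k2,x,y,z" (reads back as 4 fields, not 3)
    ```

    This works as long as no key or value contains a comma, a double quote, or a line break.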
    
  • 2021-01-18 02:33

    The other answer doesn't handle escaping (for example, values that contain commas or quotes). Perhaps this more general solution?

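    import au.com.bytecode.opencsv.CSVWriter
    import java.io.StringWriter
    import scala.collection.JavaConverters._

    // Format one record as a single CSV line; CSVWriter handles quoting and escaping.
    val toCsv = (a: Array[String]) => {
      val buf = new StringWriter
      val writer = new CSVWriter(buf)
      writer.writeAll(List(a).asJava)
      buf.toString.trim // drop the trailing line break added by CSVWriter
    }

    // Prepend the key to its values, format each record, and write the result.
    // dest is the output directory path.
    rdd.map(t => Array(t._1) ++ t._2)
       .map(a => toCsv(a))
       .saveAsTextFile(dest)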
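
    If adding the opencsv dependency is not an option, the escaping can also be sketched by hand. The helpers below are hypothetical (`escapeField` and `toCsvLine` are names invented here) and follow the usual CSV convention of quoting a field only when needed and doubling embedded quotes. The output is therefore not identical to CSVWriter's, which, as far as I know, quotes every field by default.

    ```scala
    // Quote a field only when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled.
    def escapeField(field: String): String =
      if (field.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
        "\"" + field.replace("\"", "\"\"") + "\""
      else field

    // Join the escaped fields of one record into a single CSV line.
    def toCsvLine(fields: Array[String]): String =
      fields.map(escapeField).mkString(",")
    ```

    Under these assumptions it could be used as `rdd.map(t => toCsvLine(t._1 +: t._2)).saveAsTextFile(dest)`. Note that a value with an embedded line break still spans several lines in the output, so whatever reads the file back must support multi-line fields.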
    