How to convert DataFrame to RDD in Scala?

甜味超标 2021-02-01 02:11

Can someone please share how one can convert a dataframe to an RDD?

3 Answers
  • 2021-02-01 02:18

    Simply:

    val rows: RDD[Row] = df.rdd
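
For anyone who wants to try this end to end, here is a minimal sketch. It assumes Spark 2.x or later on the classpath; the local session settings, the object name `DfToRdd`, and the sample data are made up for illustration and are not from the question.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object DfToRdd {
  // Hypothetical sample DataFrame with two columns, "id" and "tags".
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._ // enables toDF on local collections
    Seq((1, "scala,spark"), (2, "rdd")).toDF("id", "tags")
  }

  // The conversion itself: a DataFrame exposes its rows as an RDD[Row].
  def toRows(df: DataFrame): RDD[Row] = df.rdd

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("df-to-rdd")
      .master("local[*]")
      .getOrCreate()

    val rows: RDD[Row] = toRows(sample(spark))
    rows.collect().foreach(println)

    spark.stop()
  }
}
```

The resulting RDD holds generic `Row` objects, so individual fields still have to be read by position or by name.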
    
  • 2021-02-01 02:19

    I was just looking for my answer and found this post.

    Jean's answer is absolutely correct. To add to it: "df.rdd" returns an RDD[Row]. I needed to apply split() once I had the RDD, and for that the RDD[Row] has to be converted to an RDD[String]:

    // Dataset.map needs an implicit Encoder[String], e.g. from import spark.implicits._
    val opt = spark.sql("select tags from cvs").map(x => x.toString()).rdd
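
One thing to watch: `Row.toString` wraps the values in square brackets, so a later split() would see those brackets too. A possible alternative, sketched below, reads the column with `getString` instead. It assumes the column is a non-null string column and that the tags are comma-separated; the object name `TagsToStrings`, the helper name, and the sample data are hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object TagsToStrings {
  // Reads the first column of each Row as a String.
  // Assumption: that column is a string column without nulls.
  def firstColumnAsStrings(df: DataFrame): RDD[String] =
    df.rdd.map(row => row.getString(0))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .appName("tags-to-strings")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the result of "select tags from cvs".
    val df = Seq("scala,spark", "rdd").toDF("tags")

    val tags: RDD[String] = firstColumnAsStrings(df)

    // split() now applies to each String; the comma delimiter is an assumption.
    val tokens: RDD[String] = tags.flatMap(_.split(","))
    tokens.collect().foreach(println)

    spark.stop()
  }
}
```

Mapping on `df.rdd` rather than on the DataFrame also avoids the need for an implicit encoder.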
    
  • 2021-02-01 02:35

    Use df.rdd.map(row => ...) if you want to map each row to a different RDD element. (In Spark 1.x, df.map returned an RDD directly; since Spark 2.0 it returns a Dataset, so go through df.rdd.) For example,

    df.rdd.map(row => (row(0), row(1)))

    gives you a pair RDD where the first column of the df is the key and the second column of the df is the value. Row indices are zero-based.
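
Reading fields with `row(i)` yields values of type `Any`. If the column types are known, a typed variant can be sketched as follows. It goes through `df.rdd` and assumes an integer first column and a string second column; the object name `ColumnsToPairs` and the sample data are hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object ColumnsToPairs {
  // First column becomes the key, second column the value.
  // Assumption: column 0 is an Int column and column 1 is a String column.
  def toPairs(df: DataFrame): RDD[(Int, String)] =
    df.rdd.map(row => (row.getInt(0), row.getString(1)))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .appName("columns-to-pairs")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with a repeated key.
    val df = Seq((1, "a"), (2, "b"), (1, "c")).toDF("id", "value")

    val pairs: RDD[(Int, String)] = toPairs(df)

    // With a concrete (key, value) type, pair-RDD operations such as reduceByKey apply.
    val merged: RDD[(Int, String)] = pairs.reduceByKey(_ + "," + _)
    merged.collect().foreach(println)

    spark.stop()
  }
}
```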
