How to convert RDD to DataFrame in Spark Streaming, not just Spark

慢半拍i 2021-01-12 20:59

How can I convert RDD to DataFrame in Spark Streaming, not just Spark?

I saw this example, but it requires

2 answers
  • 2021-01-12 21:19

    Create the sqlContext outside foreachRDD. Once you convert the RDD to a DataFrame using sqlContext, you can write it to S3.

    For example:

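    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val conf = new SparkConf().setMaster("local").setAppName("My App")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // brings rdd.toDF() into scope

    // myDstream is assumed to hold case-class, tuple, or primitive elements
    myDstream.foreachRDD { rdd =>
        val df = rdd.toDF()
        // DataFrameWriter has no saveAsTextFile; use save(). Append mode is an
        // assumption here so that later batches do not fail on an existing path.
        df.write.format("json").mode("append").save("s3://iiiii/ttttt.json")
    }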
    

    Update:

    You can even create the sqlContext inside foreachRDD, because the body of foreachRDD executes on the driver.
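
    A minimal sketch of that variant, assuming Spark 2.x or later, where SparkSession replaces SQLContext. The Event case class, the object name, and the queueStream test source are hypothetical stand-ins for a real stream, and df.show() stands in for the S3 write so the example runs locally.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

// Hypothetical record type; toDF() needs a case class, tuple, or primitive element type.
case class Event(id: Int, name: String)

object RddToDfInStreaming {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("RddToDfInStreaming")
    val ssc = new StreamingContext(conf, Seconds(1))

    // queueStream stands in for a real source such as Kafka or a socket.
    val queue = mutable.Queue[RDD[Event]]()
    queue += ssc.sparkContext.parallelize(Seq(Event(1, "a"), Event(2, "b")))
    val events = ssc.queueStream(queue)

    events.foreachRDD { rdd =>
      // This closure runs on the driver once per batch;
      // getOrCreate returns the existing session after the first call.
      val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
      import spark.implicits._

      if (!rdd.isEmpty()) {
        val df = rdd.toDF()
        df.show() // replace with df.write... to persist each batch
      }
    }

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000) // let a few batches run, then shut down
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

    Obtaining the session through getOrCreate inside the closure avoids capturing a context created elsewhere, which may matter if the job is restarted from a checkpoint.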

  • 2021-01-12 21:34

    See the following answer, which uses a Scala magic cell inside a Python notebook: How to convert Spark Streaming data into Spark DataFrame
