Moving Spark DataFrame from Python to Scala within Zeppelin

遥遥无期 2021-02-09 15:52

I created a spark DataFrame in a Python paragraph in Zeppelin.

from pyspark.sql import SQLContext

sqlCtx = SQLContext(sc)
spDf = sqlCtx.createDataFrame(df)

1 Answer
  •  囚心锁ツ
    2021-02-09 15:58

    You can put the internal Java object rather than the Python wrapper:

    %pyspark
    
    df = sc.parallelize([(1, "foo"), (2, "bar")]).toDF(["k", "v"])
    z.put("df", df._jdf)
    

    and then make sure you cast it to the correct type:

    val df = z.get("df").asInstanceOf[org.apache.spark.sql.DataFrame]
    // df: org.apache.spark.sql.DataFrame = [k: bigint, v: string]
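
    For context, df._jdf is the py4j handle to the underlying Java Dataset<Row>, which is why it survives the hand-off where the Python wrapper would not. A minimal local check (a sketch assuming PySpark is installed; no Zeppelin required, and the local session stands in for the one Zeppelin provides):

    from pyspark.sql import SparkSession

    # Local session standing in for the shared Zeppelin session
    spark = SparkSession.builder.master("local[1]").appName("jdf-check").getOrCreate()

    df = spark.createDataFrame([(1, "foo"), (2, "bar")], ["k", "v"])

    # _jdf is a py4j JavaObject proxying org.apache.spark.sql.Dataset<Row>
    jdf_type = type(df._jdf).__name__
    print(jdf_type)  # JavaObject

    spark.stop()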
    

    but it is better to register a temporary table:

    %pyspark
    
    # registerTempTable in Spark 1.x
    df.createTempView("df")
    

    and use SQLContext.table to read it:

    // sqlContext.table in Spark 1.x
    val df = spark.table("df")
    
    df: org.apache.spark.sql.DataFrame = [k: bigint, v: string]
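
    The temp-view route can be sketched end to end in a single local PySpark session (a sketch: in Zeppelin the two halves would live in separate %pyspark and %spark paragraphs sharing the same session, and the names here are illustrative):

    from pyspark.sql import SparkSession

    # Stand-in for the session the Zeppelin interpreters share
    spark = SparkSession.builder.master("local[1]").appName("tempview-demo").getOrCreate()

    # Python side: build the frame and register it
    df = spark.createDataFrame([(1, "foo"), (2, "bar")], ["k", "v"])
    df.createOrReplaceTempView("df")  # createTempView raises if the name already exists

    # Scala-side equivalent of: val df = spark.table("df")
    df2 = spark.table("df")
    rows = [tuple(r) for r in df2.collect()]
    print(df2.schema.simpleString())  # struct<k:bigint,v:string>

    spark.stop()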
    

    To convert in the opposite direction, see Zeppelin: Scala Dataframe to python
