How to create an empty DataFrame? Why “ValueError: RDD is empty”?

孤城傲影 2021-02-01 03:48

I am trying to create an empty DataFrame in Spark (PySpark).

I am using an approach similar to the one discussed here, but it is not working.

11 answers
  •  时光取名叫无心
    2021-02-01 04:30

    Extending Joe Widen's answer, you can create a schema with no fields like so:

    from pyspark.sql.types import StructType
    schema = StructType([])
    

    When you create the DataFrame with that schema, you end up with a DataFrame[] that has no columns.

    >>> empty = sqlContext.createDataFrame(sc.emptyRDD(), schema)
    >>> empty
    DataFrame[]
    >>> empty.schema
    StructType(List())
    

    In Scala, if you choose to use sqlContext.emptyDataFrame and check out the schema, it will return StructType().

    scala> val empty = sqlContext.emptyDataFrame
    empty: org.apache.spark.sql.DataFrame = []
    
    scala> empty.schema
    res2: org.apache.spark.sql.types.StructType = StructType()    
    
