How to create an empty DataFrame? Why “ValueError: RDD is empty”?

孤城傲影 2021-02-01 03:48

I am trying to create an empty DataFrame in Spark (PySpark).

I am using an approach similar to the one discussed in another post, but it is not working.

11 Answers
  •  春和景丽
    2021-02-01 04:27

    This is a roundabout but simple way to create an empty Spark DataFrame with an inferred schema:

    from pyspark.sql.functions import col

    # Initialize a Spark DataFrame from one row of data that has the desired schema
    # (assumes an existing SparkSession named `spark`)
    init_sdf = spark.createDataFrame([('a_string', 0, 0)], ['name', 'index', 'seq_#'])
    # Filter out the only row; the schema is kept
    empty_sdf = init_sdf.where(col('name') == 'not_match')
    empty_sdf.printSchema()
    # Output
    root
     |-- name: string (nullable = true)
     |-- index: long (nullable = true)
     |-- seq_#: long (nullable = true)
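
As for the error in the title: Spark infers a schema by looking at the data, so building a DataFrame from an empty RDD without a schema has nothing to infer from and fails with "ValueError: RDD is empty". A more direct route than the dummy-row trick is probably to pass the schema explicitly. The following is a minimal sketch, not a definitive implementation: the helper name `make_empty_df` and the constant `EMPTY_SCHEMA` are made up for illustration, `spark` is assumed to be an existing SparkSession, and the column names and types are copied from the example above. The schema is written as a DDL string here; an equivalent `StructType` object can be passed instead.

```python
# Schema as a DDL string; backticks quote the column name containing '#'.
# Column names and types mirror the example above (bigint is shown as "long").
EMPTY_SCHEMA = "name string, index bigint, `seq_#` bigint"


def make_empty_df(spark):
    """Return a zero-row DataFrame with an explicit schema.

    Hypothetical helper: `spark` is assumed to be an existing SparkSession.
    """
    # An explicit schema means Spark does not need to infer types
    # from the (empty) data, so no row is required.
    return spark.createDataFrame([], schema=EMPTY_SCHEMA)
```

Calling `make_empty_df(spark).printSchema()` is a quick way to check that the columns came out as intended.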
    
