I am trying to create an empty dataframe in Spark (Pyspark).
I am using similar approach to the one discussed here enter link description here, but it is not working.
This is a roundabout but simple way to create an empty spark df with an inferred schema
# Initialize a spark df using one row of data with the desired schema
init_sdf = spark.createDataFrame([('a_string', 0, 0)], ['name', 'index', 'seq_#'])
# remove the row. Leaves the schema
empty_sdf = init_sdf.where(col('name') == 'not_match')
empty_sdf.printSchema()
# Output
root
|-- name: string (nullable = true)
|-- index: long (nullable = true)
|-- seq_#: long (nullable = true)