I am trying to create an empty DataFrame in Spark (PySpark).
I am using an approach similar to the one discussed here, but it is not working.
This will work with Spark version 2.0.0 or later:
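    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    # In the pyspark shell `spark` already exists; getOrCreate() reuses it.
    spark = SparkSession.builder.getOrCreate()
    sc = spark.sparkContext

    # col1 is a non-nullable string, col2 is a nullable integer.
    schema = StructType([
        StructField('col1', StringType(), False),
        StructField('col2', IntegerType(), True),
    ])

    # An empty RDD plus an explicit schema gives an empty DataFrame.
    df = spark.createDataFrame(sc.emptyRDD(), schema)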