How to create an empty DataFrame? Why “ValueError: RDD is empty”?

孤城傲影 2021-02-01 03:48

I am trying to create an empty dataframe in Spark (Pyspark).

I am using an approach similar to the one discussed here, but it is not working.

11 answers
  •  孤城傲影
    2021-02-01 04:28

    You can do it by loading an empty file (parquet, json etc.) like this:

    df = sqlContext.read.json("my_empty_file.json")
    

    Then when you try to check the schema you'll see:

    >>> df.printSchema()
    root
    

    In Scala/Java, not passing a path should work too; in Python it throws an exception. Also, if you ever switch to Scala/Python, you can use this method to create one.
