How to remove the header and footer from a DataFrame?

小蘑菇 2021-01-24 07:23

I am reading a text (not CSV) file that has a header, content, and a footer using:

spark.read.format("text").option("delimiter","|")...load(file)
4 answers
  •  别那么骄傲
    2021-01-24 08:27

    Assuming the file is not too large, we can use collect to bring the DataFrame to the driver as a list of rows and then access the last element as follows:

    last_row = df.collect()[-1]  # last row, i.e. the footer
    

    Avoid using collect on large datasets, since it loads every row into driver memory.

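    Alternatively, we can use take to cut off the last row. Note that take returns a list of Row objects, not a DataFrame, and it also brings the rows to the driver: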

    rows = df.take(df.count() - 1)  # every row except the last one
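
The collect-based approach above can be sketched in plain Python once the rows are on the driver. This is only an illustration, assuming the first row is the header and the last row is the footer. The helper name split_header_footer and the sample strings are hypothetical; the strings stand in for the Row objects that df.collect() would return.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_header_footer(rows: Sequence[T]) -> Tuple[T, List[T], T]:
    """Split collected rows into (header, body, footer).

    Assumes the first row is the header and the last row is the footer.
    """
    if len(rows) < 2:
        raise ValueError("need at least a header row and a footer row")
    # Slicing drops the first and last rows and keeps everything in between.
    return rows[0], list(rows[1:-1]), rows[-1]


# Hypothetical sample data standing in for the result of df.collect().
sample_rows = ["H|2021-01-24", "1|alice", "2|bob", "T|2"]
header, body, footer = split_header_footer(sample_rows)
print(body)
```

With PySpark, the same helper could be called as split_header_footer(df.collect()), and the body could then be turned back into a DataFrame with spark.createDataFrame(body, df.schema). As with the answer above, this is only reasonable when the whole file fits in driver memory.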
    
