How to get distinct rows in dataframe using pyspark?

遇见更好的自我 2021-01-04 04:49

I understand this is a very simple question that has most likely been answered somewhere, but as a beginner I still don't get it and would appreciate your help.

2 answers
  •  一整个雨季
    2021-01-04 05:21

    If df is the name of your DataFrame, there are two ways to get unique rows:

    df2 = df.distinct()
    

    or

    df2 = df.drop_duplicates()
    
