PySpark groupByKey returning pyspark.resultiterable.ResultIterable

不思量自难忘° 2021-01-30 16:24

I am trying to figure out why my groupByKey is returning the following:

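[(0, <pyspark.resultiterable.ResultIterable object at 0x...>), (1, <pyspark.resultiterable.ResultIterable object at 0x...>), ...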

        
6 answers
  •  孤街浪徒
    2021-01-30 16:49

    In addition to the answers above, if you want a sorted list of unique items, use the following:

    List of Distinct and Sorted Values

    example.groupByKey().mapValues(set).mapValues(sorted)
    

    Just a List of Sorted Values

    example.groupByKey().mapValues(sorted)
    

    Alternatives to the above

    # List of distinct sorted items
    example.groupByKey().map(lambda x: (x[0], sorted(set(x[1]))))
    
    # just sorted list of items
    example.groupByKey().map(lambda x: (x[0], sorted(x[1])))
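
The `ResultIterable` named in the question title is expected behavior: `groupByKey` hands back each key's values as a `pyspark.resultiterable.ResultIterable`, which prints as an object repr until it is iterated or converted to a list. Below is a minimal sketch of that, assuming PySpark is installed and a local context can be created. The sample data, the `sorted_unique` helper, and the `demo` function are hypothetical names for illustration and do not come from the thread.

```python
def sorted_unique(values):
    """Return a sorted list of the distinct items in any iterable,
    including the ResultIterable values produced by groupByKey."""
    return sorted(set(values))


def demo():
    # Imported here so the helper above stays usable without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()

    # Hypothetical sample data; `example` mirrors the name used in the answer.
    example = sc.parallelize([(0, "b"), (0, "a"), (1, "c"), (0, "a")])

    grouped = example.groupByKey()

    # Each value is still a ResultIterable here, so only its repr is printed.
    print(grouped.collect())

    # Converting each group to a plain list makes the contents visible.
    print(grouped.mapValues(list).collect())

    # Sorted, de-duplicated values per key, equivalent to
    # mapValues(set).mapValues(sorted) in the answer.
    print(grouped.mapValues(sorted_unique).collect())

    sc.stop()
```

Calling `demo()` in an environment with PySpark available prints the three results in turn. The order of keys in the collected output is not guaranteed, so sort the result if a stable order matters.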
    
