Can someone explain this code that uses PySpark RDD, GroupBy and Lambda?

礼貌的吻别 2021-01-17 05:31
```python
i = 0; col = 'INFANT_ALIVE_AT_REPORT'
agg = categorical_rdd \
    .groupBy(lambda row: row[i])
```

What is the output when using `print(col, agg.collect())`?
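For anyone trying to understand what `groupBy(lambda row: row[i])` produces: it groups all rows of the RDD by the value of their `i`-th field, returning (key, iterable-of-rows) pairs. Here is a plain-Python sketch of that behavior, with hypothetical stand-in rows since the original `categorical_rdd` data is not shown:

```python
from collections import defaultdict

i = 0  # group by the first field of each row, e.g. INFANT_ALIVE_AT_REPORT

# Hypothetical rows: (INFANT_ALIVE_AT_REPORT, some_other_field)
rows = [("Y", 10), ("N", 20), ("Y", 30), ("N", 40)]

# Mimic RDD.groupBy: bucket every row under the key returned by the lambda
groups = defaultdict(list)
for row in rows:
    groups[row[i]].append(row)  # key = row[i], value = all rows with that key

# agg.collect() yields a list of (key, iterable_of_rows) pairs
result = [(k, v) for k, v in groups.items()]
print(result)
# [('Y', [('Y', 10), ('Y', 30)]), ('N', [('N', 20), ('N', 40)])]
```

In real PySpark the grouped values come back as a `ResultIterable` rather than a list, so you typically wrap them with `list(...)` or follow `groupBy` with a `map` before printing.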
