PySpark - Sum a column in dataframe and return results as int

执念已碎 2020-12-24 08:13

I have a PySpark DataFrame with a column of numbers. I need to sum that column and get the result back as an int in a Python variable.

df = spark.cr         


        
6 Answers
  •  孤城傲影
    2020-12-24 08:28

    I think this is the simplest way:

    df.groupBy().sum().collect()
    

    This returns a list holding a single Row with the sums of the numeric columns. In your example, index into it to get the value:

    In [9]: df.groupBy().sum().collect()[0][0]
    Out[9]: 130
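
If you only want one named column, a variation on the same idea is to aggregate that column explicitly and normalize the result. This is only a sketch: the helper names (as_int, sum_column, demo) and the sample rows are made up for illustration, since the DataFrame in the question is cut off. It assumes a local Spark installation for the demo part.

```python
def as_int(value, default=0):
    """Convert an aggregate result to a plain Python int.

    Spark's sum gives None when there are no non-null rows, and a float or
    Decimal for non-integer column types. int() truncates toward zero.
    """
    return default if value is None else int(value)


def sum_column(df, column):
    """Sum one column of a PySpark DataFrame and return a Python int."""
    # Imported lazily so as_int can be used without Spark installed.
    from pyspark.sql import functions as F

    # agg() without groupBy aggregates over the whole DataFrame: one Row back.
    row = df.agg(F.sum(column)).collect()[0]
    return as_int(row[0])


def demo():
    """Build a small hypothetical DataFrame and sum its Number column."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    # Hypothetical rows, not taken from the (truncated) question.
    df = spark.createDataFrame(
        [("A", 20), ("B", 30), ("D", 80)], ["Letter", "Number"]
    )
    total = sum_column(df, "Number")
    spark.stop()
    return total
```

Calling demo() should hand back an ordinary Python int rather than a Row. The None check is there because summing an empty or all-null column does not return 0.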
    
