PySpark - Sum a column in dataframe and return results as int

执念已碎 2020-12-24 08:13

I have a PySpark DataFrame with a column of numbers. I need to sum that column and get the result back as an int in a Python variable.

df = spark.cr         


        
6 Answers
  •  孤城傲影
    2020-12-24 08:23

    If you want the sum of a specific column:

    import pyspark.sql.functions as F     
    
    df.agg(F.sum("my_column")).collect()[0][0]
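
The one-liner above returns whatever Python type Spark maps the sum to: an int for integer columns, a float for double columns, and None when the column has no non-null rows. Since the question asks for an int, here is a minimal sketch of a wrapper. The helper name `sum_column_as_int` is hypothetical, and returning 0 for an empty or all-null column is an assumption that may not suit every case. It uses the dict form of `DataFrame.agg`, so no extra import is needed.

```python
def sum_column_as_int(df, column):
    """Sum `column` of a PySpark DataFrame and return a plain Python int.

    Hypothetical helper, not part of PySpark.
    """
    # The dict form of agg maps a column name to an aggregate function name.
    # collect() returns a list of Rows; [0][0] is the single aggregated value.
    total = df.agg({column: "sum"}).collect()[0][0]

    # SUM over zero rows (or only nulls) yields None.
    # Assumption: treat that case as 0.
    if total is None:
        return 0

    # int() truncates toward zero if the sum is a float or Decimal.
    return int(total)
```

It would be called as `total = sum_column_as_int(df, "my_column")`. If truncating a fractional sum is not acceptable, round the value explicitly before converting.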
    
