How to find the maximum value of a column in a Python DataFrame

遥遥无期 2021-02-07 15:36

I have a DataFrame in PySpark. It has a column called id whose values are unique.

Now I want to find the maximum value of

4 answers
  • 2021-02-07 16:15

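    The following can be used in PySpark (the import is needed so that max refers to Spark's aggregate rather than Python's built-in):

    from pyspark.sql.functions import max
    df.select(max("id")).show()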
    
  • 2021-02-07 16:21

    I'm coming from Scala, but I believe this also applies to Python. First import the aggregate function (note that this shadows Python's built-in max):

    from pyspark.sql.functions import max

    Then take the first row of the aggregated result:

    max_row = df.select(max("id")).first()
    
  • 2021-02-07 16:25

    If you are using pandas, .max() will work:

    >>> import pandas as pd
    >>> df2 = pd.DataFrame({'A': [1, 5, 0], 'B': [3, 5, 6]})
    >>> df2['A'].max()
    5
    

    Otherwise, if it's a Spark DataFrame, see:

    Best way to get the max value in a Spark dataframe column

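For the pandas case mentioned in the answer above, here is a slightly fuller sketch using the same toy frame. The variable names (max_a, max_a_label, col_maxima) are made up for illustration. Besides the maximum itself, it shows idxmax() for locating the row that holds it and DataFrame.max() for getting every column's maximum at once.

```python
import pandas as pd

# Same toy frame as in the answer above.
df2 = pd.DataFrame({"A": [1, 5, 0], "B": [3, 5, 6]})

max_a = df2["A"].max()           # largest value in column A
max_a_label = df2["A"].idxmax()  # index label of the row holding that value
col_maxima = df2.max()           # Series with the maximum of every column

print(max_a, max_a_label)
print(col_maxima.to_dict())
```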
  • 2021-02-07 16:32

    You can use the max aggregate, as also mentioned in the PySpark documentation linked below:

    Link : https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=agg

    Code:

    row1 = df1.agg({"id": "max"}).collect()[0]
    