Best way to get the max value in a Spark dataframe column

Backend · unresolved · 13 answers · 914 views

一整个雨季  2020-12-07 10:27

I'm trying to figure out the best way to get the largest value in a Spark dataframe column.

Consider the following example:

df = spark.createDataFrame([(1., 4.), (2., 5.), (3., 6.)], ["A", "B"])
        
13 Answers

有刺的猬  2020-12-07 10:40

    I believe the best solution is to use head().

    Considering your example:

    +---+---+
    |  A|  B|
    +---+---+
    |1.0|4.0|
    |2.0|5.0|
    |3.0|6.0|
    +---+---+
    

    Using agg and Spark's max function, we can get the value as follows:

    from pyspark.sql.functions import max
    df.agg(max(df.A)).head()[0]

    This will return: 3.0

    Make sure you have the correct import:

    from pyspark.sql.functions import max

    The max used here is the PySpark SQL library function, not Python's built-in max function.
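
    Since importing max this way shadows Python's built-in max, a common alternative is to import the functions module under an alias (often F). Below is a minimal self-contained sketch of the same approach; the local SparkSession setup and the alias are assumptions for illustration, not part of the original answer:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1., 4.), (2., 5.), (3., 6.)], ["A", "B"])

    # F.max is the Spark SQL aggregate; Python's built-in max stays usable.
    max_value = df.agg(F.max(df.A)).head()[0]
    print(max_value)  # 3.0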
