How to find maximum value of a column in python dataframe

I have a DataFrame in PySpark. It has a column called id whose values are unique.

Now I want to find the maximum value of

4 Answers

梦毁少年i

2021-02-07 16:15
The following can be used in pyspark:
```
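from pyspark.sql import functions as F  # F.max is Spark's column aggregate, not Python's built-in max
df.select(F.max("id")).show()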
```

我在风中等你

2021-02-07 16:21
I come from Scala, but I believe the same approach works in Python:
```
# first() returns a single Row holding the maximum
max_row = df.select(max("id")).first()
```
but you first have to import the following:
```
from pyspark.sql.functions import max
```

我寻月下人不归

2021-02-07 16:25
If you are using pandas, .max() will work:
```
>>> df2=pd.DataFrame({'A':[1,5,0], 'B':[3, 5, 6]})
>>> df2['A'].max()
5
```
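As a rough sketch building on the pandas snippet above (same sample data; the variable names `col_max`, `all_max`, and `max_row_label` are only illustrative), the same idea extends to every column at once, and `idxmax()` reports which row holds the maximum:

```python
import pandas as pd

# Same sample data as in the snippet above.
df2 = pd.DataFrame({"A": [1, 5, 0], "B": [3, 5, 6]})

# Maximum of a single column.
col_max = df2["A"].max()

# Maximum of every column, returned as a Series indexed by column name.
all_max = df2.max()

# Index label of the row that holds the maximum of column A.
max_row_label = df2["A"].idxmax()

print(col_max)
print(all_max)
print(max_row_label)
```

Note that this applies only to pandas DataFrames; a Spark DataFrame has no per-column .max() method of this kind.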
Otherwise, if it's a Spark DataFrame, see:

Best way to get the max value in a Spark dataframe column

独厮守ぢ

2021-02-07 16:32
You can use the max aggregate through agg(), as described in the PySpark documentation linked below:

Link: https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=agg

Code:
```
row1 = df1.agg({"id": "max"}).collect()[0]
```

