How to calculate the percentile of a column in a DataFrame in Spark?

温柔的废话 2021-02-20 08:57

I am trying to calculate the percentile of a column in a DataFrame. I can't find any percentile_approx function among Spark's aggregation functions.

For example, in Hive we have perc

2 answers
  •  一整个雨季
    2021-02-20 09:24

    Since Spark 2.0 this has become easier: simply use the approxQuantile function in DataFrameStatFunctions, like this:

    df.stat.approxQuantile("Open_Rate",Array(0.25,0.50,0.75),0.0)

    DataFrameStatFunctions also provides other useful statistical functions for DataFrames.
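
For anyone who wants to try this end to end, here is a minimal sketch meant for spark-shell or a Scala script. The sample data, the app name, and the local master are assumptions made up for illustration; only the Open_Rate column name comes from the call above. The third argument of approxQuantile is the relative error: 0.0 asks for exact quantiles, which can be expensive on large data, while a larger value such as 0.01 trades accuracy for speed. The second part assumes Spark 2.1 or later, where percentile_approx should be available as a built-in SQL function and can be reached through expr.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("percentile-sketch") // hypothetical app name
  .getOrCreate()
import spark.implicits._

// Hypothetical data: 100 rows with Open_Rate = 1.0, 2.0, ..., 100.0.
val df = (1 to 100).map(_.toDouble).toDF("Open_Rate")

// approxQuantile(column, probabilities, relativeError).
// relativeError = 0.0 requests exact quantiles.
val quartiles: Array[Double] =
  df.stat.approxQuantile("Open_Rate", Array(0.25, 0.50, 0.75), 0.0)

// Hive-style percentile_approx used as a SQL expression (assumes Spark 2.1+).
// The same expression can be used inside groupBy(...).agg(...).
val median: Double =
  df.agg(expr("percentile_approx(Open_Rate, 0.5)").as("median"))
    .first()
    .getDouble(0)

println(quartiles.mkString(", "))
println(median)
```

approxQuantile returns a plain Array[Double] on the driver, so it suits whole-column statistics. The expr form stays inside the DataFrame API, which is the one to reach for when a percentile per group is needed.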
