How to count occurrences of each distinct value for every column in a dataframe?

感动是毒 2021-02-01 03:48

edf.select("x").distinct.show() shows the distinct values present in the x column of the edf DataFrame.

Is there an efficient

6 answers

栀梦 (original poster)

2021-02-01 04:02
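If you are using Java, the statement import org.apache.spark.sql.functions.countDistinct; fails with the error "The import org.apache.spark.sql.functions.countDistinct cannot be resolved". countDistinct is a static method of the functions class, not a type, so a plain import cannot resolve it.

To use countDistinct in Java, make the functions class available and call the method through it: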
```
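import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.functions;

// countDistinct is a static method, so qualify it with the functions class
// (alternative: import static org.apache.spark.sql.functions.countDistinct;)
// df is assumed to be an existing Dataset<Row>
Dataset<Row> distinctCount = df.agg(functions.countDistinct("some_column"));
distinctCount.show();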
```
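Note that countDistinct only returns how many distinct values a column has. For the question as asked (how often each distinct value occurs, for every column), one option is to group by each column in turn and count. The following is a minimal sketch, not a tuned solution: the class name ValueCounts, the helper countsPerColumn, and the sample data are made up for illustration, and it assumes Spark SQL is on the classpath and that no column is itself named "count".

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ValueCounts {

    // Hypothetical helper: builds one (value, count) Dataset per column of df.
    static Map<String, Dataset<Row>> countsPerColumn(Dataset<Row> df) {
        Map<String, Dataset<Row>> result = new LinkedHashMap<>();
        for (String column : df.columns()) {
            // groupBy(...).count() appends a "count" column holding the rows per value
            result.put(column, df.groupBy(column).count());
        }
        return result;
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("value-counts-sketch")
                .getOrCreate();

        // Made-up sample data, for illustration only
        StructType schema = new StructType()
                .add("x", DataTypes.StringType)
                .add("y", DataTypes.IntegerType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("a", 1),
                RowFactory.create("a", 2),
                RowFactory.create("b", 2));
        Dataset<Row> edf = spark.createDataFrame(rows, schema);

        // Print the value counts of every column, one table per column
        countsPerColumn(edf).forEach((column, counts) -> counts.show());

        spark.stop();
    }
}
```

This launches one aggregation job per column, which is simple but may be slow on wide DataFrames; whether that counts as efficient depends on the number of columns and the data size.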