How to count occurrences of each distinct value for every column in a dataframe?

感动是毒 2021-02-01 03:48

edf.select("x").distinct.show() shows the distinct values present in the x column of the edf DataFrame.

Is there an efficient

6 Answers
  •  栀梦
    栀梦 (OP)
    2021-02-01 04:02

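    If you are using Java, import org.apache.spark.sql.functions.countDistinct; fails with the error: The import org.apache.spark.sql.functions.countDistinct cannot be resolved. This happens because countDistinct is a static method of the functions class, not a type, so a plain import cannot name it.

    To use countDistinct in Java, import the enclosing package and call the method through the functions class: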

    import org.apache.spark.sql.*;        // brings the functions class into scope
    import org.apache.spark.sql.types.*;
    
    // countDistinct is a static method, so call it through the functions class
    df.agg(functions.countDistinct("some_column"));
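
Note that countDistinct returns how many distinct values a column has, not how often each value occurs. For the question as asked (occurrences of each distinct value, for every column), one possible approach is to group by each column in turn and count the rows. The sketch below assumes a local SparkSession and made-up sample data; the class name ColumnValueCounts, the helper valueCounts, and the columns x and y are illustrative and not from the original post.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.countDistinct;

public class ColumnValueCounts {

    // Hypothetical helper: one row per distinct value of `column`,
    // plus a `count` column holding the number of rows with that value.
    static Dataset<Row> valueCounts(Dataset<Row> df, String column) {
        return df.groupBy(column).count();
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("column-value-counts")
                .master("local[*]")   // assumption: run locally for the demo
                .getOrCreate();

        // Made-up sample data with two columns, x and y
        Dataset<Row> edf = spark.sql(
                "SELECT * FROM VALUES (1, 'a'), (1, 'b'), (2, 'a') AS t(x, y)");

        // Occurrences of each distinct value, column by column
        for (String column : edf.columns()) {
            valueCounts(edf, column).show();
        }

        // For comparison: the number of distinct values in x (static import form)
        edf.agg(countDistinct("x")).show();

        spark.stop();
    }
}
```

This runs a separate aggregation for each column, so it may be slow on very wide DataFrames. The static import at the top is the other way to make countDistinct resolvable in Java without the functions. prefix.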
    
