How to filter based on array value in PySpark?

甜味超标 2020-12-16 15:58

My Schema:

|-- Canonical_URL: string (nullable = true)
 |-- Certifications: array (nullable = true)
 |    |-- elemen         


        
2 answers
  •  有刺的猬
    2020-12-16 16:18

    In Spark 2.4 and later, you can filter array values using the filter higher-order function in the SQL API.

    https://spark.apache.org/docs/2.4.0/api/sql/index.html#filter

    Here's an example in PySpark. It removes every array element that is an empty string:

    from pyspark.sql.functions import expr
    df = df.withColumn("ArrayColumn", expr("filter(ArrayColumn, x -> x != '')"))
    
