How to access values in array column?

眼角桃花 2021-02-04 03:18

I have a DataFrame with one column. Each row of that column holds an array of String values:

Values in my Spark 2.2 DataFrame

["123", "abc", "2017

4 Answers
  •  无人及你
    2021-02-04 04:22

    What is the best way to access elements in the array?

    Use the getItem operator to access elements in an array column.

    getItem(key: Any): Column
    An expression that gets an item at position ordinal out of an array, or gets a value by key key in a MapType.

    You could also apply the column directly, i.e. column(ordinal), to access the element at position ordinal.

    // Run in spark-shell, where spark.implicits._ is already in scope (needed for toDF and $"...")
    val ds = Seq(
      Array("123", "abc", "2017", "ABC"),
      Array("456", "def", "2001", "ABC"),
      Array("789", "ghi", "2017", "DEF")).toDF("col")
    scala> ds.printSchema
    root
     |-- col: array (nullable = true)
     |    |-- element: string (containsNull = true)
    scala> ds.select($"col"(2)).show
    +------+
    |col[2]|
    +------+
    |  2017|
    |  2001|
    |  2017|
    +------+
    

    Both forms return the same result, so whether you use getItem or simply (ordinal) is a matter of personal taste.
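To make the comparison between getItem and (ordinal) concrete, here is a minimal sketch that runs both against the sample data from this answer. It assumes a standalone application with a local SparkSession rather than spark-shell; the object name ArrayColumnAccess, the app name, and the local[1] master are illustrative choices, not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession

object ArrayColumnAccess {
  // Local session for illustration only; in spark-shell `spark` already exists.
  val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("array-column-access")
    .getOrCreate()

  import spark.implicits._

  // Same sample rows as in the answer above.
  val ds = Seq(
    Array("123", "abc", "2017", "ABC"),
    Array("456", "def", "2001", "ABC"),
    Array("789", "ghi", "2017", "DEF")).toDF("col")

  // Apply syntax: $"col"(2) calls Column.apply with the ordinal.
  val viaApply: Seq[String] = ds.select($"col"(2)).as[String].collect().toSeq

  // Explicit getItem with the same ordinal.
  val viaGetItem: Seq[String] = ds.select($"col".getItem(2)).as[String].collect().toSeq

  def main(args: Array[String]): Unit = {
    println(s"apply:   $viaApply")
    println(s"getItem: $viaGetItem")
    spark.stop()
  }
}
```

Both selections read the element at index 2 of every row, so the two sequences should contain the same values.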

    In your case, where / filter followed by select with distinct gives the proper answer (as @Will showed).
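The where / filter, select, distinct chain could look roughly like the sketch below. The exact condition from the question is not visible here, so this assumes, purely for illustration, that the goal is the distinct values at index 3 among rows whose index 2 equals "2017". The object name, the alias "value", and the fourth sample row are hypothetical additions.

```scala
import org.apache.spark.sql.SparkSession

object ArrayColumnFilter {
  // Local session for illustration only; in spark-shell `spark` already exists.
  val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("array-column-filter")
    .getOrCreate()

  import spark.implicits._

  val ds = Seq(
    Array("123", "abc", "2017", "ABC"),
    Array("456", "def", "2001", "ABC"),
    Array("789", "ghi", "2017", "DEF"),
    // Hypothetical extra row so that distinct has a duplicate to remove.
    Array("000", "jkl", "2017", "ABC")).toDF("col")

  // Assumed condition: keep rows whose element at index 2 is "2017",
  // then project the element at index 3 and drop duplicates.
  val distinctValues: Set[String] = ds
    .where($"col"(2) === "2017")
    .select($"col"(3).as("value"))
    .distinct()
    .as[String]
    .collect()
    .toSet

  def main(args: Array[String]): Unit = {
    println(distinctValues.toSeq.sorted.mkString(", "))
    spark.stop()
  }
}
```

The result is collected into a Set because distinct does not guarantee any particular row order.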
