How to refer to a map column in a Spark SQL query?


Question


scala> val map1 = spark.sql("select map('p1', 's1', 'p2', 's2')")

map1: org.apache.spark.sql.DataFrame = [map(p1, s1, p2, s2): map<string,string>]

scala> map1.show()

+--------------------+
| map(p1, s1, p2, s2)|
+--------------------+
|[p1 -> s1, p2 -> s2]|
+--------------------+

scala> spark.sql("select element_at(map1, 'p1')")

org.apache.spark.sql.AnalysisException: cannot resolve 'map1' given input columns: []; line 1 pos 18; 'Project [unresolvedalias('element_at('map1, p1), None)]

How can we reuse the dataframe map1 in a second SQL query?


Answer 1:


map1 is a dataframe with a single column of type map, and that column carries the auto-generated name map(p1, s1, p2, s2). The second query fails because spark.sql can only resolve tables and views registered in the session catalog; map1 is just a Scala variable, so Spark reports cannot resolve 'map1'. The dataframe itself can be queried, for example with selectExpr:

map1.selectExpr("element_at(`map(p1, s1, p2, s2)`, 'p1')").show()

prints

+-----------------------------------+
|element_at(map(p1, s1, p2, s2), p1)|
+-----------------------------------+
|                                 s1|
+-----------------------------------+
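
Backticks are needed above because the auto-generated column name contains parentheses, spaces, and commas. They can be avoided entirely by giving the map column an alias when it is created — a minimal sketch (the alias my_map is just an illustrative name):

val map2 = spark.sql("select map('p1', 's1', 'p2', 's2') as my_map")
map2.selectExpr("element_at(my_map, 'p1')").show()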

Another option is to register the dataframe as a temporary view and then use a SQL query:

map1.createOrReplaceTempView("map1")
spark.sql("select element_at(`map(p1, s1, p2, s2)`, 'p1') from map1").show()

which prints the same result.
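
For completeness, the same lookup can be done with the DataFrame API instead of a SQL string; element_at has been part of org.apache.spark.sql.functions since Spark 2.4. A minimal sketch:

import org.apache.spark.sql.functions.element_at

// resolve the auto-generated column name directly against the dataframe
map1.select(element_at(map1("map(p1, s1, p2, s2)"), "p1")).show()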



Source: https://stackoverflow.com/questions/64107171/how-to-refer-a-map-column-in-a-spark-sql-query
