How to get Apache Spark to ignore dots in a query?


Given the following JSON file:

[{"dog*woof":"bad dog 1","dog.woof":"bad dog 32"}]

Why does this Java code fail:

DataFrame df = sqlContext.read().json("/path/to/file.json");
df.groupBy("dog.woof").count().show();

1 Answer
  • 2021-01-24 19:24

    It fails because Spark SQL interprets a dot in a column reference as access to a field of a struct column: dog.woof is parsed as field woof inside a column named dog. You can escape such column names using backticks:

    val df = sqlContext.read.json(sc.parallelize(Seq(
       """{"dog*woof":"bad dog 1","dog.woof":"bad dog 32"}"""
    )))
    
    df.groupBy("`dog.woof`").count.show
    // +----------+-----+
    // |  dog.woof|count|
    // +----------+-----+
    // |bad dog 32|    1|
    // +----------+-----+
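
    The dot syntax exists because it is how you reach into nested data. For contrast, here is a sketch where the dotted path is exactly what you want (the owner and name fields are made up for illustration):

    val nested = sqlContext.read.json(sc.parallelize(Seq(
       """{"owner": {"name": "alice"}}"""
    )))

    nested.select("owner.name").show
    // +-----+
    // | name|
    // +-----+
    // |alice|
    // +-----+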
    

    That said, using special characters in column names is not good practice and makes the columns harder to work with in general.
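
    If you control the pipeline, one sketch of a way to avoid the problem entirely is to rename such columns up front with withColumnRenamed, which matches the existing name literally rather than parsing the dot (the target name dog_woof is just an example):

    val renamed = df.withColumnRenamed("dog.woof", "dog_woof")

    renamed.groupBy("dog_woof").count.show
    // +----------+-----+
    // |  dog_woof|count|
    // +----------+-----+
    // |bad dog 32|    1|
    // +----------+-----+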
