How to query JSON data column using Spark DataFrames?

梦毁少年i 2020-11-22 01:50

I have a Cassandra table that for simplicity looks something like:

key: text
jsonData: text
blobData: blob

I can create a basic data frame

5 answers
  •  星月不相逢
    2020-11-22 02:15

    The from_json function is exactly what you're looking for. Your code will look something like:

    import org.apache.spark.sql.functions.{col, from_json}
    import org.apache.spark.sql.types._

    val df = sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "mytable", "keyspace" -> "ks1"))
      .load()

    // Define a StructType that matches the structure of your JSON
    val schema = StructType(Seq(
      StructField("key", StringType, true),
      StructField("value", DoubleType, true)
    ))

    // Replace the string column with a parsed struct column
    val parsed = df.withColumn("jsonData", from_json(col("jsonData"), schema))
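
Once from_json has turned jsonData into a struct column, its fields can be queried with dot notation. The following is a minimal sketch of that step, not a definitive implementation: it assumes Spark 2.1 or later (where from_json and SparkSession are available), uses a small in-memory DataFrame as a stand-in for the Cassandra table, and the object name JsonColumnQuery, the method highValues, the sample rows, and the threshold 10.0 are all hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}

object JsonColumnQuery {
  // Same shape as the schema in the answer: the JSON is assumed to hold "key" and "value".
  val schema: StructType = StructType(Seq(
    StructField("key", StringType, nullable = true),
    StructField("value", DoubleType, nullable = true)
  ))

  // Parses the jsonData column and keeps rows whose JSON "value" exceeds 10.0.
  def highValues(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Hypothetical sample rows standing in for the Cassandra table.
    val df = Seq(
      ("k1", """{"key": "a", "value": 1.5}"""),
      ("k2", """{"key": "b", "value": 42.0}""")
    ).toDF("key", "jsonData")

    // Turn the JSON string column into a struct column.
    val parsed = df.withColumn("jsonData", from_json(col("jsonData"), schema))

    // Nested struct fields are addressed as "column.field".
    parsed
      .filter(col("jsonData.value") > 10.0)
      .select(
        col("key"),
        col("jsonData.key").as("jsonKey"),
        col("jsonData.value").as("jsonValue")
      )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("json-column-query")
      .getOrCreate()
    highValues(spark).show()
    spark.stop()
  }
}
```

With a real table, the in-memory df would be replaced by the DataFrame loaded through the Cassandra connector; the filter and select calls stay the same.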
    
