Now I have JSON data as follows:
{"Id":11,"data":[{"package":"com.browser1","activetime":60000},{"package":"com.browser6","activetime":1205000},{"pa
From your given JSON data, you can view the schema of your DataFrame with printSchema and use it:
appActiveTime.printSchema()
root
 |-- data: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- activetime: long (nullable = true)
 |    |    |-- package: string (nullable = true)
Since you have an array, you need to explode the data and then select the struct fields as below:
import org.apache.spark.sql.functions._

appActiveTime.withColumn("data", explode($"data"))
  .select("data.*")
  .show(false)
Output:
+----------+------------+
|activetime| package|
+----------+------------+
| 60000|com.browser1|
| 1205000|com.browser6|
| 1205000|com.browser7|
| 60000|com.browser1|
| 1205000|com.browser6|
+----------+------------+
Hope this helps!
With @Shankar Koirala's help, I learned how to use explode to handle a JSON array:
import org.apache.spark.sql.functions._

val df = sqlContext.sql("SELECT data FROM behavior")
df.select(explode(df("data"))).toDF("data")
  .select("data.package", "data.activetime")
  .show(false)
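If you also want to keep the top-level Id from the JSON next to each exploded element, a variation along these lines should work (a sketch only, assuming the behavior table still exposes the Id column):
import org.apache.spark.sql.functions._

// Select Id together with the array, explode the array, then pull out the struct fields
sqlContext.sql("SELECT Id, data FROM behavior")
  .withColumn("data", explode(col("data")))
  .select("Id", "data.package", "data.activetime")
  .show(false)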