Question
I am a newbie in Spark, and I have a prototype application based on Spark Structured Streaming in which I am continuously reading stream data from Kafka.
On this stream data, let's say I have to apply multiple aggregations:
1) group by key1 and generate sum and count
2) group by key1 and key2 and generate count
and so on...
If I create the above 2 aggregations as streaming queries, two independent streaming queries are created, each reading from Kafka independently, which is not what I want. Caching the data from Kafka and then performing multiple aggregations doesn't seem to work in Structured Streaming.
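A minimal sketch of what I'm doing (the topic name, bootstrap servers, payload format, and column names like `key1`/`key2` are placeholders, not my real setup):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.appName("MultiAggPrototype").getOrCreate()
import spark.implicits._

// Read the Kafka topic as a stream; assume the value holds a CSV payload "key1,key2,value"
val kafkaDf = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "events")
  .load()

val parsed = kafkaDf
  .selectExpr("CAST(value AS STRING) AS raw")
  .select(
    split($"raw", ",")(0).as("key1"),
    split($"raw", ",")(1).as("key2"),
    split($"raw", ",")(2).cast("double").as("value"))

// Aggregation 1: group by key1, producing sum and count
val agg1 = parsed.groupBy($"key1")
  .agg(sum($"value").as("total"), count("*").as("cnt"))

// Aggregation 2: group by key1 and key2, producing count
val agg2 = parsed.groupBy($"key1", $"key2")
  .agg(count("*").as("cnt"))

// Starting both creates two independent streaming queries,
// each with its own Kafka consumer reading the topic from scratch
val q1 = agg1.writeStream.outputMode("complete").format("console").start()
val q2 = agg2.writeStream.outputMode("complete").format("console").start()
spark.streams.awaitAnyTermination()
```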
What is the best way to do multiple aggregations on streaming data?
Some posts suggest `flatMapGroupsWithState` might work for such a use case, but I can't find any examples of it.
Source: https://stackoverflow.com/questions/51495009/spark-structured-streaming-multiple-aggregation-keys-on-same-stream-data