Spark Structured Streaming: multiple aggregation keys on the same stream data

Submitted by 百般思念 on 2019-12-12 10:58:19

Question


I am new to Spark and have a prototype application based on Spark Structured Streaming in which I continuously read stream data from Kafka.

On this stream data, let's say I have to apply multiple aggregations:

1) Group by key1 and generate a sum and a count
2) Group by key1 and key2 and generate a count, and so on
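
To make the two aggregations concrete, here is a minimal plain-Python sketch (no Spark involved) that computes both in a single pass over the same records. The field name value for the summed column and the sample records are assumptions for illustration; the question does not say which column is summed.

```python
from collections import defaultdict


def aggregate(records):
    """Compute both aggregations in one pass over the same records."""
    by_key1 = defaultdict(lambda: {"sum": 0, "count": 0})  # aggregation 1
    by_key1_key2 = defaultdict(int)                        # aggregation 2
    for rec in records:
        stats = by_key1[rec["key1"]]
        stats["sum"] += rec["value"]  # "value" is an assumed field name
        stats["count"] += 1
        by_key1_key2[(rec["key1"], rec["key2"])] += 1
    return dict(by_key1), dict(by_key1_key2)


# Made-up sample records standing in for messages read from Kafka.
records = [
    {"key1": "a", "key2": "x", "value": 10},
    {"key1": "a", "key2": "y", "value": 5},
    {"key1": "b", "key2": "x", "value": 7},
    {"key1": "a", "key2": "x", "value": 1},
]

by_key1, by_key1_key2 = aggregate(records)
print(by_key1)
print(by_key1_key2)
```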

If I create the two aggregations above as streaming queries, two independent streaming queries are created, each reading from Kafka independently, which is not what I want. Caching the data from Kafka and then performing multiple aggregations doesn't seem to work in Structured Streaming.
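
One possible way around this, assuming Spark 2.4 or later, is foreachBatch: a single streaming query hands each micro-batch to a function as an ordinary DataFrame, which can be persisted and aggregated more than once. The PySpark sketch below is an illustration under assumptions, not a confirmed answer to the question: the broker address, topic name, JSON payload with fields key1, key2 and value, and the output and checkpoint paths are all hypothetical. Note also that these aggregates cover one micro-batch at a time; running totals across batches would need a sink that merges results or a stateful operator.

```python
# Hypothetical output locations; replace with real sinks.
KEY1_OUT = "/tmp/agg_by_key1"
KEY1_KEY2_OUT = "/tmp/agg_by_key1_key2"


def write_aggregations(batch_df, batch_id):
    """Run both aggregations on one micro-batch (a static DataFrame)."""
    batch_df.persist()  # reuse the same batch for both aggregations
    try:
        # 1) group by key1: sum and count, renamed because Parquet
        #    rejects column names containing parentheses
        (batch_df.groupBy("key1")
            .agg({"value": "sum", "*": "count"})
            .withColumnRenamed("sum(value)", "value_sum")
            .withColumnRenamed("count(1)", "row_count")
            .write.mode("append").parquet(KEY1_OUT))
        # 2) group by key1 and key2: count
        (batch_df.groupBy("key1", "key2")
            .count()
            .write.mode("append").parquet(KEY1_KEY2_OUT))
    finally:
        batch_df.unpersist()


def start_query(spark):
    """Start one streaming query, so Kafka is read once per micro-batch."""
    events = (spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
        .option("subscribe", "events")                         # assumed topic
        .load()
        # Assumed payload: JSON with fields key1, key2 and value.
        .selectExpr(
            "get_json_object(CAST(value AS STRING), '$.key1') AS key1",
            "get_json_object(CAST(value AS STRING), '$.key2') AS key2",
            "CAST(get_json_object(CAST(value AS STRING), '$.value') AS DOUBLE) AS value",
        ))
    return (events.writeStream
        .foreachBatch(write_aggregations)
        .option("checkpointLocation", "/tmp/agg_checkpoint")  # assumed path
        .start())
```

Calling start_query(spark) with an existing SparkSession (and the Kafka connector package on the classpath) would start the query; awaitTermination() on the returned handle keeps it running.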

What is the best way to perform multiple aggregations on streaming data?

Some posts suggest that flatMapGroupsWithState might work for this use case, but I can't find any examples of it.

Source: https://stackoverflow.com/questions/51495009/spark-structured-streaming-multiple-aggregation-keys-on-same-stream-data
