Question
I wish to build a Spark Structured Streaming job that does something like the following (look up a huge, non-static dataset):
- Read from Kafka (JSON records)
- For each JSON record:
  - Get {user_key}
  - Read from a huge, non-static Phoenix table, filtered by {user_key}
  - Apply further DataFrame transformations
  - Write to another Phoenix table
How can I look up a high-volume, non-static dataset per Kafka message? A rough sketch of the intended pipeline is below.
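Here is a minimal sketch of what I have in mind, assuming the phoenix-spark connector is used for reads and writes and that the lookup is expressed as a per-micro-batch join on `user_key` inside `foreachBatch`. The broker addresses, topic, table names, `zkUrl`, checkpoint path, and the record schema are all placeholders, and the Phoenix format name and options may differ depending on the connector version:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructType}

object KafkaPhoenixLookupJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-phoenix-lookup")
      .getOrCreate()

    // Schema of the incoming JSON records (illustrative; only user_key matters here).
    val recordSchema = new StructType()
      .add("user_key", StringType)
      .add("payload", StringType)

    // 1. Read from Kafka and parse each value as JSON.
    val kafkaStream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // placeholder
      .option("subscribe", "input-topic")               // placeholder
      .load()
      .select(from_json(col("value").cast("string"), recordSchema).as("record"))
      .select("record.*")

    // 2-4. Per micro-batch: re-read the (non-static) Phoenix table, join on
    // user_key, apply further transformations, and write back to Phoenix.
    val processBatch: (DataFrame, Long) => Unit = (batchDF, _) => {
      // Re-reading every batch keeps the lookup data fresh,
      // but it scans the huge table each time.
      val lookupDF = spark.read
        .format("phoenix")                    // format name varies by connector version
        .option("table", "HUGE_LOOKUP_TABLE") // placeholder
        .option("zkUrl", "zookeeper:2181")    // placeholder
        .load()

      // Assumes both sides expose a column named user_key.
      val enriched = batchDF.join(lookupDF, Seq("user_key"))
      // ... further DataFrame transformations ...

      enriched.write
        .format("phoenix")
        .option("table", "OUTPUT_TABLE")      // placeholder
        .option("zkUrl", "zookeeper:2181")
        .mode("overwrite")                    // phoenix-spark treats overwrite as upsert
        .save()
    }

    val query = kafkaStream.writeStream
      .option("checkpointLocation", "/tmp/checkpoints/kafka-phoenix-lookup") // placeholder
      .foreachBatch(processBatch)
      .start()

    query.awaitTermination()
  }
}
```

The per-batch re-read of the Phoenix table is what keeps the lookup data non-static, but it is also the expensive part this question is about.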
Source: https://stackoverflow.com/questions/62421785/spark-structured-streaming-ways-to-lookup-high-volume-non-static-dataset