Why can I only see one Spark Streaming KafkaReceiver?

好久不见 · Submitted on 2020-01-01 12:41:06

Question


I'm confused about why I can only see one KafkaReceiver on the Spark web UI page (8080), even though I have 10 partitions in Kafka and 10 cores in the Spark cluster. My Python code is as follows: kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", {topic: 10}) I would expect the number of KafkaReceivers to be 10 rather than 1. Thank you in advance!


Answer 1:


kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer",{topic: 10})

That code creates 1 receiver with 10 threads. Each thread attaches to one partition, but all the data is pulled by a single consumer running on one core. The remaining cores will (potentially) process the received data.

If you instead want 10 receivers, each attached to one partition and each using one core, you should do this (in Scala; my Python is weak, but you get the idea):

val recvs = (1 to 10).map(i => KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", Map(topic -> 1)))
val kafkaData = ssc.union(recvs)

Take into account that you will need additional cores for Spark to process the received data.
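Since the question was asked in Python, here is a minimal sketch of the same pattern in PySpark terms. The helper `one_receiver_per_partition` is a hypothetical name introduced here for illustration; `make_stream` stands in for a call like `lambda n: KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", {topic: n})`, so the helper itself can be exercised without a Spark cluster.

```python
def one_receiver_per_partition(make_stream, num_partitions):
    """Build one receiver per partition, each asking for a single consumer
    thread, instead of one receiver with num_partitions threads."""
    return [make_stream(1) for _ in range(num_partitions)]

# With a real StreamingContext you would then merge the resulting DStreams,
# e.g. (assuming PySpark's StreamingContext.union, which takes *dstreams):
#   streams = one_receiver_per_partition(
#       lambda n: KafkaUtils.createStream(ssc, zkQuorum,
#                                         "spark-streaming-consumer",
#                                         {topic: n}),
#       10)
#   kafka_data = ssc.union(*streams)
```

Each `createStream` call here requests a single consumer thread (`{topic: 1}`), so Spark schedules one receiver per call, mirroring the Scala snippet above.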



Source: https://stackoverflow.com/questions/31079655/why-i-only-can-see-one-spark-streaming-kafkareceiver
