Why is producer in pykafka so slow?

Submitted by 前提是你 on 2019-12-24 01:33:26

Question


I wrote a simple producer using pykafka, but I can't get it to perform well. The basic producer and the call to produce are below. When I call this 100 times with a small message, with some timing/profiling code added, it takes about 14 seconds. I understand this to be an asynchronous send, so I would expect it to be very fast. Is there some setting I'm missing? I've also tried min_queued_messages=1, and that takes about 2 seconds longer.

from pykafka import KafkaClient
import time

client = KafkaClient(hosts="kafka1.mydomain.com:9092", exclude_internal_topics=False)
topic = client.topics['mytopic']

start = time.time()

for x in xrange(100):
    with topic.get_producer(delivery_reports=False,
                            sync=True,
                            linger_ms=0) as producer:
        producer.produce("This is a message")

end = time.time()
print "Execution Time (ms): %s" % round((end - start) * 1000)

I also profiled this in PyCharm, and it says that 78.8% of the time is spent in "time.sleep". Why would it be sleeping?


Answer 1:


The topic.get_producer call is meant to be made once, at the beginning of the producer's lifespan. Calling it in a tight loop, as your example code does, runs the initialization sequence repeatedly, which is unnecessary and adds a lot of overhead. Your code would run faster if it were changed to the following:

with topic.get_producer(delivery_reports=False,
                        sync=True,
                        linger_ms=0) as producer:
    for x in xrange(100):
        producer.produce("This is a message")
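
To see why moving get_producer out of the loop matters, here is a minimal sketch that needs no Kafka broker. FakeProducer, its setup_seconds parameter, and the helper functions are hypothetical stand-ins invented for illustration; they are not part of pykafka, and the sleep only imitates whatever startup work a real producer does. The sketch counts how often that setup runs under each pattern.

```python
import time


class FakeProducer:
    """Hypothetical stand-in for a producer; NOT the pykafka API."""

    setup_count = 0  # how many times the simulated initialization ran

    def __init__(self, setup_seconds=0.001):
        self.setup_seconds = setup_seconds  # assumed cost of one startup
        self.sent = []

    def __enter__(self):
        # Simulate the initialization sequence with a short sleep.
        time.sleep(self.setup_seconds)
        FakeProducer.setup_count += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        # Nothing to clean up in the fake; do not swallow exceptions.
        return False

    def produce(self, message):
        self.sent.append(message)


def producer_per_message(n):
    """Pattern from the question: a new producer for every message."""
    for _ in range(n):
        with FakeProducer() as producer:
            producer.produce(b"This is a message")


def shared_producer(n):
    """Pattern from the answer: one producer reused for all messages."""
    with FakeProducer() as producer:
        for _ in range(n):
            producer.produce(b"This is a message")


def measure(fn, n):
    """Return (number of setups, elapsed seconds) for one pattern."""
    FakeProducer.setup_count = 0
    start = time.perf_counter()
    fn(n)
    return FakeProducer.setup_count, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (producer_per_message, shared_producer):
        setups, elapsed = measure(fn, 100)
        print("%s: %d setups, %.1f ms" % (fn.__name__, setups, elapsed * 1000))
```

With 100 messages, the first pattern pays the setup cost 100 times and the second pays it once. The absolute timings here are artificial; only the ratio of setups carries over to a real producer.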


Source: https://stackoverflow.com/questions/49616402/why-is-producer-in-pykafka-so-slow
