Should I put my events inside a queue after getting them from Azure Event Hub?

Posted by 拜拜、爱过 on 2019-12-05 04:51:15

This is one of those questions whose answer will be much more obvious once the source for EventProcessorHost is made available, which I've been told is going to happen.

The short answer is that you don't need to use a queue; however, I would keep the time it takes ProcessEventsAsync to return a Task relatively short.

While this advice sounds a lot like that of the first article, the key distinction is that what matters is the time until a Task is returned, not the time until that Task completes. My assumption has been that ProcessEventsAsync is called on a thread that the EventProcessorHost also uses for other purposes. In that case you need to return quickly so that the other work can continue; that work might include calling ProcessEventsAsync for another partition. (We won't know for sure without debugging, which I haven't found necessary, or reading the code when it becomes available.)

I do my processing on a separate thread per partition by passing along the entire IEnumerable from ProcessEventsAsync. This is in contrast to taking all the items out of the IEnumerable and putting them into a Queue for the processing thread to consume. The other thread completes the Task returned by ProcessEventsAsync when it has finished processing the messages. (I actually give my processing thread a single IEnumerable which hides the details of ProcessEventsAsync by chaining the chunks together and, if needed, completing the Task on a call to MoveNext.)
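The chaining idea in the parenthetical above can be sketched roughly as follows. This is a minimal illustration in plain Java, not the author's code and not any Event Hubs client API: `ChainedEvents`, `Chunk`, and `offerChunk` are hypothetical names, `CompletableFuture` stands in for the .NET Task, and `Iterator.hasNext` plays the role of MoveNext. It assumes a single processing thread that finishes handling each event before asking for the next one.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper, not part of any Event Hubs client library.
final class ChainedEvents<E> implements Iterator<E> {
    // One received batch plus the future handed back to the receive callback.
    private static final class Chunk<E> {
        final Iterator<E> items;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Chunk(Iterator<E> items) { this.items = items; }
    }

    private final BlockingQueue<Chunk<E>> chunks = new LinkedBlockingQueue<>();
    private Chunk<E> current; // touched only by the processing thread

    // Receive side: hand over a whole batch without copying its items and
    // return at once. The future completes once the batch has been consumed.
    CompletableFuture<Void> offerChunk(Iterable<E> batch) {
        Chunk<E> chunk = new Chunk<>(batch.iterator());
        chunks.add(chunk);
        return chunk.done;
    }

    // Processing side: when the current chunk is exhausted, complete its
    // future and block until the next chunk arrives.
    @Override
    public boolean hasNext() {
        while (current == null || !current.items.hasNext()) {
            if (current != null) current.done.complete(null);
            try {
                current = chunks.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    @Override
    public E next() {
        if (!hasNext()) throw new NoSuchElementException();
        return current.items.next();
    }
}
```

In this sketch a batch's future completes only when the processing thread asks for the item after the batch's last one, so the last event has already been handled by then. The chunk queue itself is unbounded; it stays small only if the caller waits on the returned future before delivering another batch.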

So in short: in ProcessEventsAsync, hand off the work to another thread, either one you already have lying around and know how to communicate with, or a new Task kicked off with the TPL.
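As a rough sketch of that hand-off, here is the same shape in plain Java, with `CompletableFuture` standing in for the .NET Task. `HandOffProcessor` and its `processEventsAsync` method are hypothetical names chosen to mirror the discussion; this is not the EventProcessorHost API, and a single-threaded executor per partition is an assumption made to keep events in order.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical stand-in for a per-partition event processor.
final class HandOffProcessor<E> {
    // One worker thread, so batches for this partition are handled in order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Consumer<E> handler;

    HandOffProcessor(Consumer<E> handler) {
        this.handler = handler;
    }

    // Returns immediately. The returned future completes only after the
    // worker thread has run the handler on every event in the batch.
    CompletableFuture<Void> processEventsAsync(List<E> batch) {
        return CompletableFuture.runAsync(() -> batch.forEach(handler), worker);
    }

    // Stops accepting new batches; batches already submitted still run.
    void close() {
        worker.shutdown();
    }
}
```

The point of the sketch is the split between the two durations: the call returns as soon as the batch is submitted, while the future tracks when processing actually finishes.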

Putting all the messages into a Queue inside of ProcessEventsAsync isn't bad; it's just not the most efficient way to pass the chunk of events to another thread.

If you decide to put the events into a queue (or have a queue downstream in your processing code) and complete the Task for the batch, make sure you limit the number of items you have outstanding in your code or queue. Otherwise you can run out of memory when the Event Hub gives you items faster than your code can process them, for example during a traffic spike.
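One way to enforce such a limit is a bounded queue, sketched below in plain Java. `BoundedEventBuffer` is a hypothetical helper, not a library class, and the sketch assumes the receiving thread is allowed to block (as with a callback that runs on its own thread per partition). Where the callback must return quickly, the same limit would have to be applied by delaying completion of the batch's Task instead of blocking.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper that caps the number of events held in memory.
final class BoundedEventBuffer<E> {
    private final BlockingQueue<E> queue;

    BoundedEventBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Receive side: blocks while the buffer is full, which slows intake
    // instead of letting the backlog grow without limit.
    void enqueueAll(Iterable<E> batch) throws InterruptedException {
        for (E event : batch) {
            queue.put(event);
        }
    }

    // Processing side: blocks until an event is available.
    E take() throws InterruptedException {
        return queue.take();
    }

    // Number of events currently buffered; never exceeds the capacity.
    int size() {
        return queue.size();
    }
}
```

The capacity is the knob: it trades memory for how much of a traffic spike can be absorbed before the receive side is slowed down.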

Note for Java EventHub users (2016-10-27): Since this came to my attention, there is now a description of how onEvents is called. While a slow onEvents won't be tragic, since it runs on a thread per partition, its speed appears to affect how quickly the next batch is received. So, depending on how much you care about latency, being quite fast here could be relatively important for your scenario.
