Route events to eventhub EventProcessor


Great Question!

Before answering, I want to reiterate a couple of principles we followed while building Event Hubs.

  • We wanted Event Hubs to be a highly durable, high-throughput event ingestion pipeline. The major differentiating factor for building a new service, when Azure already had pub-sub services like Queues/Topics (similar to AWS SQS and Google Pub/Sub), was to provide a higher-throughput variant (and, of course, with low latency). We were able to deliver on this goal, with the trade-off that we don't perform any per-message computation on the service, such as evaluating a filter. When you need per-message semantics - de-duplication per message, acknowledging receipt per message, or, in your case, filtering on a property per message - and the throughput requirements are low, a Queue/Topic might be your best bet.

  • We also envisioned that senders (or publishers) operate at a much higher scale and vary significantly by scenario, so we introduced three sending patterns: Send, Send with PartitionKey, and Send directly to a partition (see the sketch after this list). While sending, you will notice the notion of a PartitionKey, which in turn translates to a particular partition (think of the PartitionKey as a clue to the Event Hubs service for placing all events with the same PartitionKey on the same partition). While consuming events, however, Event Hubs does not directly expose any notion of PartitionKey, and there is no relation between consumer groups and PartitionKey.

  • Receivers, on the other hand, are usually just the computation roles and are limited in number, so we exposed one generic receive (consume) pattern: receive from a partition. While consuming events, there might be different types of consumers based on different factors - for example, the speed of consumption (real-time vs. historical) or the type of data - and hence we exposed multiple consumer groups. Although you can create 20 consumer groups, one interesting limitation here is that each throughput unit purchased yields 1 MB/s in and 2 MB/s out; if the send side fully uses its 1 MB/s of ingress, the 2 MB/s of egress only leaves room for 2 consumer groups to each read the full stream in real time. So, if you are processing the exact same stream and have different ways to handle each event, but each of them takes an equal amount of time to process, then using the same consumer group makes more sense.
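
To make the three sending patterns concrete, here is a minimal C# sketch using the Microsoft.Azure.EventHubs client. The connection string, partition id and the "EventType" application property are placeholders of my own choosing, not anything the service requires.

```csharp
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.EventHubs;

class SendPatternsSketch
{
    // Placeholder connection string; it must include EntityPath=<your-event-hub>.
    const string EventHubConnectionString =
        "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key>;EntityPath=<event-hub>";

    static async Task Main()
    {
        var client = EventHubClient.CreateFromConnectionString(EventHubConnectionString);

        var eventData = new EventData(Encoding.UTF8.GetBytes("{ \"value\": 42 }"));
        eventData.Properties["EventType"] = "Telemetry"; // application property, used later for client-side routing

        // 1. Plain Send: the service decides which partition the event lands on.
        await client.SendAsync(eventData);

        // 2. Send with a PartitionKey: all events carrying the same key land on the same partition.
        var keyedEvent = new EventData(Encoding.UTF8.GetBytes("{ \"value\": 43 }"));
        await client.SendAsync(keyedEvent, "device-17");

        // 3. Send directly to a specific partition: you take over placement decisions entirely.
        PartitionSender partitionSender = client.CreatePartitionSender("0");
        await partitionSender.SendAsync(new EventData(Encoding.UTF8.GetBytes("{ \"value\": 44 }")));

        await partitionSender.CloseAsync();
        await client.CloseAsync();
    }
}
```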

To answer your question: IT REALLY DEPENDS.

Here are a few solutions:

  • Since there is a mix of event types in your scenario, you will need to foresee/decide whether you have any scenario where a single consumer/processor needs to read and process all types of events. One example we usually see: one consumer group counts all errors, while another consumer group performs a specific action per error type. If you don't need that, sending each event type to a different event hub and then using one consumer group with the type-specific IEventProcessor is an option (see the first sketch after this list).

  • If you have scenarios where all events need to be sent to the same event hub, and you know that the processing speed of some event types is (or needs to be) very fast, you should consider using different consumer groups, with each consumer group tied to a specific IEventProcessor implementation that ignores the other event types. For example, if the ErrorInfo and Special events need attention in real time, and the telemetry data can tolerate a 15-minute delay during slow processing or peak load, I would create one consumer group, name it Real-time, and tie it to an IEventProcessor which handles the two types Error and Special; then create a second consumer group and tie it to an IEventProcessor which handles the Telemetry events (see the second sketch after this list).
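
For the first option, the routing decision lives entirely on the send side. Below is a minimal sketch under the assumption that you create one event hub per event type; the hub names and connection strings are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.EventHubs;

class PerTypeHubRouter
{
    // One client per event type; the hub names here are made up for illustration.
    readonly Dictionary<string, EventHubClient> _clientsByType = new Dictionary<string, EventHubClient>
    {
        ["Telemetry"] = EventHubClient.CreateFromConnectionString("<conn-string-with-EntityPath=telemetry-hub>"),
        ["Error"]     = EventHubClient.CreateFromConnectionString("<conn-string-with-EntityPath=error-hub>"),
        ["Special"]   = EventHubClient.CreateFromConnectionString("<conn-string-with-EntityPath=special-hub>")
    };

    public Task SendAsync(string eventType, string jsonPayload)
    {
        // Pick the event hub for this type; each hub then gets one consumer group
        // and one IEventProcessor that only ever sees that single event type.
        EventHubClient client = _clientsByType[eventType];
        return client.SendAsync(new EventData(Encoding.UTF8.GetBytes(jsonPayload)));
    }
}
```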

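For the second option, here is a rough sketch of one of the IEventProcessor implementations and its EventProcessorHost registration using Microsoft.Azure.EventHubs.Processor. It assumes the events carry an "EventType" application property (set at send time, as in the send-pattern sketch above); the consumer group name, storage connection string and lease container name are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.EventHubs.Processor;

// Bound to the "Real-time" consumer group: handles only Error and Special events, skips the rest.
class RealTimeProcessor : IEventProcessor
{
    static readonly HashSet<string> HandledTypes = new HashSet<string> { "Error", "Special" };

    public Task OpenAsync(PartitionContext context) => Task.CompletedTask;

    public Task CloseAsync(PartitionContext context, CloseReason reason) => Task.CompletedTask;

    public Task ProcessErrorAsync(PartitionContext context, Exception error) => Task.CompletedTask;

    public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
    {
        foreach (var eventData in messages.Where(e =>
                     e.Properties.TryGetValue("EventType", out var type) && HandledTypes.Contains((string)type)))
        {
            // Real-time handling of Error/Special events goes here.
        }
        await context.CheckpointAsync();
    }
}

// A sibling TelemetryProcessor (omitted) would look the same but keep only "Telemetry" events.

class Program
{
    static async Task Main()
    {
        // All names below are placeholders.
        var realTimeHost = new EventProcessorHost(
            "my-event-hub",                      // event hub path
            "realtime",                          // consumer group for Error + Special
            "<event-hub-connection-string>",
            "<storage-connection-string>",       // blob storage used for leases/checkpoints
            "realtime-leases");                  // lease container name

        await realTimeHost.RegisterEventProcessorAsync<RealTimeProcessor>();

        // A second EventProcessorHost, bound to a "telemetry" consumer group and the
        // TelemetryProcessor, would be registered the same way.

        Console.ReadLine();
        await realTimeHost.UnregisterEventProcessorAsync();
    }
}
```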