Azure eventhub multiple partition key points to same partition

筅森魡賤 submitted on 2019-12-12 21:20:04

Question


We are working on a multi-tenant application in which a single Event Hub is shared among tenants. We plan to distribute the partitions among our tenants so that every tenant sends messages to a different partition, and we want to authenticate tenants at the partition level. As described on the Microsoft site, we defined the partition key based on the tenant ID. The problem is that more than one partition key ends up sending messages to the same partition, which we did not expect. Ideally, every partition key should map to a different partition.

        var serviceNamespace = "namespace name here";
        var hubName = "hub name here";
        var deviceName = "device name here";
        var sasToken = "SAS TOKEN HERE";

        Mymessage subGroup1 = CreateMessage();

        // Connect over AMQP, authenticating with the SAS token.
        var factory = MessagingFactory.Create(ServiceBusEnvironment.CreateServiceUri("sb", serviceNamespace, ""), new MessagingFactorySettings
        {
            TokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider(sasToken),
            TransportType = TransportType.Amqp
        });
        // Address the hub through a publisher endpoint: <hub>/publishers/<device>.
        var client = factory.CreateEventHubClient(String.Format("{0}/publishers/{1}", hubName, deviceName));

        // Serialize the message to JSON and wrap it in an event.
        var data = new EventData(Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(subGroup1)));
        // The partition key we expected to select a dedicated partition.
        data.PartitionKey = "jeep";

        client.Send(data);

Please help me understand what is wrong with my approach.


Answer 1:


Quite simply, the effectively infinite space of possible string partition keys is mapped onto the very finite set of partitions within the Event Hub. Unless you asked Microsoft for more, you have at most 32 partitions in your Event Hub. The partition key is hashed, and the hash space is divided among the partitions. This provides the guarantee given in the documentation:

Event Hubs ensures that any and all events sharing the same partition key value are delivered in order, and to the same partition. Importantly, if partition keys are used with publisher policies, described in the next section, then the identity of the publisher and the value of the partition key must match.
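To make the hashing argument concrete, here is a minimal sketch of mapping keys onto a fixed number of partitions. It is an illustration only: the FNV-1a hash, the `PartitionSketch` class, and the `tenant-N` key names are assumptions of mine, and this is not the hash function Event Hubs uses internally. It shows that the same key always lands on the same partition, while 33 distinct keys cannot all get their own partition out of 32.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PartitionSketch
{
    // FNV-1a, used purely as a deterministic stand-in hash.
    // Assumption: this is NOT the hash Event Hubs uses internally.
    public static uint Fnv1a(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619; // wraps around on overflow
            }
            return hash;
        }
    }

    // Map a key onto one of partitionCount partitions.
    public static int ToPartition(string key, int partitionCount)
    {
        return (int)(Fnv1a(key) % (uint)partitionCount);
    }

    public static void Main()
    {
        const int partitionCount = 32;
        var keysByPartition = new Dictionary<int, List<string>>();

        // 33 hypothetical tenant keys for 32 partitions: at least two must collide.
        for (int i = 0; i < 33; i++)
        {
            string key = "tenant-" + i;
            int partition = ToPartition(key, partitionCount);
            if (!keysByPartition.TryGetValue(partition, out var keys))
            {
                keys = new List<string>();
                keysByPartition[partition] = keys;
            }
            keys.Add(key);
        }

        // Print every partition that received more than one key.
        foreach (var pair in keysByPartition)
        {
            if (pair.Value.Count > 1)
            {
                Console.WriteLine($"partition {pair.Key}: {string.Join(", ", pair.Value)}");
            }
        }
    }
}
```

With a well-mixed hash, collisions typically show up long before the number of keys reaches the number of partitions, so the effect in the question does not require many tenants.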

What is not guaranteed, and could not be guaranteed in this sort of system with good performance, is that each partition key goes to a different partition. Some of this is discussed in this question. With publisher policies you also know that:

When using publisher policies, the PartitionKey value is set to the publisher name. In order to work properly, these values must match.

which means all events from a single publisher go to a single partition. Personally, I don't think that's always a good thing because you end up with a hard cap of one throughput unit (less if you're unlucky in hashing) per publisher.

If you need each customer's data separated into different partitions, with the separation enforced by the credentials the customer uses to talk directly to the Event Hub, I think your only option may be to use multiple Event Hubs. I believe (in the sense that I haven't examined our bill yet) that Event Hubs within the same Service Bus namespace share throughput units.

However, if you just need your consumer to be able to tell which publisher/customer an event came from, then I believe you can just use EventData.PartitionKey, which is guaranteed to be the publisher name as documented above.
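As a rough sketch of that consumer-side approach, the snippet below groups a batch of received events by their partition key. `ReceivedEvent` and `TenantDemux` are hypothetical stand-ins written for this example so that it runs without the Service Bus SDK; in a real consumer you would read the same value from EventData.PartitionKey.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a received event; models only the fields used here.
public sealed class ReceivedEvent
{
    public string PartitionKey { get; }
    public string Body { get; }

    public ReceivedEvent(string partitionKey, string body)
    {
        PartitionKey = partitionKey;
        Body = body;
    }
}

public static class TenantDemux
{
    // Group event bodies by the publisher/tenant name carried in PartitionKey.
    public static Dictionary<string, List<string>> GroupByTenant(IEnumerable<ReceivedEvent> events)
    {
        return events
            .GroupBy(e => e.PartitionKey)
            .ToDictionary(g => g.Key, g => g.Select(e => e.Body).ToList());
    }

    public static void Main()
    {
        // Events from two tenants that happen to share one partition.
        var batch = new List<ReceivedEvent>
        {
            new ReceivedEvent("tenant-a", "a1"),
            new ReceivedEvent("tenant-b", "b1"),
            new ReceivedEvent("tenant-a", "a2"),
        };

        foreach (var pair in GroupByTenant(batch))
        {
            Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
        }
    }
}
```

This separates tenants logically on the consumer side only; it does not give each tenant a dedicated partition.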



Source: https://stackoverflow.com/questions/28292330/azure-eventhub-multiple-partition-key-points-to-same-partition
