Does Azure Stream Analytics read data coming from all partitions?

若如初见. Submitted on 2019-12-11 04:25:13

Question


Azure Event Hubs has a partition feature for scalability. When reading data from an app service, one EventProcessorHost can be tied to only one partition, so there is no way to act collectively on data coming from multiple partitions. With Stream Analytics, however, we can aggregate data based on time. So, does it take all the partitions into account when aggregating the data? That is, if readings are sent to 8 partitions, the aggregate should include all of those readings in the calculation. Thanks


Answer 1:


Yes. Based on the documentation, there are a couple of scenarios.

When the output also supports partitioning, as another Event Hub does, you can use Partition By:

you must make sure that your query is partitioned. This requires you to use Partition By in all the steps. Multiple steps are allowed, but they all must be partitioned by the same key. Currently, the partitioning key must be set to PartitionId in order for the job to be fully parallel.

When the output does not support partitioning (like Power BI), data is read without regard to the partition it originated from (and so it will be read from all partitions).
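
To make the difference between the two scenarios concrete, here is a minimal Python sketch. It is only a simulation, not Stream Analytics code: the event tuples, the 60-second window, and the function names are all made up for illustration. The first function aggregates across every partition, and the second keeps partitions separate, roughly what grouping under Partition By PartitionId would give you.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical tumbling-window size, chosen for illustration


def window_start(ts):
    """Return the start of the tumbling window that contains timestamp ts."""
    return ts - ts % WINDOW_SECONDS


def aggregate_all_partitions(events):
    """Sum readings per window, ignoring which partition each event came from."""
    totals = defaultdict(float)
    for _partition_id, ts, value in events:
        totals[window_start(ts)] += value
    return dict(totals)


def aggregate_per_partition(events):
    """Sum readings per (partition, window), keeping partitions separate."""
    totals = defaultdict(float)
    for partition_id, ts, value in events:
        totals[(partition_id, window_start(ts))] += value
    return dict(totals)


# Made-up events: (partition_id, timestamp_in_seconds, reading)
events = [
    (0, 5, 1.0),
    (1, 10, 2.0),
    (0, 65, 4.0),
    (1, 70, 8.0),
]

merged = aggregate_all_partitions(events)
separate = aggregate_per_partition(events)
```

With these sample events, `merged` has one total per window that includes readings from both partitions, while `separate` has one total per partition per window.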




Answer 2:


If you don't use PARTITION BY PartitionId, data from all input partitions will be merged before the aggregation. Events will be ordered by timestamp (either arrival time or application time). This does mean that a lack of data in one partition can block the result; how long it blocks is controlled by the late arrival window.

This page (https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-out-of-order-and-late-events) has additional details about the late arrival window, with examples.
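
As a rough illustration of the blocking behaviour, here is a toy Python model. It is an assumption-laden sketch, not how Stream Analytics is actually implemented: the watermark formula, the function names, and all the numbers are invented for the example. The idea is that the merged stream can only advance as far as its slowest partition, and an idle partition is assumed to have caught up to "now minus the late arrival window".

```python
def merged_watermark(latest_seen, now, late_arrival_window):
    """Toy watermark for a merged stream (simplified assumption).

    latest_seen maps partition id -> newest event timestamp seen there.
    A partition that has gone quiet is treated as having progressed to
    now - late_arrival_window, so it cannot hold results back forever.
    """
    return min(
        max(seen, now - late_arrival_window)
        for seen in latest_seen.values()
    )


def can_emit(window_end, latest_seen, now, late_arrival_window):
    """A window's result is released once the watermark reaches its end."""
    return merged_watermark(latest_seen, now, late_arrival_window) >= window_end


# Hypothetical state: partitions 0 and 1 are active, partition 2 is idle.
latest_seen = {0: 95, 1: 95, 2: 0}
now = 100

# With a 30-second late arrival window, the idle partition holds the
# watermark at 70, so the window ending at 90 is still blocked.
blocked = can_emit(90, latest_seen, now, late_arrival_window=30)

# With a 5-second late arrival window, the watermark reaches 95 and
# the same window can be emitted.
released = can_emit(90, latest_seen, now, late_arrival_window=5)
```

In this model, shrinking the late arrival window shortens how long a quiet partition can delay the aggregate, at the cost of tolerating less lateness.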



Source: https://stackoverflow.com/questions/46129842/does-azure-stream-analytics-read-data-coming-from-all-partitions
