Storm vs. Trident: When not to use Trident?

Posted by 爷,独闯天下 on 2019-11-28 15:50:07

Question


I'm working with Storm and it is fine for a lot of use cases. Recently I had a look at Trident, which is a high-level abstraction of Storm. It supports exactly-once processing and makes stateful processing easier.

But now I'm wondering: why can't I always use Trident instead of plain Storm?

What I read so far:

  • Trident processes messages in batches, so the latency per message could be higher.
  • Trident is not yet able to process loops in topologies.

Are there any other disadvantages when using Trident instead of Storm? Because right now, I think the disadvantages I listed above are marginal.

What use cases cannot be implemented with Trident?


Aftermath:

Since I asked the question, my company decided to go for Trident first. We will only use pure Storm when there are performance problems. Sadly, this wasn't an active decision; it just became the default behavior (I wasn't around at that time).

Their assumption was that in most use cases we need state or exactly-once processing, or will need it in the near future. I understand their reasoning, because moving from Storm to Trident or back isn't an easy transformation. In my personal opinion, though, the concept of stream processing without state wasn't understood by everyone, and that was the main reason to use Trident.


Answer 1:


To answer your question: when shouldn't you use Trident? Whenever you can afford not to.

Trident adds complexity to a Storm topology, lowers performance and generates state. Ask yourself the question: do you need the "exactly once" processing semantics of Trident, or can you live with the "at least once" processing semantics of Storm? For exactly once, use Trident; otherwise don't.

I would also just like to highlight the fact that Storm guarantees that all messages will be processed. Some messages might just be processed more than once.
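To make the difference concrete, here is a minimal sketch in plain Java (not the Storm or Trident API) of what happens when a batch is replayed after a failure. The class and method names are hypothetical. The transactional counter assumes, as a simplification, that each batch carries an increasing transaction id and that a replayed batch contains the same tuples; it also keeps a single id for the whole state rather than one per key.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation; none of these classes exist in Storm or Trident.
public class DeliverySemanticsSketch {

    /** At-least-once: every delivery is applied, so a replay is counted again. */
    public static final class AtLeastOnceCounter {
        private final Map<String, Long> counts = new HashMap<>();

        public void applyBatch(List<String> words) {
            for (String word : words) {
                counts.merge(word, 1L, Long::sum);
            }
        }

        public long get(String word) {
            return counts.getOrDefault(word, 0L);
        }
    }

    /** Exactly-once effect: remember the last applied transaction id and skip replays. */
    public static final class TransactionalCounter {
        private final Map<String, Long> counts = new HashMap<>();
        private long lastTxId = -1L; // assumption: real transaction ids start at 0 or higher

        public void applyBatch(long txId, List<String> words) {
            if (txId <= lastTxId) {
                return; // this batch was already applied, so the replay is ignored
            }
            for (String word : words) {
                counts.merge(word, 1L, Long::sum);
            }
            lastTxId = txId;
        }

        public long get(String word) {
            return counts.getOrDefault(word, 0L);
        }
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("storm", "trident", "storm");

        AtLeastOnceCounter atLeastOnce = new AtLeastOnceCounter();
        atLeastOnce.applyBatch(batch);
        atLeastOnce.applyBatch(batch); // replay after a simulated failure

        TransactionalCounter transactional = new TransactionalCounter();
        transactional.applyBatch(1L, batch);
        transactional.applyBatch(1L, batch); // same replay, same transaction id

        System.out.println(atLeastOnce.get("storm"));   // prints 4
        System.out.println(transactional.get("storm")); // prints 2
    }
}
```

If double counting like this is harmless for your use case, or your updates are idempotent anyway, the extra bookkeeping buys you little.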




Answer 2:


If the lowest possible latency is your goal and you don't need exactly-once processing, then using Storm is better than Trident.




Answer 3:


Trident is a high-level abstraction for doing realtime computing on top of Twitter Storm, available in Storm 0.8.x. Storm is a stateless stream processing framework, and Trident provides stateful stream processing.




Answer 4:


Chris, since both of them are open-source technologies, Trident is only one implementation of a particular scenario on top of Storm, and of course this brings a performance overhead. If Trident cannot meet your requirements, you can create your own state implementation on top of Storm. Over time, Trident has also given rise to higher-level projects such as Trident-ML.




Answer 5:


Assume we want to filter tuples and add a field to each tuple. With plain Storm we would usually use two bolts, one for the filtering and one for adding the field. That means we have to send each tuple on to a new bolt, maybe using a global grouping, so network bandwidth may become the bottleneck.

With Trident we can do both steps on a single machine, so no regrouping is needed in this case. Use cases like this, in addition to "exactly once" vs. "at least once", can help decide which one to use.

Trident acts as a kind of logical grouping of operations.
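The filter-plus-add-a-field case can be sketched in plain Java (this is not the Trident API; the class, method and field layout are hypothetical). Both steps run in one pass over each tuple in the same process, so nothing is handed over to a second stage. As far as I know, this is the effect you get in Trident when you chain consecutive `each` operations, since those are partition-local.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration of running two logical steps in one local pass.
public class FusedStagesSketch {

    /**
     * Keeps the tuples accepted by the filter and appends one computed field
     * to each of them, without any handoff between the two steps.
     */
    public static List<List<Object>> filterAndEnrich(
            List<List<Object>> tuples,
            Predicate<List<Object>> keep,
            Function<List<Object>, Object> newField) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> tuple : tuples) {
            if (!keep.test(tuple)) {
                continue; // step 1: filtering
            }
            List<Object> enriched = new ArrayList<>(tuple);
            enriched.add(newField.apply(tuple)); // step 2: adding a field
            out.add(enriched);
        }
        return out;
    }

    public static void main(String[] args) {
        // Each tuple is (word, count); the layout is an assumption for this example.
        List<List<Object>> tuples = Arrays.asList(
                Arrays.<Object>asList("storm", 3),
                Arrays.<Object>asList("trident", 0),
                Arrays.<Object>asList("bolt", 5));

        List<List<Object>> result = filterAndEnrich(
                tuples,
                t -> (Integer) t.get(1) > 0,        // drop tuples with a zero count
                t -> ((String) t.get(0)).length()); // new field: length of the word

        System.out.println(result); // prints [[storm, 3, 5], [bolt, 5, 4]]
    }
}
```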



Source: https://stackoverflow.com/questions/15520993/storm-vs-trident-when-not-to-use-trident
