Python-based asynchronous workflow modules: What is the difference between a Celery workflow and a Luigi workflow?

Submitted by 独自空忆成欢 on 2019-12-03 02:06:56

Update: As Erik pointed out, Celery is the better choice for this case.

Celery

What is Celery?

Celery is a simple, flexible and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system.

Why use Celery?

  • It is simple to use & has lots of features.
  • django-celery: provides good integration with Django.
  • flower: Real-time monitor and web admin for Celery distributed task queue.
  • Large, active community (judging by Stack Overflow activity, Pyvideos, tutorials, and blog posts).
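As a rough picture of the task-queue model described above: a caller submits small units of work, gets a handle back immediately, and collects the result later. The sketch below imitates that shape with the standard library's `concurrent.futures` standing in for a broker and worker processes. It is not Celery's API, and `add`, `mul`, and `run_chain` are hypothetical names chosen for illustration. In Celery itself the same pattern is expressed with task functions and workflow primitives such as `chain` and `group`.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical tasks, named for illustration only.
def add(x, y):
    return x + y


def mul(x, y):
    return x * y


def run_chain(pool, first, *rest):
    """Run steps one after another on the pool, feeding each result to the next.

    `first` takes no arguments; every later step takes the previous result.
    """
    future = pool.submit(first)
    for step in rest:
        previous = future.result()  # wait for the earlier step to finish
        future = pool.submit(step, previous)
    return future


with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit independent tasks; each call returns a handle right away.
    handles = [pool.submit(add, i, i) for i in range(4)]
    results = [h.result() for h in handles]

    # A small chain: (2 + 3) * 10
    chained = run_chain(pool, lambda: add(2, 3), lambda r: mul(r, 10)).result()

print(results)  # [0, 2, 4, 6]
print(chained)  # 50
```

The point of the sketch is the interaction style: short tasks, quick turnaround, and results fetched through a handle, which is the kind of workload this answer recommends Celery for.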

Luigi

What is Luigi?

Luigi (Spotify's recently open-sourced Python framework) is a Python package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, failure handling, command-line integration, and much more.

Why use Luigi?

  • Built-in support for Hadoop.
  • Generic enough to be used for everything from simple task execution and monitoring on a local work station, to launching huge chains of processing tasks that can run in synchronization between many machines over the span of several days.
  • Luigi's visualiser: gives a nice visual overview of the workflow's dependency graph.
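To make "dependency resolution" concrete, here is a toy sketch in plain Python. It is not Luigi's API. The method names `requires`, `output`, and `run` loosely mirror the ones Luigi tasks define, but the `Task` base class, the `Extract` and `Sort` tasks, and the `build` function are hypothetical stand-ins. The idea it illustrates is that a task counts as done once its output exists, so a rerun skips finished work.

```python
import os
import tempfile


class Task:
    """Toy base class; not Luigi's implementation."""

    def __init__(self, workdir):
        self.workdir = workdir

    def requires(self):
        return []  # tasks that must finish before this one

    def output(self):
        raise NotImplementedError  # path of the file this task produces

    def run(self):
        raise NotImplementedError

    def complete(self):
        # A task is considered done when its output file exists.
        return os.path.exists(self.output())


class Extract(Task):
    def output(self):
        return os.path.join(self.workdir, "raw.txt")

    def run(self):
        with open(self.output(), "w") as f:
            f.write("3\n1\n2\n")


class Sort(Task):
    def requires(self):
        return [Extract(self.workdir)]

    def output(self):
        return os.path.join(self.workdir, "sorted.txt")

    def run(self):
        with open(self.requires()[0].output()) as f:
            numbers = sorted(int(line) for line in f)
        with open(self.output(), "w") as f:
            f.write("\n".join(str(n) for n in numbers))


def build(task, log):
    """Run missing dependencies first; skip anything already complete."""
    if task.complete():
        return
    for dep in task.requires():
        build(dep, log)
    task.run()
    log.append(type(task).__name__)


workdir = tempfile.mkdtemp()
first_run, second_run = [], []
build(Sort(workdir), first_run)   # runs Extract, then Sort
build(Sort(workdir), second_run)  # nothing to do: outputs already exist

print(first_run)   # ['Extract', 'Sort']
print(second_run)  # []
```

Because completion is tied to outputs on disk rather than to in-memory results, this style suits long batch pipelines that may fail partway and need to resume, rather than low-latency request handling.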

Conclusion: If you just need a tool to schedule tasks and run them, use Celery. If you are dealing with big data and heavy batch processing, go for Luigi.

(I'm the author of Luigi)

Luigi is not meant to be a synchronous, low-latency framework. It's meant for large batch processes that run for hours or days. So I think Celery might actually be slightly better for your use case.
