Can an Airflow task dynamically generate a DAG at runtime?

小蘑菇 2020-12-20 04:45

I have an upload folder that gets irregular uploads. For each uploaded file, I want to spawn a DAG that is specific to that file.

My first thought was to do this with

1 Answer
  • 2020-12-20 05:36

    In short: if the task writes where the DagBag reads from, yes, but it's best to avoid a pattern that requires this. Any DAG you're tempted to custom-create in a task should probably instead be a static, heavily parametrized, conditionally-triggered DAG. y2k-shubham provides an excellent example of such a setup, and I'm grateful for his guidance in the comments on this question.
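To make the "static, parametrized, conditionally-triggered" idea concrete, here is a minimal sketch of the trigger side only. It is not the linked example, just an illustration under stated assumptions: a single static DAG with the hypothetical id `process_upload` already exists, it reads a hypothetical `file_path` key from its run configuration, and the Airflow 2 CLI form `airflow dags trigger --conf` is available (on Airflow 1.10 the equivalent command is `airflow trigger_dag`). The helper name `build_trigger_commands` is made up for this sketch.

```python
import json
from pathlib import Path

# Hypothetical id of the one static DAG that can process any uploaded file.
PROCESS_DAG_ID = "process_upload"


def build_trigger_commands(upload_dir, already_seen):
    """Return one `airflow dags trigger` argv list per file not yet handled.

    Each command passes the file path through the DAG run's conf, so the
    same static DAG is reused for every upload instead of generating a new one.
    """
    commands = []
    for path in sorted(Path(upload_dir).iterdir()):
        # Skip sub-directories and files that were triggered earlier.
        if not path.is_file() or path.name in already_seen:
            continue
        # "file_path" is an assumed key; the DAG's tasks must read the same key.
        conf = json.dumps({"file_path": str(path)})
        commands.append(
            ["airflow", "dags", "trigger", "--conf", conf, PROCESS_DAG_ID]
        )
    return commands
```

Each returned list could be handed to `subprocess.run`, and the tasks inside the static DAG would then read the path from `dag_run.conf["file_path"]`. How the set of already-seen file names is persisted is left open here.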

    That said, here are the approaches that would accomplish what the question is asking, however bad an idea it may be, in increasing order of ham-handedness:

    • If you dynamically generate DAGs from a Variable (like so), modify the Variable.
    • If you dynamically generate DAGs from a list of config files, add a new config file to wherever you're pulling config files from, so that a new DAG gets generated on the next DAG collection.
    • Use something like Jinja templating to write a new Python file in the dags/ folder.
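The last and most ham-handed option in the list above could look roughly like the sketch below. To stay dependency-free it swaps Jinja for the standard library's `string.Template`; the idea is the same. Everything named here is an assumption for illustration: the helper `write_dag_file`, the `process_<name>` naming scheme, the placeholder task, and the Airflow 2.x import paths and `schedule_interval` argument used inside the generated file. The generator itself does not import Airflow; only the file it writes does.

```python
import re
from pathlib import Path
from string import Template

# Source of the generated DAG file. $dag_id and $file_path are filled in
# with Python literals (via repr), so the result is valid Python.
DAG_TEMPLATE = Template('''\
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

FILE_PATH = $file_path


def process_file():
    # Placeholder for the real per-file work.
    print("processing", FILE_PATH)


with DAG(
    dag_id=$dag_id,
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    PythonOperator(task_id="process_file", python_callable=process_file)
''')


def write_dag_file(dags_dir, upload_path):
    """Render a DAG file for one uploaded file and place it in dags_dir."""
    upload_path = Path(upload_path)
    # Derive a stable dag_id from the file name, so running this again for
    # the same upload overwrites the same DAG file rather than adding one.
    safe_stem = re.sub(r"[^A-Za-z0-9_]", "_", upload_path.stem)
    dag_id = f"process_{safe_stem}"
    source = DAG_TEMPLATE.substitute(
        dag_id=repr(dag_id), file_path=repr(str(upload_path))
    )
    target = Path(dags_dir) / f"{dag_id}.py"
    # Write under a non-.py name first, then rename, so a half-written
    # .py file is never left in the folder.
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(source)
    tmp.replace(target)
    return target
```

A task would call something like `write_dag_file("/path/to/dags", "/uploads/report.csv")`, and the new DAG would appear once the folder is next scanned. Deriving the `dag_id` from the file name is a design choice made here to keep the definition stable across reruns.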

    To retain access to the task after it runs, you'd have to keep the new DAG definition stable and accessible on future dashboard updates / DagBag collection. Otherwise, the Airflow dashboard won't be able to render much about it.
