Scheduling dag runs in Airflow

Submitted by ﹥>﹥吖頭↗ on 2019-12-04 03:46:54

Question


I have a general question about Airflow.

Is it possible to schedule one DAG based on another DAG's schedule?

For example, say I have two DAGs, dag1 and dag2. I would like dag2 to run each time dag1 succeeds, and not run otherwise. Is this possible in Airflow?


Answer 1:


You will want to add a TriggerDagRunOperator at the end of dag1 and set the schedule of dag2 to None.

In addition, if you want to handle multiple cases for the output of dag1, you can add a BranchPythonOperator to create multiple paths based on that output. For example, you could have it either execute the TriggerDagRunOperator on success, or send you a Slack message such as "Warning! Task Failure in DAG1!" via the SlackAPIPostOperator if an error is thrown (or apply any other logic you want to build in).

If you don't care about multiple outcomes, you could also just put a ShortCircuitOperator before the TriggerDagRunOperator to prevent it from running, based on the dag1 output.
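The decision logic for such a branch can be sketched as a plain function. A BranchPythonOperator expects its `python_callable` to return the task_id (or a list of task_ids) of the path to follow. The task ids and the `error_message` argument below are hypothetical; how dag1 records its outcome (for example through XCom) is left as an assumption.

```python
from typing import Optional

# Hypothetical task ids for the two branches described above.
TRIGGER_TASK_ID = "trigger_dag2"
ALERT_TASK_ID = "notify_slack"


def choose_next_task(error_message: Optional[str]) -> str:
    """Return the task_id of the branch that should run next.

    error_message is assumed to be None when dag1's work succeeded and
    a description of the problem otherwise.
    """
    if error_message is None:
        return TRIGGER_TASK_ID
    return ALERT_TASK_ID


if __name__ == "__main__":
    print(choose_next_task(None))
    print(choose_next_task("upstream file missing"))
```

Keeping the choice in a small function like this makes it easy to check on its own before wiring it into the DAG as the branch callable.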




Answer 2:


Yes. You should include a task at the end of dag1 that uses the TriggerDagRunOperator. Make sure this task has a trigger rule that only allows it to run if all upstream tasks succeed (all_success, which is the default).

The other answer recommends SubDAGs, which can have some oddities that make them less than ideal in complex Airflow environments.




Answer 3:


The DAG schedule interval is typically defined as one of:

  • a cron schedule
  • a preset, e.g. '@once', '@hourly', etc.
  • None*

*With a None schedule, the DAG won't run automatically and must be triggered some other way.
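The three cases in the list can be told apart with a small helper, shown here as a rough sketch. The function name, the returned labels, and the preset set are assumptions made for illustration. The cron check only counts five whitespace-separated fields and does not validate their contents.

```python
from typing import Optional

# A few presets that Airflow documents; this set is illustrative, not exhaustive.
KNOWN_PRESETS = {"@once", "@hourly", "@daily", "@weekly", "@monthly", "@yearly"}


def describe_schedule(schedule_interval: Optional[str]) -> str:
    """Classify a schedule value as 'manual', 'preset', or 'cron'."""
    if schedule_interval is None:
        # No automatic runs: the DAG must be triggered some other way.
        return "manual"
    if schedule_interval.startswith("@"):
        if schedule_interval not in KNOWN_PRESETS:
            raise ValueError(f"unrecognised preset: {schedule_interval!r}")
        return "preset"
    if len(schedule_interval.split()) == 5:
        # Heuristic only: five fields, contents unchecked.
        return "cron"
    raise ValueError(f"unrecognised schedule: {schedule_interval!r}")


if __name__ == "__main__":
    for value in (None, "@hourly", "0 3 * * *"):
        print(repr(value), "->", describe_schedule(value))
```

In the question's scenario, dag2 would fall into the "manual" case, since it should start only when dag1 triggers it.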

One way to trigger a DAG is to use SubDAGs via the SubDagOperator. I think SubDAGs are probably the best option for your use case, given that you want the second DAG to be triggered as a result of the first DAG succeeding. There is some nuance to SubDAGs, as described in the docs.

The SubDAG will automatically run if the task before it succeeds, and will be skipped if the task before it fails or is skipped, assuming you're using ALL_SUCCESS or ONE_SUCCESS as your trigger rule.

[This approach is somewhat similar to the TriggerDagRunOperator, which is another option detailed in @andscoop's answer.]

Another way to trigger a DAG is with an external trigger. This idea is discussed in more detail in this answer.



Source: https://stackoverflow.com/questions/50611039/scheduling-dag-runs-in-airflow
