Airflow: Re-run DAG from beginning with new schedule

好久不见. Submitted on 2019-12-23 09:55:49

Question


Backstory: I was running an Airflow job on a daily schedule, with a start_date of July 1, 2019. The job requested each day's data from a third party, then loaded that data into our database.

After running the job successfully for several days, I realized that the third party data source only refreshed their data once a month. As such, I was simply downloading the same data every day.

At that point, I changed the start_date to a year ago (to get previous months' info), and changed the DAG's schedule to run once a month.

How do I (in the Airflow UI) restart the DAG completely, such that it recognizes my new start_date and schedule, and runs a complete backfill as if the DAG were brand new?

(I know this backfill can be requested via the command line. However, I don't have permissions for the command line interface and the admin is unreachable.)


Answer 1:


Click on the green circle in the Dag Runs column for the job in question in the web interface. This will bring you to a list of all successful runs.

Tick the check mark on the top left in the header of the list to select all instances, then in the menu above it choose "With selected" and then "Delete" in the drop down menu. This should clear all existing dag run instances.

If catchup_by_default is not enabled on your Airflow instance, make sure catchup=True is set on the DAG until it has finished catching up.
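To make the catchup setting concrete, here is a minimal sketch in plain Python. It is not taken from the asker's DAG file, which is not shown. The dag_id, the exact start_date of July 1, 2018, and the helper names are assumptions for illustration. The dictionary holds the keyword arguments one would pass to airflow.DAG(...) in the Airflow 1.x API of that period (later releases rename schedule_interval to schedule). The helper only approximates which monthly runs a catchup would create, using the rule that Airflow starts a run once its interval has ended.

```python
from datetime import datetime

# Hypothetical DAG arguments (assumed, not from the original question).
# These would be passed as airflow.DAG(**dag_kwargs) in an Airflow 1.x DAG file.
dag_kwargs = {
    "dag_id": "third_party_monthly_load",  # hypothetical name
    "start_date": datetime(2018, 7, 1),    # "a year ago" relative to July 2019
    "schedule_interval": "@monthly",       # midnight on the first of each month
    "catchup": True,                       # schedule runs for all past intervals
}


def next_month(d):
    """Return the first day of the month after d."""
    if d.month == 12:
        return datetime(d.year + 1, 1, 1)
    return datetime(d.year, d.month + 1, 1)


def monthly_catchup_dates(start_date, now):
    """Approximate the execution dates a monthly catchup would create.

    A run stamped with the first of a month covers that whole month and
    only becomes eligible once the first of the following month has passed.
    """
    current = datetime(start_date.year, start_date.month, 1)
    if current < start_date:
        # start_date falls mid-month, so the first full interval starts next month
        current = next_month(current)
    dates = []
    while next_month(current) <= now:
        dates.append(current)
        current = next_month(current)
    return dates


runs = monthly_catchup_dates(dag_kwargs["start_date"], datetime(2019, 7, 10))
print(len(runs), runs[0].date(), runs[-1].date())
```

With these assumed dates, the scheduler would be expected to work through one run per elapsed month once the old DAG runs are deleted, rather than a single run for the current month.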



Source: https://stackoverflow.com/questions/56945611/airflow-re-run-dag-from-beginning-with-new-schedule
