Airflow does not backfill latest run

南笙 2021-01-15 04:43

For some reason, Airflow doesn't seem to trigger the latest run for a DAG with a weekly schedule interval.

Current Date:

$ date
$ Tue Aug  9 17:09:5         


        
2 Answers
  • 2021-01-15 04:51

    Airflow always schedules a run for the previous period, once that period has ended. So if you have a DAG that is scheduled to run daily, then on Aug 9th it will schedule a run with execution_date Aug 8th. Similarly, if the schedule interval is weekly, then on Aug 9th it will schedule a run for one week back, i.e. with execution_date Aug 2nd, even though the run itself happens on Aug 9th. This is just Airflow bookkeeping. You can find this in the Airflow wiki (https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls):

    Understanding the execution date

    Airflow was developed as a solution for ETL needs. In the ETL world, you typically summarize data. So, if I want to summarize data for 2016-02-19, I would do it at 2016-02-20 midnight GMT, which would be right after all data for 2016-02-19 becomes available. This date is available to you in both Jinja and a Python callable's context in many forms as documented here. As a note, ds refers to date_string, not date start as may be confusing to some.
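To make the bookkeeping concrete, here is a minimal sketch in plain Python (it does not call Airflow). It assumes a fixed timedelta schedule interval anchored at a start_date; the helper name latest_execution_date, the two start dates, and the year 2016 are made up for illustration, and the clock time is rounded from the truncated date output in the question. A cron-style schedule may anchor its periods differently.

```python
from datetime import datetime, timedelta


def latest_execution_date(start_date, interval, now):
    """Return the start of the most recent fully elapsed period, or None.

    Hypothetical helper: mimics the "run after the period ends" rule
    described above, it is not part of Airflow.
    """
    if now < start_date + interval:
        return None  # the first period has not finished yet
    # Number of whole periods that have elapsed since start_date.
    completed = (now - start_date) // interval
    # A run is stamped with the start of the period it covers.
    return start_date + (completed - 1) * interval


# Assumed "current" time, roughly matching the question (year is an assumption).
now = datetime(2016, 8, 9, 17, 9)

# Assumed start dates, chosen only so the periods line up with the answer.
daily = latest_execution_date(datetime(2016, 8, 1), timedelta(days=1), now)
weekly = latest_execution_date(datetime(2016, 7, 26), timedelta(weeks=1), now)

print("daily execution_date: ", daily.date())
print("weekly execution_date:", weekly.date())
```

Under these assumptions, the period starting on Aug 9th has not ended yet at the time of the question, so no run stamped Aug 9th is expected until a full interval after that date.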

  • 2021-01-15 04:57

    I ran into a similar issue. I solved it by manually running airflow backfill -s start_date -e end_date DAG_NAME, where start_date and end_date cover the missing execution_date (in your case, 2016-08-08). For example: airflow backfill -s 2016-08-07 -e 2016-08-09 DAG_NAME
