How to consider daylight savings time when using cron schedule in Airflow

孤城傲影 2020-12-18 10:31

In Airflow, I'd like a job to run at a specific time each day in a non-UTC timezone. How can I go about scheduling this?

The problem is that once daylight savings ti

1 Answer
  • 2020-12-18 10:48

    Starting with Airflow 1.10, time-zone aware DAGs can be defined by passing a time-zone aware datetime object as start_date. To have Airflow always schedule DAG runs at the same local time (regardless of a possible daylight-saving-time switch), use a cron expression for schedule_interval. To have Airflow schedule DAG runs at fixed intervals (regardless of a possible daylight-saving-time switch), use datetime.timedelta() for schedule_interval.

    For example, consider the following code that, first, uses a cron expression to schedule two consecutive DAG runs, and then uses a fixed interval to do the same.

    import pendulum
    from airflow import DAG
    from datetime import datetime, timedelta
    
    START_DATE = datetime(
        year=2019,
        month=10,
        day=25,
        hour=8,
        minute=0,
        tzinfo=pendulum.timezone('Europe/Kiev'),
    )
    
    
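    def gen_execution_dates(start_date, schedule_interval):
        """Print the next two execution dates for the given schedule."""
        dag = DAG(
            dag_id='id', start_date=start_date, schedule_interval=schedule_interval
        )
        execution_date = dag.start_date
        for i in range(1, 3):
            # Step to the next scheduled run after the current execution date.
            execution_date = dag.following_schedule(execution_date)
            print(
                f'[Run {i}: Execution Date for "{schedule_interval}"]:',
                # Show the result in the DAG's own timezone.
                dag.timezone.convert(execution_date),
            )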
    
    
    gen_execution_dates(START_DATE, '0 8 * * *')
    gen_execution_dates(START_DATE, timedelta(days=1))
    

    Running the code produces the following output:

    [Run 1: Execution Date for "0 8 * * *"]: 2019-10-26 08:00:00+03:00
    [Run 2: Execution Date for "0 8 * * *"]: 2019-10-27 08:00:00+02:00
    [Run 1: Execution Date for "1 day, 0:00:00"]: 2019-10-26 08:00:00+03:00
    [Run 2: Execution Date for "1 day, 0:00:00"]: 2019-10-27 07:00:00+02:00
    

    For the zone Europe/Kiev, daylight saving time in 2019 ends on 2019-10-27, when clocks move back from 04:00 (+03:00) to 03:00 (+02:00). That is, between Run 1 and Run 2 in our example.
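
    One way to check where the switch falls is to look at the zone's UTC offset just before and just after it. The following is a minimal sketch that uses only the standard library (Python 3.9+ with system time zone data available); the helper name local_offset_hours is made up for this illustration and is not part of Airflow.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

KIEV = ZoneInfo("Europe/Kiev")


def local_offset_hours(utc_instant):
    """Return the Europe/Kiev UTC offset, in hours, at the given UTC instant."""
    return utc_instant.astimezone(KIEV).utcoffset() / timedelta(hours=1)


# One minute before and exactly at the 2019 autumn switch (01:00 UTC).
before_switch = datetime(2019, 10, 27, 0, 59, tzinfo=timezone.utc)
at_switch = datetime(2019, 10, 27, 1, 0, tzinfo=timezone.utc)

print(before_switch.astimezone(KIEV), local_offset_hours(before_switch))
print(at_switch.astimezone(KIEV), local_offset_hours(at_switch))
```

    The first instant is still in summer time (offset 3.0), while the second already uses the winter offset (2.0).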

    The first two output lines show that for the DAG runs scheduled with a cron expression, the first run and the second run are both scheduled for 08:00 local time, although at different UTC offsets: Eastern European Summer Time (EEST, +03:00) and Eastern European Time (EET, +02:00) respectively.

    The last two output lines show that for the DAG runs scheduled with a fixed interval the first run is scheduled for 08:00 (EEST), and the second run is scheduled exactly 1 day (24 hours) later, which is at 07:00 (EET) due to the daylight-saving-time switch.
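
    The same two behaviours can be reproduced without Airflow, which may help when reasoning about a schedule before deploying it. The sketch below is an illustration only, not how Airflow computes schedules internally: it assumes Python 3.9+ with zoneinfo, and the names same_local_time and fixed_24h are hypothetical. Adding a timedelta to an aware datetime keeps the local clock reading, whereas adding it in UTC keeps the elapsed time.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

KIEV = ZoneInfo("Europe/Kiev")

# Run 1 from the example: 08:00 local time, the day before the switch.
run_1 = datetime(2019, 10, 26, 8, 0, tzinfo=KIEV)

# Cron-like behaviour: same local time on the next calendar day.
# Arithmetic on an aware datetime works on the local clock reading.
same_local_time = run_1 + timedelta(days=1)

# Fixed-interval behaviour: exactly 24 elapsed hours later.
# Do the arithmetic in UTC, then convert back for display.
fixed_24h = (run_1.astimezone(timezone.utc) + timedelta(days=1)).astimezone(KIEV)

print("same local time:", same_local_time.isoformat())
print("fixed 24 hours: ", fixed_24h.isoformat())
```

    The two results differ by one hour, matching Run 2 in the output above: 08:00+02:00 for the local-time rule and 07:00+02:00 for the fixed interval.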

    The following figure illustrates the example:
