Question
I am running an hourly process that picks up data from one location ("origin") and moves it to another ("destination"). For the most part, the data arrives at my origin at a specific time and everything works fine, but there can be delays, and when that happens the task in Airflow fails and needs to be re-run manually. One way to solve this is to allow more time for the data to arrive, but I would prefer to do that only if there is in fact a delay. Also, I don't want a sensor that waits on the data for a long time, as that can cause deadlocks (preferably, an hourly task should not run for longer than 1 hour). Does Airflow allow rescheduling a task on a given condition (failed, or no data exists), so that we don't have to re-run our failed tasks manually?
Thanks!
Answer 1:
Check out the following parameters of BaseOperator (the parent class of all operators):
- retry_delay (timedelta) – delay between retries
- retry_exponential_backoff (bool) – allow progressively longer waits between retries by using an exponential backoff algorithm on the retry delay (the delay will be converted into seconds)
- max_retry_delay (timedelta) – maximum delay interval between retries
Finding a good combination of these three should give you what you want.
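As a rough illustration, here is a minimal sketch of how those settings might be combined so that all the retries still fit inside the one-hour window from the question. The values are assumptions chosen for the example, not recommendations. Note that `retries` (also a BaseOperator parameter) is not in the list above, but it has to be greater than zero for any retry to happen. `approximate_retry_waits` is a hypothetical helper, not part of Airflow: it uses a simplified "double each time, then cap" model, while Airflow's real backoff also adds jitter, so actual waits will differ somewhat.

```python
from datetime import timedelta

# Retry settings accepted by BaseOperator, and therefore by every operator.
# The values below are illustrative assumptions.
default_args = {
    "retries": 4,                               # needed, or no retry happens
    "retry_delay": timedelta(minutes=5),        # base delay between retries
    "retry_exponential_backoff": True,          # grow the delay on each retry
    "max_retry_delay": timedelta(minutes=20),   # upper bound on a single delay
}


def approximate_retry_waits(args):
    """Preview the wait before each retry under a simplified model.

    Hypothetical helper: doubles the base delay on every retry and caps it at
    max_retry_delay. Airflow's own calculation adds jitter on top of this.
    """
    waits = []
    for attempt in range(args["retries"]):
        wait = args["retry_delay"] * (2 ** attempt)
        waits.append(min(wait, args["max_retry_delay"]))
    return waits


waits = approximate_retry_waits(default_args)
total = sum(waits, timedelta())  # total time spent waiting across all retries
print([int(w.total_seconds() // 60) for w in waits], total)
```

In a real DAG, a dictionary like this would typically be passed as `default_args=default_args` to `DAG(...)`, or the same keys could be given as keyword arguments to the individual operator that reads from the origin.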
https://incubator-airflow.readthedocs.io/en/latest/code.html
Source: https://stackoverflow.com/questions/55781118/how-to-automatically-reschedule-airflow-tasks