Question
I need to get the last two successful execution dates of an Airflow job to use in my current run. Example:
Execution date    Job status
2020-05-03        success
2020-05-04        fail
2020-05-05        success
Question: When I run my job on May 6th, I should get the values for May 3rd and May 5th into variables. Is it possible?
Answer 1:
You can leverage SQLAlchemy magic to retrieve the execution_dates of the last 'n' successful runs:
from typing import List, Optional

from pendulum import Pendulum  # pendulum 1.x class name; pendulum 2.x calls it DateTime
from sqlalchemy.orm import Session  # used only for the type hint
from airflow.models.taskinstance import TaskInstance
from airflow.utils.db import provide_session  # lives in airflow.utils.session in Airflow 2.x
from airflow.utils.state import State


@provide_session
def last_n_execution_dates(dag_id: str,
                           task_id: str,
                           n: int,
                           session: Optional[Session] = None) -> List[Pendulum]:
    # Fetch the n most recent successful task instances of the given task
    task_instances: List[TaskInstance] = (session
                                          .query(TaskInstance)
                                          .filter(TaskInstance.dag_id == dag_id,
                                                  TaskInstance.task_id == task_id,
                                                  TaskInstance.state == State.SUCCESS)
                                          .order_by(TaskInstance.execution_date.desc())
                                          .limit(n)
                                          .all())
    # Newest first: index 0 is the most recent successful run
    execution_dates: List[Pendulum] = [ti.execution_date for ti in task_instances]
    return execution_dates
Note that the snippet is for reference purposes only and is untested.
I referred to the tree() method of views.py to come up with this script.
Alternatively, you can run this SQL query against Airflow's meta-db to retrieve the execution dates of the last n successful runs (replace n with the number you need):
SELECT execution_date
FROM task_instance
WHERE dag_id = 'my_dag_id'
  AND task_id = 'my_task_id'
  AND state = 'success'
ORDER BY execution_date DESC
LIMIT n
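To check what this query returns for the example in the question, here is a minimal sketch that runs it against an in-memory SQLite table standing in for the meta-db. Everything around the query is an assumption made for illustration: the stand-in table keeps only the four columns the query touches, dates are stored as ISO text, and the helper names last_n_success_dates and build_demo_db are hypothetical. A real meta-db would typically be Postgres or MySQL with proper timestamp columns.

```python
import sqlite3

# Same query as above, with the dag_id, task_id and limit passed as bound parameters.
QUERY = """
    SELECT execution_date
    FROM task_instance
    WHERE dag_id = ?
      AND task_id = ?
      AND state = 'success'
    ORDER BY execution_date DESC
    LIMIT ?
"""


def last_n_success_dates(conn, dag_id, task_id, n):
    """Return the execution dates of the last n successful runs, newest first."""
    rows = conn.execute(QUERY, (dag_id, task_id, n)).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create a simplified stand-in for the task_instance table (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance "
        "(dag_id TEXT, task_id TEXT, execution_date TEXT, state TEXT)"
    )
    # The three runs from the question's example
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
        [
            ("my_dag_id", "my_task_id", "2020-05-03", "success"),
            ("my_dag_id", "my_task_id", "2020-05-04", "failed"),
            ("my_dag_id", "my_task_id", "2020-05-05", "success"),
        ],
    )
    return conn


conn = build_demo_db()
latest, previous = last_n_success_dates(conn, "my_dag_id", "my_task_id", 2)
print(latest, previous)  # 2020-05-05 2020-05-03
```

The failed run on May 4th is skipped, so the two variables hold May 5th and May 3rd, which is what the question asks for.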
Source: https://stackoverflow.com/questions/61671646/how-to-get-last-two-successful-execution-dates-of-airflow-job