问题
I recently ran into this nasty error where Airflow's apply_defaults decorator is throwing following stack-trace (my **kwargs
do contain job_flow_id
)
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/mnt/airflow/dags/zanalytics-airflow/src/main/mysql_import/dags/mysql_import_dag.py", line 23, in <module>
sync_dag_builder.build_sync_dag()
File "/mnt/airflow/dags/zanalytics-airflow/src/main/mysql_import/dags/builders/sync_dag_builders/emr_sync_dag_builder.py", line 26, in build_sync_dag
create_emr_task, terminate_emr_task = self._create_job_flow_tasks()
File "/mnt/airflow/dags/zanalytics-airflow/src/main/mysql_import/dags/builders/sync_dag_builders/emr_sync_dag_builder.py", line 44, in _create_job_flow_tasks
task_id=GlobalConstants.EMR_TERMINATE_STEP)
File "/home/hadoop/.pyenv/versions/3.6.6/lib/python3.6/site-packages/airflow/utils/decorators.py", line 98, in wrapper
result = func(*args, **kwargs)
File "/mnt/airflow/dags/zanalytics-airflow/src/main/aws/operators/emr_terminate_ancestor_job_flows_operator.py", line 31, in __init__
EmrTerminateJobFlowOperator.__init__(self, *args, **kwargs)
File "/home/hadoop/.pyenv/versions/3.6.6/lib/python3.6/site-packages/airflow/utils/decorators.py", line 98, in wrapper
result = func(*args, **kwargs)
File "/home/hadoop/.pyenv/versions/3.6.6/lib/python3.6/site-packages/airflow/contrib/operators/emr_terminate_job_flow_operator.py", line 44, in __init__
super(EmrTerminateJobFlowOperator, self).__init__(*args, **kwargs)
File "/home/hadoop/.pyenv/versions/3.6.6/lib/python3.6/site-packages/airflow/utils/decorators.py", line 94, in wrapper
raise AirflowException(msg)
airflow.exceptions.AirflowException: Argument ['job_flow_id'] is required
The disturbing parts are
- Exception is presently originating from the
__init__
of the built-in EmrTerminateJobFlowOperator - Earlier it was coming from EmrCreateJobFlowOperator, even though that doesn't take in a
job_flow_id
param; but it has gone since
Looking into decorators.py
, I felt that sig_cache might be messing-up some things. In fact, from the commit that introduced it, I cannot figure out how function-signature caching is working at all (at least it isn't working in this way)?
I've tried deleting all __pycache__
and restarting scheduler
, webserver
without luck (I'm running them in separate Linux screens)
- What could be causing the error?
- How does
sig_cache
work and does it need to be cleared forcefully under any circumstances? If so, how to clear it?
Environment
Python 3.6.6
Airflow 1.10.2
LocalExecutor
来源:https://stackoverflow.com/questions/54529660/airflow-apply-defaults-decorator-reports-argument-is-required