I have an EC2 instance that is running Airflow 1.8.0 using LocalExecutor.
Per the docs I would have expected that one of the following two commands would have raised the scheduler in daemon mode.
Documentation might be dated?
I normally start Airflow as following
airflow kerberos -D
airflow scheduler -D
airflow webserver -D
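Each `-D` run writes a pid file under `$AIRFLOW_HOME`. A quick sketch to confirm all three daemons actually started (the default `AIRFLOW_HOME` of `~/airflow` is an assumption; adjust for your setup):

```shell
#!/bin/sh
# Check that each daemonized Airflow service left a pid file behind.
# AIRFLOW_HOME defaults to ~/airflow here -- an assumption.
AIRFLOW_HOME="${AIRFLOW_HOME:-$HOME/airflow}"

for svc in kerberos scheduler webserver; do
    if [ -f "$AIRFLOW_HOME/airflow-$svc.pid" ]; then
        echo "$svc: pid $(cat "$AIRFLOW_HOME/airflow-$svc.pid")"
    else
        echo "$svc: no pid file (did it fail to daemonize?)"
    fi
done
```

If a service is missing its pid file, check the matching `airflow-<service>.err` file in the same directory.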
Here's the airflow webserver --help
output (from version 1.8):
-D, --daemon Daemonize instead of running in the foreground
Notice that no boolean flag is possible there. The documentation has to be fixed.
Quick note in case airflow scheduler -D
fails:
This is included in the comments, but it seems like it's worth mentioning here. When you run your airflow scheduler it will create the file $AIRFLOW_HOME/airflow-scheduler.pid
. If you try to re-run the airflow scheduler daemon process this will almost certainly produce the file $AIRFLOW_HOME/airflow-scheduler.err
which will tell you that lockfile.AlreadyLocked: /home/ubuntu/airflow/airflow-scheduler.pid is already locked
. If your scheduler daemon is indeed out of commission and you find yourself needing to restart it, execute the following commands:
sudo rm $AIRFLOW_HOME/airflow-scheduler.err $AIRFLOW_HOME/airflow-scheduler.pid
airflow scheduler -D
This got my scheduler back on track.
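The steps above can be wrapped in a small guard that only removes the leftover files when no scheduler process is actually behind the pid file. This is a sketch; the default `AIRFLOW_HOME` is an assumption:

```shell
#!/bin/sh
# Remove stale scheduler pid/err files only if the recorded pid is dead.
# AIRFLOW_HOME defaults to ~/airflow -- an assumption, adjust as needed.
AIRFLOW_HOME="${AIRFLOW_HOME:-$HOME/airflow}"
PIDFILE="$AIRFLOW_HOME/airflow-scheduler.pid"

if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    # kill -0 only tests for process existence; it sends no signal
    echo "scheduler already running with pid $(cat "$PIDFILE")"
else
    # No live process behind the pid file: safe to clean up and restart
    rm -f "$PIDFILE" "$AIRFLOW_HOME/airflow-scheduler.err"
    echo "stale files removed; now run: airflow scheduler -D"
fi
```

The `kill -0` check avoids deleting the pid file of a scheduler that is still healthy.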
About task start via systemd:
I had a problem: when Airflow is run this way, the PATH variable is initially empty. That is, when you write in the file /etc/sysconfig/airflow:
PATH=/home/ubuntu/bin:/home/ubuntu/.local/bin:$PATH
you literally write:
PATH=/home/ubuntu/bin:/home/ubuntu/.local/bin
Thus, the PATH
variable doesn't contain /bin
, the directory holding the bash
utility that LocalExecutor uses to run tasks.
So I do not understand why AIRFLOW_HOME
is not specified in this file. That is, the directory in which Airflow looks for its configuration file.
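A possible /etc/sysconfig/airflow that works around both issues: since systemd's EnvironmentFile does not expand $PATH, every directory is spelled out explicitly, and AIRFLOW_HOME is set as well. The concrete paths below are assumptions for an Ubuntu-style setup; substitute your own:

```shell
# /etc/sysconfig/airflow -- sketch; the paths here are assumptions.
# EnvironmentFile does NOT expand $PATH, so list every directory
# explicitly, including /bin so LocalExecutor can find bash.
PATH=/home/ubuntu/bin:/home/ubuntu/.local/bin:/usr/local/bin:/usr/bin:/bin
# The directory where Airflow looks for airflow.cfg
AIRFLOW_HOME=/home/ubuntu/airflow
```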