I am new to Airflow. I am trying to run the airflow scheduler as a daemon process, but the process does not live for long. I have configured "LocalExecutor" in the airflow.cfg file.
You can use systemd or upstart as described here:
https://github.com/apache/incubator-airflow/tree/master/scripts/systemd https://github.com/apache/incubator-airflow/tree/master/scripts/upstart
Here are the instructions, in case the links break in the future.
The provided systemd files are tested on RedHat-based systems. Copy (or link) them to /usr/lib/systemd/system and copy airflow.conf to /etc/tmpfiles.d/ or /usr/lib/tmpfiles.d/. Copying airflow.conf ensures /run/airflow is created with the right owner and permissions (0755 airflow airflow).
You can then start the different servers by using systemctl start [service]. Enabling services can be done by issuing
systemctl enable [service]
By default the environment configuration points to /etc/sysconfig/airflow. You can copy the "airflow" file to this location and adjust it to your liking. Make sure to specify the SCHEDULER_RUNS variable.
With some minor changes they will probably work on other systemd systems.
You can modify the configuration files provided below to reflect your environment.
Content of /etc/sysconfig/airflow file
# This file is the environment file for Airflow. Put this file in /etc/sysconfig/airflow per default
# configuration of the systemd unit files.
#
# AIRFLOW_CONFIG=
# AIRFLOW_HOME=
#
# required setting, 0 sets it to unlimited. Scheduler will get restart after every X runs
SCHEDULER_RUNS=5
Content of /etc/tmpfiles.d/airflow.conf or /usr/lib/tmpfiles.d/airflow.conf file
D /run/airflow 0755 airflow airflow
Content of /usr/lib/systemd/system/airflow-scheduler.service
[Unit]
Description=Airflow scheduler daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service
[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=airflow
Group=airflow
Type=simple
ExecStart=/bin/airflow scheduler -n ${SCHEDULER_RUNS}
KillMode=process
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
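The steps above could be scripted roughly as follows. This is a minimal sketch, not a tested installer: it assumes the three files shown above sit in the current directory under the names `airflow-scheduler.service`, `airflow.conf` and `airflow`, and the `run` helper and `DRY_RUN` variable are hypothetical names introduced here. By default it only prints the commands; set `DRY_RUN=0` and run it as root to execute them.

```shell
#!/bin/sh
# Sketch: install, enable and start the scheduler unit described above.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_airflow_scheduler() {
    # Assumption: the unit, tmpfiles and environment files are in the current directory.
    run cp airflow-scheduler.service /usr/lib/systemd/system/
    run cp airflow.conf /etc/tmpfiles.d/
    run cp airflow /etc/sysconfig/airflow
    # Create /run/airflow now rather than waiting for the next boot.
    run systemd-tmpfiles --create /etc/tmpfiles.d/airflow.conf
    # Make systemd re-read unit files, then enable and start the service.
    run systemctl daemon-reload
    run systemctl enable airflow-scheduler
    run systemctl start airflow-scheduler
    run systemctl status --no-pager airflow-scheduler
}

install_airflow_scheduler
```

If the service does not stay up, `journalctl -u airflow-scheduler` should show what the scheduler printed before it exited.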
--num-runs=5
limits the scheduler to 5 runs, after which it exits. You can remove that argument to make the scheduler long-running.
Ideally, you should run the scheduler under a supervisor, so that it is restarted whenever the process crashes or stops.
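To illustrate what "running under a supervisor" means, here is a minimal restart loop in plain shell. It is only a sketch of the idea, not a replacement for a real process supervisor such as systemd or supervisord; the function name `supervise` and its arguments are made up for this example.

```shell
#!/bin/sh
# Sketch: supervise MAX_RESTARTS DELAY COMMAND [ARGS...]
# Runs COMMAND and restarts it every time it exits.
# MAX_RESTARTS=0 means restart forever; DELAY is the pause in seconds.
supervise() {
    max_restarts="$1"
    delay="$2"
    shift 2
    restarts=0
    while :; do
        "$@"
        status=$?
        echo "process exited with status $status" >&2
        # Give up once the restart budget is used (unless it is unlimited).
        if [ "$max_restarts" -gt 0 ] && [ "$restarts" -ge "$max_restarts" ]; then
            return "$status"
        fi
        restarts=$((restarts + 1))
        sleep "$delay"
    done
}
```

For example, `supervise 0 5 airflow scheduler` would restart the scheduler 5 seconds after every exit, which is roughly what `Restart=always` and `RestartSec=5s` do in the systemd unit above.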
I had a similar problem. My airflow scheduler did not keep running as a daemon process when I started it as a daemon:
airflow scheduler -D
But the scheduler did work when I ran it normally. After I deleted the airflow-scheduler.err file and reran the scheduler as a daemon process, it started working:
rm airflow-scheduler.err
airflow scheduler -D
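If you would rather keep the old error file for inspection than delete it, something like the following might do. It is a sketch under assumptions: the helper name `clear_stale_err` is made up, and the directory holding `airflow-scheduler.err` is passed in explicitly because it depends on your setup (the commands above were run from the directory containing the file).

```shell
#!/bin/sh
# Sketch: move a leftover airflow-scheduler.err out of the way
# instead of deleting it, so its contents can still be inspected.
# $1 is the directory that contains the file (an assumption; adjust as needed).
clear_stale_err() {
    dir="$1"
    err_file="$dir/airflow-scheduler.err"
    if [ -e "$err_file" ]; then
        mv "$err_file" "$err_file.bak"
        echo "moved $err_file to $err_file.bak"
    fi
}
```

Usage would then be along the lines of `clear_stale_err . && airflow scheduler -D`, run from the directory where the .err file was created.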