I am using Airflow for my data pipeline project. I have configured my project in Airflow and start the Airflow server as a background process using the following command
Can you check $AIRFLOW_HOME/airflow-webserver.pid for the process ID of your webserver daemon? Then pass it a kill signal to kill it:
cat $AIRFLOW_HOME/airflow-webserver.pid | xargs kill -9
Then just run
airflow webserver -p 8080 -D
to restart the daemon. (Note: -D is a flag and takes no argument, so the trailing "True" from the question is not needed.)
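Putting those steps together, here's a minimal sketch of the full restart (assuming the pid file lives at $AIRFLOW_HOME/airflow-webserver.pid; note that a stale pid file left behind can block the new daemon from starting):
PIDFILE="$AIRFLOW_HOME/airflow-webserver.pid"   # assumed default location
kill -9 "$(cat "$PIDFILE")"                     # stop the old daemon (SIGKILL)
rm -f "$PIDFILE"                                # clear the stale pid file
airflow webserver -p 8080 -D                    # start a fresh daemon on port 8080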
As the question was related to the webserver, this is something that worked in my case:
systemctl restart airflow-webserver
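To confirm the restart actually took, you can inspect the unit with standard systemd tooling (assuming the service really is named airflow-webserver):
systemctl status airflow-webserver        # should report active (running)
journalctl -u airflow-webserver -n 50     # last 50 log lines for the unit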
I advise running Airflow in a robust way, with auto-recovery via systemd, so you can do:
- to start: systemctl start airflow
- to stop: systemctl stop airflow
- to restart: systemctl restart airflow
For this you'll need a systemd 'unit' file.
As a (working) example you can use the following:
put it in /lib/systemd/system/airflow.service
[Unit]
Description=Airflow webserver daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service
[Service]
PIDFile=/run/airflow/webserver.pid
EnvironmentFile=/home/airflow/airflow.env
User=airflow
Group=airflow
Type=simple
ExecStart=/bin/bash -c 'export AIRFLOW_HOME=/home/airflow ; airflow webserver --pid /run/airflow/webserver.pid'
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
Restart=on-failure
RestartSec=42s
PrivateTmp=true
[Install]
WantedBy=multi-user.target
P.S.: change AIRFLOW_HOME to point to the folder where your Airflow config lives.
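Once the unit file is in place, a typical sequence to register and launch it uses standard systemctl commands (assuming the unit is named airflow.service as above):
sudo systemctl daemon-reload      # make systemd pick up the new unit file
sudo systemctl enable airflow     # start automatically at boot
sudo systemctl start airflow      # start it now
systemctl status airflow          # verify it is active (running)
Since the unit writes its pid file under /run/airflow, make sure that directory exists and is writable by the airflow user, for example by adding RuntimeDirectory=airflow to the [Service] section.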
The recommended approach is to create and enable the Airflow webserver as a service. If you named the webserver service 'airflow-webserver', run the following command to restart it:
systemctl restart airflow-webserver
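To have the same service come up automatically after a reboot (standard systemctl usage, assuming the same unit name):
systemctl enable airflow-webserver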
You can use a ready-made AMI (namely, LightningFlow) from AWS Marketplace, which provides Airflow services (webserver, scheduler, worker) that are enabled at startup.
Note: LightningFlow comes pre-integrated with all required libraries, Livy, custom operators, and a local Spark cluster.
Link for AWS Marketplace: https://aws.amazon.com/marketplace/pp/Lightning-Analytics-Inc-LightningFlow-Integrated-o/B084BSD66V
Airflow uses gunicorn as its HTTP server, so you can send it standard POSIX-style signals. A signal commonly used by daemons to restart is HUP.
You'll need to locate the pid file for the airflow webserver daemon in order to get the right process ID to send the signal to. This file could be in $AIRFLOW_HOME or in /var/run, which is where you'll find many pid files.
Assuming the pid file is in /var/run, you could run the command:
cat /var/run/airflow-webserver.pid | xargs kill -HUP
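If you're not sure which location is in use, here is a small sketch that tries $AIRFLOW_HOME first and falls back to /var/run (the file names are assumptions based on the defaults mentioned above):
# prefer the pid file under AIRFLOW_HOME, fall back to /var/run
PIDFILE="${AIRFLOW_HOME:-$HOME/airflow}/airflow-webserver.pid"
[ -f "$PIDFILE" ] || PIDFILE=/var/run/airflow-webserver.pid
kill -HUP "$(cat "$PIDFILE")"     # ask the gunicorn master to reload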
gunicorn uses a pre-forking model, so it has master and worker processes. The HUP signal is sent to the master process, which performs these actions:
HUP: Reload the configuration, start the new worker processes with a new configuration and gracefully shutdown older workers. If the application is not preloaded (using the preload_app option), Gunicorn will also load the new version of it.
More information in the gunicorn signal handling docs.
This is mostly an expanded version of captaincapsaicin's answer, but using HUP (SIGHUP) instead of KILL (SIGKILL), to reload the process instead of actually killing it and restarting it.
This worked for me (multiple times! :D )
Find the process ID (assuming 8080 is the port):
lsof -i tcp:8080
Kill it:
kill <pid>
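The two steps can also be collapsed into one line, since lsof's -t flag prints bare process IDs suitable for command substitution (still assuming port 8080):
kill $(lsof -t -i tcp:8080)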