I have some Ansible tasks that perform unfortunately long operations - things like running a synchronization operation with an S3 folder. It's not always clear if they're
Ansible has since implemented the following:
---
# Requires ansible 1.8+
- name: 'YUM - async task'
  yum:
    name: docker-io
    state: installed
  async: 1000
  poll: 0
  register: yum_sleeper

- name: 'YUM - check on async task'
  async_status:
    jid: "{{ yum_sleeper.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 30
For further information, see the official documentation on the topic (make sure you're selecting your version of Ansible).
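To make the start-then-poll idea a bit more concrete, here is a rough plain-shell analogue of what those two tasks do: kick off work in the background, then check on it a bounded number of times. This is only a sketch. The function name poll_until_done and its arguments are made up for illustration, and it tracks a local PID, whereas async_status tracks an Ansible job id on the managed host.

```shell
#!/bin/sh
# Poll a background process a bounded number of times, loosely mirroring
# the until/retries loop around async_status. All names are illustrative.
poll_until_done() {
    job_pid=$1      # PID of the background job (Ansible uses a job id instead)
    max_checks=$2   # plays the role of "retries"
    delay=$3        # seconds to wait between checks
    checks=0
    while [ "$checks" -lt "$max_checks" ]; do
        # kill -0 sends no signal; it only tests whether the process exists
        if ! kill -0 "$job_pid" 2>/dev/null; then
            echo "job $job_pid finished"
            return 0
        fi
        checks=$((checks + 1))
        echo "check $checks/$max_checks: job $job_pid still running"
        sleep "$delay"
    done
    echo "gave up on job $job_pid after $max_checks checks"
    return 1
}

# Example usage (commented out so the file can be sourced safely):
#   sleep 3 &
#   poll_until_done "$!" 30 1
```

The point to take away is that the polling budget (number of checks times the delay) should cover the expected run time, otherwise the check gives up while the job is still going.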
There are a couple of things you can do, but as you have rightly pointed out, Ansible in its current form doesn't really offer a good solution.
Official-ish solutions:
One idea is to mark the task as async and poll it. Obviously this is only suitable if it is capable of running in such a manner without causing failure elsewhere in your playbook. The async docs are here and here's an example lifted from them:
- hosts: all
  remote_user: root
  tasks:
    - name: simulate long running op (15 sec), wait for up to 45 sec, poll every 5 sec
      command: /bin/sleep 15
      async: 45
      poll: 5
This can at least give you a 'ping' to know that the task isn't hanging.
The only other officially endorsed method would be Ansible Tower, which has progress bars for tasks but isn't free.
Hacky-ish solutions:
Beyond the above, you're pretty much going to have to roll your own. Your specific example of syncing an S3 bucket could be monitored fairly easily with a script periodically calling the AWS CLI and counting the number of items in a bucket, but that's hardly a good, generic solution.
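For what it's worth, the bucket-counting idea could look roughly like the sketch below. It assumes the AWS CLI is installed and configured, and it relies on aws s3 ls --recursive printing one line per object. The function names and the bucket path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Count the objects under an S3 path by counting listing lines.
# Assumes a configured AWS CLI; "aws s3 ls --recursive" prints one object per line.
count_s3_objects() {
    # tr strips the padding some wc implementations add around the number
    aws s3 ls "s3://$1" --recursive | wc -l | tr -d ' '
}

# Print the object count a fixed number of times, pausing between checks.
watch_s3_objects() {
    s3_path=$1
    interval=${2:-10}   # seconds between checks
    rounds=${3:-6}      # how many times to check
    round=0
    while [ "$round" -lt "$rounds" ]; do
        echo "$(count_s3_objects "$s3_path") objects under s3://$s3_path"
        round=$((round + 1))
        if [ "$round" -lt "$rounds" ]; then
            sleep "$interval"
        fi
    done
}

# Example usage with a hypothetical bucket (commented out):
#   watch_s3_objects my-bucket/some/prefix 10 6
```

Run it in a second terminal while the playbook is syncing. A rising count tells you the task is alive, though it says nothing about how much is left unless you know the expected total.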
The only thing I could imagine being somewhat effective would be watching the incoming ssh session from one of your nodes.
To do that you could configure the ansible user on that machine to connect via screen and actively watch it. Alternatively, you could perhaps use the log_output option in the sudoers entry for that user, allowing you to tail the file. Details of log_output can be found on the sudoers man page.
I came across this problem today on OSX, where I was running a docker shell command which took a long time to build and there was no output whilst it built. It was very frustrating to not understand whether the command had hung or was just progressing slowly.
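I decided to pipe the output (and error) of the shell command to a port, which could then be listened to via netcat in a separate terminal. This relies on the /dev/tcp/host/port redirection, which is a bash feature, so the task's shell has to support it (bash does; dash, for example, does not).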
myplaybook.yml
- name: run some long-running task and pipe to a port
  shell: myLongRunningApp > /dev/tcp/localhost/4000 2>&1
And in a separate terminal window:
$ nc -lk 4000
Output from my
long
running
app will appear here
Note that I pipe the error output to the same port; I could as easily pipe to a different port.
Also, I ended up setting a variable called nc_port which will allow for changing the port in case that port is in use. The ansible task then looks like:
shell: myLongRunningApp > /dev/tcp/localhost/{{nc_port}} 2>&1
Note that the command myLongRunningApp is being executed on localhost (i.e. that's the host set in the inventory), which is why I listen to localhost with nc.
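One catch with this approach is that the redirect fails if nothing is listening on the port when the task starts. A possible workaround, sketched below, is a small wrapper that probes the port first and falls back to a log file. The name stream_or_log and the log path in the usage comment are my own inventions, and the probe opens and closes one connection, so the listener needs nc's -k flag (as above) to keep accepting connections afterwards.

```shell
#!/bin/bash
# Run a command with stdout and stderr sent to a local TCP port if something
# is listening there, otherwise to a log file. Relies on bash's /dev/tcp.
stream_or_log() {
    port=$1
    logfile=$2
    shift 2
    # Probe in a subshell so a failed redirection cannot abort the caller
    if ( : >"/dev/tcp/localhost/$port" ) 2>/dev/null; then
        "$@" >"/dev/tcp/localhost/$port" 2>&1
    else
        "$@" >"$logfile" 2>&1
    fi
}

# Example usage (commented out; the log path is hypothetical):
#   stream_or_log 4000 /tmp/myLongRunningApp.log myLongRunningApp
```

With the fallback you can still tail the log file if you forgot to start nc before running the playbook.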
If you're on Linux you may use systemd-run to create a transient unit and inspect the output with journalctl, like:
sudo systemd-run --unit foo \
    bash -c 'for i in {0..10}; do
        echo "$((i * 10))%"; sleep 1;
    done;
    echo "Complete"'
And in another session:
sudo journalctl -xf --unit foo
It would output something like:
Apr 07 02:10:34 localhost.localdomain systemd[1]: Started /bin/bash -c for i in {0..10}; do echo "$((i * 10))%"; sleep 1; done; echo "Complete".
-- Subject: Unit foo.service has finished start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit foo.service has finished starting up.
--
-- The start-up result is done.
Apr 07 02:10:34 localhost.localdomain bash[10083]: 0%
Apr 07 02:10:35 localhost.localdomain bash[10083]: 10%
Apr 07 02:10:36 localhost.localdomain bash[10083]: 20%
Apr 07 02:10:37 localhost.localdomain bash[10083]: 30%
Apr 07 02:10:38 localhost.localdomain bash[10083]: 40%
Apr 07 02:10:39 localhost.localdomain bash[10083]: 50%
Apr 07 02:10:40 localhost.localdomain bash[10083]: 60%
Apr 07 02:10:41 localhost.localdomain bash[10083]: 70%
Apr 07 02:10:42 localhost.localdomain bash[10083]: 80%
Apr 07 02:10:43 localhost.localdomain bash[10083]: 90%
Apr 07 02:10:44 localhost.localdomain bash[10083]: 100%
Apr 07 02:10:45 localhost.localdomain bash[10083]: Complete
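If you only care about the most recent progress value rather than the whole stream, a small filter can pull it out. The sketch below assumes the job prints bare percentages on their own lines, as in the example above, and that you read the journal with -o cat so that only the message text is shown. The function name latest_progress is made up.

```shell
#!/bin/sh
# Read log messages on stdin and print the most recent line that is
# a bare percentage such as "40%". Prints nothing if there is none.
latest_progress() {
    grep -E '^[0-9]+%$' | tail -n 1
}

# Example usage against the transient unit from above (commented out):
#   sudo journalctl -o cat --unit foo | latest_progress
```

This is handy when you want a one-line status check from a script rather than a terminal left open on journalctl -f.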