How to retry an Ansible task that may fail?

[愿得一人] 2020-12-29 18:26

In my Ansible play I am restarting a database and then trying to run some operations on it. The restart command returns as soon as the restart has started, not when the database is up. Next command t

4 answers
  • 2020-12-29 19:01

    I don't understand your claim that the "first command execution fails whole play". It wouldn't make sense if Ansible behaved this way.

    The following task:

    - command: /usr/bin/false
      retries: 3
      delay: 3
      register: result
      until: result.rc == 0
    

    produces:

    TASK [command] ******************************************************************************************
    FAILED - RETRYING: command (3 retries left).
    FAILED - RETRYING: command (2 retries left).
    FAILED - RETRYING: command (1 retries left).
    fatal: [localhost]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["/usr/bin/false"], "delta": "0:00:00.003883", "end": "2017-05-23 21:39:51.669623", "failed": true, "rc": 1, "start": "2017-05-23 21:39:51.665740", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
    

    which seems to be exactly what you want.
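
    Applied to the restart scenario from the question, the same pattern might look like the sketch below. The service name `mysql` and the `mysqladmin ping` health check are assumptions for illustration (the question does not say which database is used); substitute any command that exits with 0 once your database accepts connections.

    ```yaml
    # Assumed: a MySQL-like database managed as a service named "mysql".
    - name: Restart the database
      service:
        name: mysql
        state: restarted

    # Poll the health check, retrying up to 10 times, 3 seconds apart.
    - name: Wait until the database accepts connections
      command: mysqladmin ping
      register: result
      retries: 10
      delay: 3
      until: result.rc == 0
      changed_when: false   # a health check does not change the host
    ```

    If every attempt fails, the task fails as in the output above and the play stops for that host, so later tasks never run against a database that is still down.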

  • 2020-12-29 19:04

    Consider using the wait_for module. It waits for a condition before continuing, for example for a port to become open or closed, for a file to exist or not, or for some content to appear in a file.

    Without seeing the rest of your playbook, consider the following example:

    - name: Wait for db server to restart
      wait_for:
        host: 192.168.50.4
        port: 3306
        delay: 1
        timeout: 300
      delegate_to: localhost   # run the check from the control machine
    

    You can also adapt it as a handler and obviously change this snippet to suit your use-case.
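
    As a rough sketch of the handler variant: the configuration task, the template file name, and the service name `mysql` below are hypothetical and only there to have something that triggers the restart. Port 3306 is taken from the example above.

    ```yaml
    tasks:
      # Hypothetical task whose change should trigger a restart.
      - name: Update database configuration
        template:
          src: my.cnf.j2
          dest: /etc/mysql/my.cnf
        notify:
          - restart db
          - wait for db

    handlers:
      # Handlers run in the order they are defined here,
      # so the restart happens before the wait.
      - name: restart db
        service:
          name: mysql
          state: restarted

      - name: wait for db
        wait_for:
          port: 3306
          delay: 1
          timeout: 300
    ```

    Handlers normally run at the end of the play. If later tasks in the same play need the database, a `- meta: flush_handlers` task placed before them forces the restart and the wait to run at that point.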

  • 2020-12-29 19:16

    Not sure if this is Ansible Tower specific, but I am using:

    - command: /usr/bin/false
      register: result
      retries: 3
      delay: 10
      until: result is not failed
    
  • 2020-12-29 19:24

    For the following task:

    - hosts: all
      become: yes
      tasks:
        - name: create the 'myusername' user
          user: name=myusername append=yes state=present createhome=yes shell=/bin/bash
    

    I was not sure whether the remote was ready yet (because it was a newly spun-up node), so I tried the retries and delay options, unfortunately with no luck. For now I have ended up wrapping the playbook run in a bash script to get the behavior I need.

    #!/bin/bash
    # Re-run the playbook until it succeeds, giving up after 5 attempts.
    # HOSTS_FILE is expected to be set by the caller.

    STATUS_CODE=1
    TRY=1
    while [ "$STATUS_CODE" -ge 1 ]
    do
      if [ "$TRY" -gt 5 ]
      then
        echo "Retried to connect to node 5 times and failed. Exiting"
        exit 1
      fi

      ansible-playbook -i "$HOSTS_FILE" user.yml
      STATUS_CODE=$?
      TRY=$(( TRY + 1 ))

      if [ "$STATUS_CODE" -ge 1 ]
      then
        echo "Retry to connect to node in 5 seconds"
        sleep 5
      fi
    done
    

    I am still hoping to do this in a cleaner way inside the ansible-playbook YAML. Does anyone have suggestions?
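
    One possible direction, offered as a sketch rather than a tested fix for this setup, is the built-in wait_for_connection module, which polls until Ansible can connect to the host. Fact gathering runs before the first task and fails on an unreachable node, which may be why retries and delay on the user task did not help, so the sketch turns it off and gathers facts afterwards. The delay and timeout values are assumptions.

    ```yaml
    - hosts: all
      become: yes
      gather_facts: no          # skip the implicit setup step until the node is up
      tasks:
        - name: Wait for the new node to become reachable
          wait_for_connection:
            delay: 5            # seconds to wait before the first check (assumed)
            timeout: 300        # give up after 5 minutes (assumed)

        - name: Gather facts now that the node answers
          setup:

        - name: create the 'myusername' user
          user: name=myusername append=yes state=present createhome=yes shell=/bin/bash
    ```

    With this in place the playbook itself absorbs the wait, so the wrapper loop around ansible-playbook may no longer be needed.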
