I have a bash script start.sh which looks like this:
for thing in foo bar; do
    {
        background_processor "$thing"
        cleanup_on_exit "$thing"
    } &
done
I think I can explain this now! I had to learn a bit about what sessions and process groups are, which I did by reading The TTY Demystified.
- Why does ssh (without -t) wait for the child processes even after start.sh exits, even though they have parent pid 1?
Because with no tty, ssh connects to stdin/stdout/stderr of the shell process via pipes (which are then inherited by the children), and the version of OpenSSH that I am using (OpenSSH_4.3p2) waits for those pipes to close before exiting. Some earlier versions of OpenSSH did not behave that way. There is a good explanation of this, with rationale, here.
Conversely, when using an interactive login (or ssh -t), ssh and the processes are using a TTY and so there are no pipes to wait for.
I can recover the behaviour I want by redirecting the streams. This variant returns immediately: ssh user@host "start.sh < /dev/null > /dev/null 2>&1"
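The waiting behaviour can be reproduced locally without ssh. The sketch below is only an analogy, not what ssh does internally: a shell command substitution stands in for ssh (both read the command's output through a pipe), sleep stands in for background_processor, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Analogy only: $(...) reads the command's stdout through a pipe, much as
# ssh without -t does. It returns only once every process holding the
# write end of that pipe has closed it.

# Case 1: the background sleep inherits the pipe, so the reader is held
# until sleep exits, although the foreground command finished at once.
start=$(date +%s)
out=$( (sleep 2 &) ; echo started )
held=$(( $(date +%s) - start ))

# Case 2: the background process has its streams redirected, so nothing
# keeps the pipe open and the reader returns straight away.
start=$(date +%s)
out=$( (sleep 2 < /dev/null > /dev/null 2>&1 &) ; echo started )
released=$(( $(date +%s) - start ))

echo "inherited pipe: waited ${held}s; redirected: waited ${released}s"
```

The second case mirrors the redirected ssh command above: once no child holds the pipe, the reader has nothing left to wait for.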
- Why does ssh (with -t) kill the child processes, apparently with a SIGHUP, even though that does not happen when I run them from a terminal and log out of that terminal?
Because bash is starting in non-interactive mode, which means that job control is disabled by default, and consequently the child processes are in the same process group as the parent bash process (which is the session leader). When the parent bash process exits, the kernel sends SIGHUP to its process group (which is in the foreground) as described in setpgid(2):
If a session has a controlling terminal, ... [and] the session leader exits, the SIGHUP signal will be sent to each process in the foreground process group of the controlling terminal.
Conversely, when using an interactive login, bash is in interactive mode which means that job control is enabled by default, and so the child processes go into a separate process group and never receive the SIGHUP when I exit.
I can recover the behaviour I want by using set -m to enable job control in bash. If I add set -m to start.sh, the children are no longer killed when ssh exits.
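One way to see what set -m changes is to check whether a background job ends up leading its own process group. This is a minimal sketch, assuming bash and using sleep as a stand-in for the real work; leads_own_group is a hypothetical helper that probes for a process group by passing a negative PID to kill -0.

```shell
#!/usr/bin/env bash
# Succeeds if a process group whose ID equals PID $1 exists, i.e. the
# process was placed in a group of its own (hypothetical helper).
leads_own_group() { kill -0 -- "-$1" 2>/dev/null; }

# Job control off (the default in a script): the job stays in our group.
set +m
sleep 5 &
pid_plain=$!
if leads_own_group "$pid_plain"; then plain=own; else plain=shared; fi

# Job control on, as after adding set -m to start.sh: the job gets its
# own process group.
set -m
sleep 5 &
pid_jc=$!
if leads_own_group "$pid_jc"; then jc=own; else jc=shared; fi
set +m

# Clean up the stand-in jobs.
kill "$pid_plain" "$pid_jc" 2>/dev/null
wait "$pid_plain" "$pid_jc" 2>/dev/null || true

echo "without set -m: $plain group; with set -m: $jc group"
```

A job in its own process group is not in the foreground process group that receives the SIGHUP when the session leader exits, which matches the behaviour described above.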
Mysteries solved :)
I suspect (though this is only a guess) that when there is no tty, bash passes the SIGHUP to your forked process, which handles the signal itself, quietly ignores it, and continues to tie up the SSH session.
However, with a tty between you and the process, the tty driver is intercepting the SIGHUP, realises that it has lost the user, and forks itself to run without the ssh session as the parent.
Prefix any command that should not receive this SIGHUP with "nohup".
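As a rough illustration of the nohup suggestion, the sketch below sends SIGHUP to two background jobs, one started normally and one started through nohup. It assumes bash, uses sleep as a stand-in for the real command, and the variable names are illustrative only. It shows nothing beyond SIGHUP handling: nohup does not by itself detach a job from inherited pipes.

```shell
#!/usr/bin/env bash
# sleep stands in for the real long-running command.

sleep 5 &                            # ordinary background job
plain=$!
nohup sleep 5 > /dev/null 2>&1 &     # same job, started through nohup
protected=$!

sleep 1                              # give nohup time to start ignoring SIGHUP
kill -HUP "$plain" "$protected"

# The ordinary job is terminated by the signal, so wait reports 128 + 1.
plain_status=0
wait "$plain" 2>/dev/null || plain_status=$?

# The nohup job ignores the signal and should still be running.
sleep 1
if kill -0 "$protected" 2>/dev/null; then
    protected_state=alive
else
    protected_state=dead
fi

kill "$protected" 2>/dev/null        # clean up the surviving job
echo "plain job exit status: $plain_status; nohup job: $protected_state"
```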