Yesterday it was suggested to me that using command substitution in bash causes an unnecessary subshell to be spawned. The advice was specific to this use case:
Update and caveat:
This answer has a troubled past in that I confidently claimed things that turned out not to be true. I believe it has value in its current form, but please help me eliminate other inaccuracies (or convince me that it should be deleted altogether).
I've substantially revised - and mostly gutted - this answer after @kojiro pointed out that my testing methods were flawed (I originally used ps to look for child processes, but that's too slow to always detect them); a new testing method is described below.
I originally claimed that not all bash subshells run in their own child process, but that turns out not to be true.
As @kojiro states in his answer, some shells - other than bash - DO sometimes avoid creation of child processes for subshells, so, generally speaking in the world of shells, one should not assume that a subshell implies a child process.
As for the OP's cases in bash (assumes that command{n} instances are simple commands):
# Case #1
command1 # NO subshell
var=$(command1) # 1 subshell (command substitution)
# Case #2
command1 | command2 # 2 subshells (1 for each pipeline segment)
var=$(command1 | command2) # 3 subshells: + 1 for command subst.
# Case #3
command1 | command2 ; var=$? # 2 subshells (due to the pipeline)
var=$(command1 | command2 ; echo $?) # 3 subshells: + 1 for command subst.;
# note that the extra command doesn't add
# one
It looks like using command substitution ($(...)) always adds an extra subshell in bash - as does enclosing any command in (...).
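To illustrate why the extra subshell matters in practice, here is a minimal sketch (the variable names counter and counter_seen are made up for this example). It shows that assignments made inside $(...) or (...) do not reach the calling shell:

```bash
#!/usr/bin/env bash
# Hypothetical variable names; any variable would behave the same way.
counter=0

# The assignment happens inside the command substitution's subshell.
counter_seen=$(counter=1; echo "$counter")

# The same holds for an explicit ( ... ) group.
( counter=2 )

echo "inside \$(...): $counter_seen"
echo "in the calling shell: $counter"
```

The first line reports the value seen inside the subshell; the second shows that the calling shell's counter is unchanged.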
I believe, but am not certain, that these results are correct; here's how I tested (bash 3.2.51 on OS X 10.9.1) - please tell me if this approach is flawed:
- Traced fork() calls in the 1st shell with sudo dtruss -t fork -f -p {pidOfShell1} (the -f is necessary to also trace fork() calls "transitively", i.e. to include those created by subshells themselves).
- Used only the builtin : (no-op) in the test commands (to avoid muddling the picture with additional fork() calls for external executables); specifically:
:
$(:)
: | :
$(: | :)
: | :; :
$(: | :; :)
- Only counted those dtruss output lines that contained a non-zero PID (as each child process also reports the fork() call that created it, but with PID 0).
fork().
Below is what I still believe to be correct from my original post: when bash creates subshells.
bash creates subshells in the following situations:
- for an expression enclosed in parentheses ((...))
  - except inside [[ ... ]], where parentheses are only used for logical grouping.
- for every segment of a pipeline (|), including the first one
  - bash 4.2+ has shell option lastpipe (OFF by default), which causes the last pipeline segment NOT to run in a subshell.
- for command substitution ($(...))
- for process substitution (<(...))
  - exec (<(exec ...))
- for background execution (&)

Combining these constructs will result in more than one subshell.
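The effect of lastpipe can be observed with read as the last pipeline segment. This is only a sketch, assuming bash 4.2 or later and a non-interactive script; lastpipe is ignored while job control is active, hence the set +m. The variable names from_default and from_lastpipe are made up for the example.

```bash
#!/usr/bin/env bash
# Sketch only: requires bash 4.2+ for the lastpipe option.
set +m                      # lastpipe has no effect while job control is active

shopt -u lastpipe           # default behaviour
unset from_default
echo hello | read -r from_default     # read runs in a subshell; the value is lost

shopt -s lastpipe
unset from_lastpipe
echo hello | read -r from_lastpipe    # read now runs in the current shell

echo "default:  ${from_default-<unset>}"
echo "lastpipe: ${from_lastpipe-<unset>}"

shopt -u lastpipe           # restore the default
```

With lastpipe off, the variable set by read should not survive the pipeline; with it on, it should hold the piped value.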
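In Bash, a subshell always executes in a new process space. You can verify this fairly trivially in Bash 4, which has the $BASHPID shell variable in addition to $$. Inside a subshell, $$ still expands to the PID of the invoking shell, whereas $BASHPID expands to the PID of the current bash process, so the two differ exactly when a new process was created.
In practice: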
$ type echo
echo is a shell builtin
$ echo $$-$BASHPID
4671-4671
$ ( echo $$-$BASHPID )
4671-4929
$ echo $( echo $$-$BASHPID )
4671-4930
$ echo $$-$BASHPID | { read; echo $REPLY:$$-$BASHPID; }
4671-5086:4671-5087
$ var=$(echo $$-$BASHPID ); echo $var
4671-5006
About the only case where the shell can elide an extra subshell is when you pipe to an explicit subshell:
$ echo $$-$BASHPID | ( read; echo $REPLY:$$-$BASHPID; )
4671-5118:4671-5119
Here, the subshell implied by the pipe is explicitly applied, but not duplicated.
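The same kind of check can be scripted rather than typed interactively. A minimal sketch, assuming bash 4+ for $BASHPID; the helper name report and the temp file are only for this illustration. It compares the current PID with the PID recorded at the top of the script:

```bash
#!/usr/bin/env bash
# Requires bash 4+ (BASHPID). "report" is a hypothetical helper.
top_pid=$BASHPID
tmp=$(mktemp)

# Prints "same" when run in the top-level process, "child" otherwise.
report() {
  if [ "$BASHPID" = "$top_pid" ]; then echo same; else echo child; fi
}

report > "$tmp";        read -r plain  < "$tmp"   # simple command
{ report; } > "$tmp";   read -r braces < "$tmp"   # { ...; } group, no subshell
( report ) > "$tmp";    read -r paren  < "$tmp"   # ( ... ) subshell
report | cat > "$tmp";  read -r piped  < "$tmp"   # first pipeline segment
subst=$(report)                                   # command substitution
rm -f "$tmp"

printf '%s\n' "plain=$plain braces=$braces paren=$paren piped=$piped subst=$subst"
```

Under bash 4 or later I would expect "same" for the plain command and the { ...; } group, and "child" for the other three, consistent with the transcript above.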
This differs from some other shells that try very hard to avoid fork-ing. Therefore, while I find the argument made in js-shell-parse misleading, it is true that not all shells always fork for all subshells.