Is there an alternative to "tee" which captures STDOUT/STDERR of the command being executed and exits with the same exit status as the processed command? Something as followin
This is what I consider to be the best pure-Bourne-shell solution to use as the base upon which you could build your "eet":
# You want to pipe command1 through command2:
exec 4>&1
exitstatus=`{ { command1; printf $? 1>&3; } | command2 1>&4; } 3>&1`
# $exitstatus now has command1's exit status.
I think this is best explained from the inside out – command1 will execute and print its regular output on stdout (file descriptor 1), then once it's done, printf will execute and print command1's exit code on its stdout, but that stdout is redirected to file descriptor 3.
While command1 is running, its stdout is piped to command2. printf's output never reaches command2, because we send it to file descriptor 3 instead of file descriptor 1, and the pipe only reads file descriptor 1. We then redirect command2's output to file descriptor 4 so that it also stays out of file descriptor 1. We want file descriptor 1 kept free because, a little later, we bring the printf output on file descriptor 3 back down into file descriptor 1, which is what the command substitution (the backticks) captures and places into the variable.
The final bit of magic is the first exec 4>&1, which we ran as a separate command: it opens file descriptor 4 as a copy of the external shell's stdout. Command substitution captures whatever is written to standard output from the perspective of the commands inside it. Since command2's output goes to file descriptor 4, the command substitution doesn't capture it; once it gets "out" of the command substitution, it is effectively still going to the script's overall file descriptor 1.
(The exec 4>&1 has to be a separate command because many common shells don't like it when, inside a command substitution, you try to write to a file descriptor that is opened in the "external" command using the substitution. So this is the simplest portable way to do it.)
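To see the role of file descriptor 4 in isolation, here is a minimal sketch with two echo commands standing in for real work (the messages and the variable name captured are just for illustration). Only the text written to file descriptor 1 should end up in the variable, while the text written to file descriptor 4 goes straight to the script's stdout:

```shell
# Duplicate the script's current stdout onto file descriptor 4
# before the command substitution starts.
exec 4>&1

# Inside the backticks, fd 1 is the capture pipe, but fd 4 still
# points at the script's original stdout.
captured=`echo "sent to fd 4" 1>&4; echo "sent to fd 1"`

# Close fd 4 again once it is no longer needed.
exec 4>&-

echo "captured: $captured"
```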
You can look at it in a less technical and more playful way, as if the outputs of the commands are leapfrogging each other: command1 pipes to command2, then the printf's output jumps over command2 so that command2 doesn't catch it. Then command2's output jumps over and out of the command substitution, just as printf lands in time to get captured by the substitution so that it ends up in the variable, and command2's output goes on its merry way to the standard output, just as in a normal pipe.
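Here is the same construct with concrete stand-ins, as a sketch rather than anything definitive. The two shell functions are assumptions made up for the demonstration: command1 prints a line and fails with status 3, and command2 upper-cases its input. I also write printf '%s' "$?" instead of printf $?, which behaves the same here but doesn't use the status as a format string, and I close file descriptor 4 at the end:

```shell
# Stand-ins (assumptions for this demo only):
command1() { echo "hello from command1"; return 3; }  # fails with status 3
command2() { tr 'a-z' 'A-Z'; }                        # upper-cases its input

exec 4>&1
exitstatus=`{ { command1; printf '%s' "$?" 1>&3; } | command2 1>&4; } 3>&1`
exec 4>&-

# command2's output has already gone to stdout; the variable holds
# only what printf wrote to fd 3.
echo "command1 exited with: $exitstatus"
```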
Also, as I understand it, $? will still contain the return code of the second command in the pipe, because variable assignments, command substitutions, and compound commands are all effectively transparent to the return code of the command inside them, so the return status of command2 should get propagated out.
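A quick way to check this in your own shell is the sketch below. Both functions are hypothetical stand-ins that fail with different statuses, and cmd2_status is just a name I picked; in the POSIX-style shells I would expect exitstatus to hold command1's status and cmd2_status to hold command2's:

```shell
# Stand-ins (assumptions for this demo only):
command1() { return 3; }        # fails with status 3, prints nothing
command2() { cat; return 5; }   # copies its input, then fails with status 5

exec 4>&1
exitstatus=`{ { command1; printf '%s' "$?" 1>&3; } | command2 1>&4; } 3>&1`
cmd2_status=$?   # must be read immediately after the assignment
exec 4>&-

echo "command1: $exitstatus, command2: $cmd2_status"
```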
A caveat is that it's possible that command1 will at some point end up using file descriptors 3 or 4, or that command2 or any of the later commands will use file descriptor 4, so to be more robust, you would do:
exec 4>&1
exitstatus=`{ { command1 3>&-; printf $? 1>&3; } 4>&- | command2 1>&4; } 3>&1`
exec 4>&-
Note that I use compound commands in my example, but subshells (using ( ) instead of { }) will also work, though they may be less efficient.
Commands inherit file descriptors from the process that launches them, so the entire second line will inherit file descriptor four, and the compound command followed by 3>&1 will inherit file descriptor three. So the 4>&- makes sure that the inner compound command will not inherit file descriptor four, and the 3>&- makes sure that command1 will not inherit file descriptor three, so command1 gets a 'cleaner', more standard environment. You could also move the inner 4>&- next to the 3>&-, but I figure why not just limit its scope as much as possible.
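The inheritance is easy to observe with a small probe. This is only a sketch: probe is a hypothetical helper, a child shell that tries an empty write to file descriptor 3 and reports whether that worked. It runs once with the descriptor inherited and once with 3>&- applied to just that command:

```shell
# probe (hypothetical helper): script for a child shell that reports
# whether it was started with file descriptor 3 open. The test write
# runs in a subshell so a failed redirection cannot abort the child.
probe='if (: >&3) 2>/dev/null; then echo "fd 3 is open"; else echo "fd 3 is closed"; fi'

# The compound command opens fd 3; the child inherits it.
inherited=`{ sh -c "$probe"; } 3>/dev/null`

# Same, but fd 3 is closed for the child only.
closed=`{ sh -c "$probe" 3>&-; } 3>/dev/null`

echo "without 3>&-: $inherited"
echo "with 3>&-:    $closed"
```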
I'm not sure how often things use file descriptor three and four directly – I think most of the time programs use syscalls that return not-used-at-the-moment file descriptors, but sometimes code writes to file descriptor 3 directly, I guess (I could imagine a program checking a file descriptor to see if it's open, and using it if it is, or behaving differently accordingly if it's not). So the latter is probably best to keep in mind and use for general-purpose cases.
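Putting it together, one possible shape for the "eet" from the question is the function below. Treat it as a sketch under my own assumptions: the calling convention (log file first, then the command), the eet_ variable names, and the choice to merge stderr into stdout with 2>&1 are all mine, and command2 is simply tee writing to the log file:

```shell
# eet (hypothetical): run a command, copy its combined stdout/stderr
# into a log file via tee, and return the command's own exit status.
# Usage: eet LOGFILE COMMAND [ARGS...]
eet() {
    eet_log=$1
    shift

    # fd 4 is a copy of our stdout, so tee's output escapes the
    # command substitution below.
    exec 4>&1

    # "$@" plays the role of command1 (stderr merged into the pipe),
    # tee plays the role of command2.
    eet_status=`{ { "$@" 3>&- 2>&1; printf '%s' "$?" 1>&3; } 4>&- | tee "$eet_log" 1>&4; } 3>&1`

    exec 4>&-
    return "$eet_status"
}
```

For instance, eet build.log make would show make's output as usual, keep a copy in build.log, and leave make's exit status in $?.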
---OUTDATED CONTENT BELOW THIS LINE---
For historical reasons, here is my original, not-portable-to-all-shells answer:
[EDIT] My bad, this does not work with bash because bash needs extra coddling when fiddling with file descriptors, I will update this as soon as I can. [/EDIT]
Pure Bourne shell solution:
exitstatus=`{ 3>&- command1; } 1>&3; printf $?` 3>&1 | command2
# $exitstatus now has command1's exit status.
This is the base upon which you could build your "eet". Slap in some commandline argument parsing and all that, turn command2 into "tee" with the relevant options, etc.
The VERY detailed explanation is as follows:
At the top level, the statement is just a pipe between two commands:
commandA | command2
commandA in turn breaks down to a single command with a redirection of file descriptor 3 to file descriptor 1 (stdout):
commandB 3>&1
This means that the shell will be expecting commandB to write something to file descriptor 3 - if file descriptor 3 is never opened, it would be an error. It also means that command2 will get whatever commandB outputs on both file descriptors 1 (stdout) and 3.
commandB in turn is a variable assignment using command substitution:
VAR_FOO=`commandC`
We know that variable assignments do not print anything on any file descriptors (and commandC's stdout is captured for the substitution), so we know that commandB as a whole will not output anything on stdout. command2 will thus only see what commandC writes to file descriptor 3.
And commandC is two commands, where the second command prints the exit status of the first:
commandD ; printf $?
So by now we know the variable assignment in the last step will contain the exit status of commandD.
Now, commandD decomposes to another basic redirection, of a command's stdout to file descriptor 3:
commandE 1>&3
So now we know that the thing writing to file descriptor 3, and thus ultimately to command2, is commandE's stdout.
Finally: commandE is a "compound command" (you can also use a subshell here, but it's not as efficient), wrapping around another less commonly seen type of "redirection":
{ 3>&- command1; }
(That 3>&- is a bit tricky, so we'll come back to it at the end.) Compound commands make the semicolon mandatory when the last command and the closing brace are on the same line, which is why it is there. Compound commands return the exit code of their last command, and they inherit file descriptors like everything else. So command1's stdout flows out of the compound command and is redirected to file descriptor 3 to avoid being caught by the command substitution. Meanwhile, the command substitution catches the remaining stdout, which is the printf echoing command1's exit status once it is done.
And now for the tricky bit: 3>&- says "close file descriptor 3". You may think, "why are you closing it when you just redirected command1's output to it?" Well, if you look carefully, you'll see that the close affects only command1 inside the compound command (inside the curly braces), while the redirection affects the entire compound command.
So here is what happens: by the time the individual commands of the compound command run, the shell has opened file descriptor 3. Processes inherit file descriptors, so command1, by default, will run with file descriptor 3 open and pointing to the same place too. This is bad because occasionally, programs actually expect specific file descriptors to mean special things - they may behave differently when launched with file descriptor 3 open. The most robust solution is to close file descriptor 3 (or whichever number you use) for just command1, so it runs as it would if file descriptor 3 had never been opened.