Question
This question helped me understand the difference between redirection and piping, but the examples focus on redirecting STDOUT (echo foo > bar.txt) and piping STDIN (ls | grep foo).
It would seem to me that any command that could be written my_command < file.txt could also be written cat file.txt | my_command. In what situations is STDIN redirection necessary?
Apart from the fact that using cat spawns an extra process and is less efficient than redirecting STDIN, are there situations in which you have to use STDIN redirection? Put another way, is there ever a reason to pipe the output of cat to another command?
Answer 1:
What's the difference between my_command < file.txt and cat file.txt | my_command?
my_command < file.txt
The redirection symbol can also be written as 0<, since it redirects file descriptor 0 (stdin) to file.txt instead of its current target, which is probably the terminal. If my_command is a shell built-in, no child process is created; otherwise there is one.
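A quick way to check that < and 0< are the same redirection. This is only a sketch: wc -l stands in for my_command, and the temporary file stands in for file.txt (both are illustrative assumptions).

```shell
# Create a small sample file (hypothetical stand-in for file.txt).
sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"

# Both forms attach the file to file descriptor 0 of wc.
plain=$(wc -l < "$sample")
explicit=$(wc -l 0< "$sample")

echo "plain: $plain, explicit: $explicit"
rm -f "$sample"
```

Both variables should hold the same line count, since only the spelling of the redirection differs.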
cat file.txt | my_command
This connects file descriptor 1 (stdout) of the command on the left to the write end of an anonymous pipe, and file descriptor 0 (stdin) of the command on the right to the read end of that pipe.
We see at once that there is a child process, since cat is not a shell built-in. However, in bash even if my_command is a shell built-in it is still run in a child process. Therefore we have two child processes.
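The effect of running a built-in in a child process can be observed with read, which is a built-in that sets a variable. The sketch below assumes bash with its default settings; the variable names and the temporary file are made up for illustration.

```shell
sample=$(mktemp)
echo 'hello' > "$sample"

# read is a built-in, but on the right of a pipe bash runs it in a child
# process, so the variable it sets is lost when that child exits.
via_pipe=''
cat "$sample" | read -r via_pipe

# With redirection the built-in runs in the current shell.
via_redirect=''
read -r via_redirect < "$sample"

echo "pipe: '$via_pipe' redirect: '$via_redirect'"
rm -f "$sample"
```

Under those assumptions via_pipe stays empty while via_redirect holds the line that was read.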
So the pipe, in theory, is less efficient. Whether that difference is significant depends on many factors, including the definition of "significant". A pipe is preferable when the alternative is this:
command1 > file.txt
command2 < file.txt
Here it is likely that command1 | command2 is more efficient, bearing in mind that in practice the two-step version will probably need a third child process for rm file.txt.
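As a rough side-by-side of the two approaches, here printf plays the role of command1 and wc -l the role of command2 (both are stand-ins chosen for illustration, not part of the original answer):

```shell
# Two-step version: write to a temporary file, read it back, then clean up.
tmp=$(mktemp)
printf '%s\n' alpha beta gamma > "$tmp"
via_file=$(wc -l < "$tmp")
rm -f "$tmp"

# Pipe version: no intermediate file and nothing to remove afterwards.
via_pipe=$(printf '%s\n' alpha beta gamma | wc -l)

echo "via file: $via_file, via pipe: $via_pipe"
```

The results match; the pipe version simply avoids the temporary file and the extra rm.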
However, there are limitations to pipes. They are not seekable (random access, see man 2 lseek) and they cannot be memory mapped (see man 2 mmap). Some applications map files to virtual memory, but it would be unusual to do that to stdin or stdout. Memory mapping in particular is not possible on a pipe (whether anonymous or named) because a range of virtual addresses has to be reserved and for that a size is required.
Edit:
As mentioned by @JohnKugelman, a common error and source of many SO questions is the associated issue with a child process and redirection:
Take a file file.txt with 99 lines:
i=0
cat file.txt | while read
do
    (( i = i+1 ))
done
echo "$i"
What gets displayed? The answer is 0. Why? Because the increment i = i + 1 is done in a subshell which, in bash, is a child process and does not change i in the parent (note: this does not apply to the Korn shell, ksh).
i=0
while read
do
    (( i = i+1 ))
done < file.txt
echo "$i"
This displays the correct count because no child processes are involved.
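To reproduce both results in a single run, the sketch below generates a 99-line file first. It assumes seq is available and uses a temporary file in place of file.txt; the variable names are illustrative.

```shell
file=$(mktemp)
seq 1 99 > "$file"   # 99 lines

# Pipe: in bash the loop runs in a child process, so the parent's
# counter is never updated.
piped=0
cat "$file" | while read -r line; do
    piped=$((piped + 1))
done

# Redirection: the loop runs in the current shell.
redirected=0
while read -r line; do
    redirected=$((redirected + 1))
done < "$file"

echo "piped=$piped redirected=$redirected"
rm -f "$file"
```

In bash this prints piped=0 redirected=99.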
Answer 2:
You can of course replace any use of input redirection with a pipe that reads from cat, but it is inefficient to do so, as you are spawning a new process to do something the shell can already do by itself. However, not every instance of cat ... | my_command can be replaced with my_command < ... because a single redirection names only one file. When cat is doing its intended job of concatenating two (or more) files, it is perfectly reasonable to pipe its output to another command.
cat file1.txt file2.txt | my_command
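A small check of that difference, with wc -l standing in for my_command and two throwaway files standing in for file1.txt and file2.txt (all illustrative assumptions):

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/file1.txt"
printf 'c\nd\ne\n' > "$dir/file2.txt"

# cat joins both files into one stream, so wc sees every line.
both=$(cat "$dir/file1.txt" "$dir/file2.txt" | wc -l)

# A single stdin redirection can name only one file.
one=$(wc -l < "$dir/file1.txt")

echo "both files: $both, one file: $one"
rm -rf "$dir"
```

The piped form counts the lines of both files together, which a lone < cannot do.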
Source: https://stackoverflow.com/questions/48446896/do-cat-foo-txt-my-cmd-and-my-cmd-foo-txt-accomplish-the-same-thing