Why doesn't “sort file1 > file1” work?

忘了有多久 2020-12-03 21:23

When I try to sort a file and save the sorted output back into the same file, like this:

sort file1 > file1;

the contents of the file1 is getting

7 Answers
  • 2020-12-03 21:38

    As other people explained, the problem is that the I/O redirection is done before the sort command is executed, so the file is truncated before sort gets a chance to read it. If you think for a bit, the reason why is obvious - the shell handles the I/O redirection, and must do that before running the command.
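
    To see that the truncation comes from the shell and not from sort, here is a minimal sketch that redirects the output of a command that never touches the file at all. The scratch directory and the variable names are only for illustration.

    ```shell
    # Work in a scratch directory so no real file is touched (illustrative setup).
    workdir=$(mktemp -d) && cd "$workdir" || exit 1
    printf 'b\na\nc\n' > file1
    size_before=$(wc -c < file1 | tr -d ' ')
    # "true" reads nothing and writes nothing, yet file1 ends up empty:
    # the shell opens it for writing (with truncation) before starting the command.
    true > file1
    size_after=$(wc -c < file1 | tr -d ' ')
    echo "before: $size_before bytes, after: $size_after bytes"
    ```

    The same thing happens with sort file1 > file1: by the time sort opens file1 for reading, it is already empty.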

    The sort command has 'always' (since at least Version 7 UNIX) supported a -o option to make it safe to output to one of the input files:

    sort -o file1 file1 file2 file3
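
    A small sketch of the -o form with a single input file, run in a scratch directory (the sample data and variable names are made up for the example):

    ```shell
    # Illustrative setup: a throwaway directory with an unsorted file1.
    workdir=$(mktemp -d) && cd "$workdir" || exit 1
    printf 'pear\napple\nfig\n' > file1
    # -o names the output file; it may be the same as one of the input files,
    # because sort reads all of its input before writing the result.
    sort -o file1 file1
    result=$(cat file1)
    printf '%s\n' "$result"
    ```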
    

    The trick with tee depends on timing and luck (and probably a small data file). If you had a megabyte or larger file, I expect it would be clobbered, at least in part, by the tee command. That is, if the file is large enough, the tee command would open the file for output and truncate it before sort finished reading it.

  • 2020-12-03 21:51

    The shell sets up redirections before it runs the command. So in the first case, > file1 is processed first and empties the file, and only then does sort start.

  • 2020-12-03 22:01

    It's unwise to depend on either of these commands to work the way you expect.

    The way to modify a file in place is to write the modified version to a new file, then rename the new file to the original name:

    sort file1 > file1.tmp && mv file1.tmp file1
    

    This avoids the problem of reading the file after it's been partially modified, which is likely to mess up the results. It also makes it possible to deal gracefully with errors; if the file is N bytes long, and you only have N/2 bytes of space available on the file system, you can detect the failure creating the temporary file and not do the rename.
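
    A slightly more defensive sketch of the same idea, assuming mktemp is available (it is common but not required by POSIX). The temporary file is created next to the target, and the original is replaced only if sort succeeds. The sample data and variable names are illustrative.

    ```shell
    # Illustrative setup: a throwaway directory with an unsorted file1.
    workdir=$(mktemp -d) && cd "$workdir" || exit 1
    printf '3\n1\n2\n' > file1
    # Create the temp file beside the target so mv stays on the same file system.
    tmp=$(mktemp file1.XXXXXX) || exit 1
    if sort file1 > "$tmp"; then
        mv "$tmp" file1      # replace the original only after sort succeeded
    else
        rm -f "$tmp"         # on failure, leave file1 untouched
    fi
    sorted=$(cat file1)
    ```

    Note that the replaced file takes on the permissions of the temporary file, which may differ from the original's.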

    Or you can rename the original file, then read it and write to a new file with the same name:

    mv file1 file1.bak && sort file1.bak > file1
    

    Some commands have options to modify files in place; for example, perl and sed both have -i options (note that the syntax of sed's -i option can vary). But these options work by creating temporary files; it's just done internally.

  • 2020-12-03 22:02

    It doesn't work because '>' redirection implies truncation. To avoid having to keep the whole output of sort in memory before redirecting it to the file, bash truncates the file and sets up the redirection before running sort. Thus, the contents of file1 are gone before sort has a chance to read them.

  • 2020-12-03 22:03

    Bash opens a new, empty file when it processes the redirection, and only then calls sort.

    In the second case, tee usually opens the file only after sort has already read the contents, but nothing guarantees that ordering.

  • 2020-12-03 22:04

    The first command (sort file1 > file1) doesn't work because, with the > redirection operator, the shell creates or truncates the file before the sort command is even invoked: redirections are set up before the command runs.

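    The second command (sort file1 | tee file1) often appears to work, because sort reads all lines from the file first and only then writes the sorted data to standard output. It is still not reliable, since tee may truncate the file before sort has finished reading it.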

    So with this or any similar command, avoid the redirection operator when reading from and writing to the same file. Use a suitable in-place editor instead (e.g. ex, ed, sed), for example:

    ex '+%!sort' -cwq file1
    

    or use other utils such as sponge.
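
    With sponge (from the moreutils package) installed, the usual form is sort file1 | sponge file1. For systems without it, here is a rough sketch of the same idea as a shell function; soak is a hypothetical name invented for this example, not a real utility, and it lacks sponge's extra safeguards.

    ```shell
    # Hypothetical stand-in for sponge: buffer all of stdin, then overwrite the target.
    soak() {
        buf=$(mktemp) || return 1
        cat > "$buf"          # returns only at end of input, after the producer is done
        cat "$buf" > "$1"     # the target is truncated only at this point
        rm -f "$buf"
    }
    # Illustrative setup: a throwaway directory with an unsorted file1.
    workdir=$(mktemp -d) && cd "$workdir" || exit 1
    printf 'z\nx\ny\n' > file1
    sort file1 | soak file1
    soaked=$(cat file1)
    ```

    The difference from tee is that the target file is not opened for writing until all input has arrived, so sort has already finished reading it.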

    Luckily, sort has the -o option, which writes the result to the named file (as suggested by @Jonathan), so the solution is straightforward: sort -o file1 file1.
