Removing trailing / starting newlines with sed, awk, tr, and friends

一个人的身影 2021-01-30 21:04

I would like to remove all of the empty lines from a file, but only when they are at the end/start of a file (that is, if there are no non-empty lines before them, at the start;

16 answers
  •  小鲜肉 (original poster)
    2021-01-30 21:25

    For an efficient non-recursive way to strip trailing newlines (including lines that contain only whitespace characters), I've developed this sed script:

    sed -n '/^[[:space:]]*$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^[[:space:]]*$/H'
    

    It accumulates blank lines in the hold space and prints them only once a non-blank line follows, so a trailing run of blank lines is never printed. To treat only truly empty lines as blank, drop the two [[:space:]]* parts:

    sed -n '/^$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^$/H'
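
    To make the hold-space behaviour concrete, here is a minimal sketch that runs the whitespace-aware script on a tiny input. The function name strip_trailing_blank is a hypothetical wrapper added for illustration, and the sketch assumes GNU sed; `sed -n l` is used only to make line ends visible.

```shell
#!/bin/sh
# Hypothetical wrapper name; the sed program is the one quoted above.
strip_trailing_blank() {
    sed -n '/^[[:space:]]*$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^[[:space:]]*$/H'
}

# Input lines: "a", "", "  ", "b", "", "" (two trailing empty lines).
# `sed -n l` marks each line end with `$` so blank lines stay visible.
printf 'a\n\n  \nb\n\n\n' | strip_trailing_blank | sed -n l
# Prints:
# a$
# $
#   $
# b$
```

    The interior blank lines survive because they are flushed from the hold space when "b" arrives; the final two are still held when input ends and are never printed.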
    

    I've tried a simple performance comparison with the well-known recursive script

    sed -e :a -e '/^\n*$/{$d;N;};/\n$/ba'
    

    on a 3 MB file: 1 MB of random base64 text with 1 MB of random blank lines before it and another 1 MB after it.

    shuf -re 1 2 3 | tr -d "\n" | tr 123 " \t\n" | dd bs=1 count=1M > bigfile
    base64 </dev/urandom | dd bs=1 count=1M >> bigfile
    shuf -re 1 2 3 | tr -d "\n" | tr 123 " \t\n" | dd bs=1 count=1M >> bigfile
    

    The streaming script took roughly 0.5 seconds to complete; the recursive one had not finished after 15 minutes. Win :)
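
    Before timing anything, it may be worth checking that the two scripts agree on a small input. The following is a rough sketch, assuming GNU sed; the function names strip_streaming and strip_looping are hypothetical wrappers, and the empty-line-only variant of the streaming script is used because the recursive script only recognises truly empty lines.

```shell
#!/bin/sh
# Hypothetical function names wrapping the two scripts quoted above.
strip_streaming() {
    sed -n '/^$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^$/H'
}
strip_looping() {
    sed -e :a -e '/^\n*$/{$d;N;};/\n$/ba'
}

# Small sample with interior and trailing empty lines.
sample=$(mktemp)
printf 'a\n\n\nb\n\n\n' > "$sample"

strip_streaming < "$sample" > "$sample.streaming"
strip_looping   < "$sample" > "$sample.looping"

# cmp -s exits 0 only when the two outputs are byte-identical.
if cmp -s "$sample.streaming" "$sample.looping"; then
    echo "outputs match"
else
    echo "outputs differ"
fi

rm -f "$sample" "$sample.streaming" "$sample.looping"
```

    To reproduce the timing, one option is to prefix each sed command with time and redirect the output to /dev/null; the numbers will depend on the machine and the sed implementation.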

    For completeness, the usual sed scripts for stripping leading blank lines already stream fine. Use whichever suits you: the first also treats lines containing only spaces or tabs as blank, the second strips only truly empty lines.

    sed '/[^[:blank:]]/,$!d'
    sed '/./,$!d'
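
    Since the question asks about both ends of the file, the leading and trailing scripts can be chained in one pipeline. This is a sketch rather than a definitive solution: trim_blank_edges is a hypothetical helper name, and GNU sed is assumed.

```shell
#!/bin/sh
# Hypothetical helper combining the two commands quoted above.
trim_blank_edges() {
    # Stage 1: delete everything before the first line with a non-blank char.
    sed '/[^[:blank:]]/,$!d' |
    # Stage 2: hold blank lines and print them only if a non-blank line follows.
    sed -n '/^[[:space:]]*$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^[[:space:]]*$/H'
}

# Input lines: "", " <tab>", "foo", "", "bar", " ", ""
printf '\n \t\nfoo\n\nbar\n \n\n' | trim_blank_edges | sed -n l
# Prints:
# foo$
# $
# bar$
```

    Both stages process the input line by line, so the pipeline keeps the streaming behaviour of the individual scripts.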
    
