Removing trailing / starting newlines with sed, awk, tr, and friends


一个人的身影 2021-01-30 21:04

I would like to remove all of the empty lines from a file, but only when they are at the start or end of the file (that is, if there are no non-empty lines before them, at the start; and if there are no non-empty lines after them, at the end).

16 answers

小鲜肉 (OP)

2021-01-30 21:25
For an efficient, non-recursive way to strip trailing blank lines (including lines that contain only whitespace characters), I've developed this sed script.
```
sed -n '/^[[:space:]]*$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^[[:space:]]*$/H'
```
It collects blank lines in the hold space and prints them only once a non-blank line follows, so blank lines at the end of the input are never printed. If only truly empty lines should count as blank, it's enough to drop the two [[:space:]]* parts:
```
sed -n '/^$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^$/H'
```
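To make the hold-space behaviour concrete, here is a minimal sketch that runs the whitespace-aware script on a tiny input. The function name strip_trailing_blank and the sample text are made up for illustration, and the trace was checked against GNU sed semantics only.

```shell
# Hypothetical wrapper around the script above (the name is an assumption).
strip_trailing_blank() {
  sed -n '/^[[:space:]]*$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^[[:space:]]*$/H'
}

# Sample input: 2 leading blank lines, "foo", a space-only line, "bar",
# then 3 trailing blank lines (one of them containing a tab).
printf '\n\nfoo\n \nbar\n\n\t\n\n' | strip_trailing_blank | wc -l
# 8 lines go in, 5 come out: the leading and inner blank lines survive,
# only the trailing ones are dropped.
```

Note that the leading blank lines are kept on purpose: they are held back and flushed as soon as "foo" arrives.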
I've tried a simple performance comparison with the well-known recursive script
```
sed -e :a -e '/^\n*$/{$d;N;};/\n$/ba'
```
on a 3 MB file: 1 MB of random base64 text with 1 MB of random blank lines before it and another 1 MB after it.
```
shuf -re 1 2 3 | tr -d "\n" | tr 123 " \t\n" | dd bs=1 count=1M > bigfile
base64 </dev/urandom | dd bs=1 count=1M >> bigfile
shuf -re 1 2 3 | tr -d "\n" | tr 123 " \t\n" | dd bs=1 count=1M >> bigfile
```
The streaming script took roughly 0.5 seconds to complete; the recursive one had not finished after 15 minutes. Win :)

For the sake of completeness: the sed scripts for stripping leading blank lines already stream fine. The first one also treats lines of only spaces and tabs as blank, the second only truly empty lines. Use whichever suits you best.
```
sed '/[^[:blank:]]/,$!d'
sed '/./,$!d'
```
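Since the question asks about both ends of the file, the two streaming scripts can be chained in a pipeline. The following is only a sketch: the function name trim_blank_edges is an assumption, and [[:space:]] is used in both stages (instead of [[:blank:]]) so that they agree on what counts as a blank line.

```shell
# Hypothetical helper (the name is an assumption): drop blank lines at the
# start and at the end of stdin, keep blank lines in the middle.
trim_blank_edges() {
  # Stage 1: delete everything before the first line with a non-space character.
  sed '/[^[:space:]]/,$!d' |
    # Stage 2: hold blank lines back and print them only if a non-blank line follows.
    sed -n '/^[[:space:]]*$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^[[:space:]]*$/H'
}

# Sample input: empty line, space-only line, "foo", empty line, "bar", empty line.
printf '\n \nfoo\n\nbar\n\n' | trim_blank_edges
# Prints "foo", an empty line, and "bar".
```

Both stages process the input line by line, so the pipeline keeps the streaming property of the individual scripts.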