Removing trailing / starting newlines with sed, awk, tr, and friends


I would like to remove all of the empty lines from a file, but only when they are at the end/start of a file (that is, if there are no non-empty lines before them, at the start; and if there are no non-empty lines after them, at the end).


16 answers

慢半拍i

2021-01-30 21:23
As mentioned in another answer, tac is part of coreutils and reverses the lines of a file. Combining the idea of running it twice with the fact that command substitution strips trailing newlines, we get
```
echo "$(echo "$(tac "$filename")" | tac)"
```
which doesn't depend on sed. The output still ends with one newline, added by the outer echo; use echo -n to drop that one as well.
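A minimal sketch of the same double-tac idea wrapped in a function, assuming GNU tac is available and the input file ends with a newline. The name trim_edges_tac is made up for illustration, and printf replaces echo so that backslashes in the file pass through unchanged.

```shell
# Hypothetical helper (not from the answer): strip leading and trailing
# empty lines from the file named in $1 by reversing it twice with tac.
trim_edges_tac() (
    # Reverse the file; $(...) then drops what are now trailing newlines,
    # i.e. the leading empty lines of the original.
    reversed=$(tac -- "$1")
    # Reverse back; $(...) drops the original trailing empty lines.
    restored=$(printf '%s\n' "$reversed" | tac)
    # Emit exactly one final newline, unless nothing is left.
    if [ -n "$restored" ]; then
        printf '%s\n' "$restored"
    fi
)
```

Usage would look like `trim_edges_tac input.txt > output.txt`. Like the one-liner, it holds the whole file in memory and only treats truly empty lines as blank.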

北恋

2021-01-30 21:25
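Using bash (the whole file is read into memory):
```
$ filecontent=$(<file)
$ leading=${filecontent%%[!$'\n']*}
$ echo "${filecontent#"$leading"}"
```
The command substitution drops the trailing newlines; leading holds the run of newlines at the start of the content, which the # expansion then removes.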

小鲜肉

2021-01-30 21:25
For an efficient, non-recursive way to strip trailing blank lines (including lines that contain only whitespace), I've developed this sed script.
```
sed -n '/^[[:space:]]*$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^[[:space:]]*$/H'
```
It uses the hold buffer to store blank lines and prints them only once a non-blank line follows. If you only care about truly empty lines, drop the two [[:space:]]* parts:
```
sed -n '/^$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^$/H'
```
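To make the hold-buffer logic easier to follow, here is the first script spread over several lines with comments. It is meant as a readable sketch of the same commands, not a different algorithm; the function name strip_trailing_blank is an assumption for illustration, and it relies on `.` matching an embedded newline in the pattern space, as the one-liner does.

```shell
# Hypothetical wrapper around the one-liner above, one command per line.
strip_trailing_blank() {
    sed -n '
        # Non-blank line: flush any buffered blank lines, then print the line.
        /^[[:space:]]*$/!{
            x
            /\n/{
                s/^\n//
                p
                s/.*//
            }
            x
            p
        }
        # Blank line: append it to the hold space instead of printing it.
        /^[[:space:]]*$/H
    ' "$@"
}
```

Blank lines that are still in the hold space when the input ends are never printed, which is what removes them. Leading blank lines are kept, since they are flushed as soon as the first non-blank line arrives.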
I ran a simple performance comparison against the well-known recursive script
```
sed -e :a -e '/^\n*$/{$d;N;};/\n$/ba'
```
on a 3 MB file: 1 MB of random base64 text with 1 MB of random blank lines on either side.
```
shuf -re 1 2 3 | tr -d "\n" | tr 123 " \t\n" | dd bs=1 count=1M > bigfile
base64 </dev/urandom | dd bs=1 count=1M >> bigfile
shuf -re 1 2 3 | tr -d "\n" | tr 123 " \t\n" | dd bs=1 count=1M >> bigfile
```
The streaming script took roughly 0.5 seconds to complete; the recursive one had not finished after 15 minutes. Win :)

For completeness, the sed script that strips leading lines already streams fine. Use whichever variant suits you: the first also treats lines containing only spaces or tabs as blank, the second only truly empty lines.
```
sed '/[^[:blank:]]/,$!d'
sed '/./,$!d'
```
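The leading and trailing scripts from this answer can be chained so that both ends are stripped in a single streaming pass. The sketch below is one possible composition, not something the answer itself spells out; trim_edges_sed is a hypothetical name, and the first stage uses [[:space:]] rather than [[:blank:]] so that both stages agree on what counts as blank.

```shell
# Hypothetical composition: strip blank lines at both ends in one pipeline.
trim_edges_sed() {
    # Stage 1: delete everything before the first line with a non-space character.
    sed '/[^[:space:]]/,$!d' "$@" |
    # Stage 2: hold blank lines back; print them only if a non-blank line follows.
    sed -n '/^[[:space:]]*$/ !{x;/\n/{s/^\n//;p;s/.*//;};x;p;}; /^[[:space:]]*$/H'
}
```

It reads the named files, or standard input when called without arguments, for example `trim_edges_sed < bigfile > trimmed`.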

无人及你

2021-01-30 21:25
Here's an awk version that removes trailing blank lines (both empty lines and lines consisting of nothing but white space).

It is memory efficient; it does not read the entire file into memory.
```
awk '/^[[:space:]]*$/ {b=b $0 "\n"; next;} {printf "%s",b; b=""; print;}'
```
The b variable buffers the blank lines; they are printed only when a non-blank line follows. At EOF the buffer is simply discarded, which is what removes the trailing blank lines.
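The same buffering idea extends to leading blank lines with one extra flag. The following is a sketch under a few assumptions: the names trim_blank_edges_awk, buf and seen are invented here, and it tests NF == 0 instead of the [[:space:]] regex, which with the default field separator treats lines of only spaces and tabs as blank.

```shell
# Hypothetical variant of the awk one-liner that trims both ends.
trim_blank_edges_awk() {
    awk '
        # Blank line: buffer it only once real content has been seen,
        # so leading blank lines are dropped outright.
        NF == 0 { if (seen) buf = buf $0 "\n"; next }
        # Non-blank line: flush buffered interior blank lines, then print.
        { printf "%s", buf; buf = ""; seen = 1; print }
    ' "$@"
}
```

As with the original, only the current run of blank lines is held in memory, so it streams over large files.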

If using GNU awk, [[:space:]] can be replaced with \s (see the full list of gawk-specific regexp operators).

If you want to remove only those trailing lines that are empty, see @AndyMortimer's answer.

隐瞒了意图╮

2021-01-30 21:32

From Useful one-line scripts for sed:
```
# Delete all leading blank lines at top of file (only).
sed '/./,$!d' file

# Delete all trailing blank lines at end of file (only).
sed -e :a -e '/^\n*$/{$d;N;};/\n$/ba' file
```
Therefore, to remove both leading and trailing blank lines from a file, you can combine the above commands into:
```
sed -e :a -e '/./,$!d;/^\n*$/{$d;N;};/\n$/ba' file
```


眼角桃花

2021-01-30 21:34
A bash solution.

Note: Only useful if the file is small enough to be read into memory at once.
```
[[ $(<file) =~ ^$'\n'*(.*)$ ]] && echo "${BASH_REMATCH[1]}"
```
- $(<file) reads the entire file and trims trailing newlines, because command substitution ($(...)) implicitly does that.
- =~ is bash's regular-expression matching operator. The regex ^$'\n'*(.*)$ greedily matches any leading newlines (possibly none) and captures whatever comes after. Note the potentially confusing $'\n', which inserts a literal newline using ANSI C quoting, because the escape sequence \n is not supported in the regex.
- This particular regex always matches, so the command after && is always executed.
- The special array variable BASH_REMATCH contains the results of the most recent regex match, and element [1] contains what the first (and only) parenthesized subexpression (capture group) captured: the input string with any leading newlines stripped. The net effect is that ${BASH_REMATCH[1]} contains the file content with both leading and trailing newlines stripped.
- Note that printing with echo adds a single trailing newline. If you want to avoid that, use echo -n instead (or use the more portable printf '%s').
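For shells without =~, the same whole-file approach can be sketched with plain parameter expansion instead of a regex. This is a swapped-in technique rather than the answer's method, the name trim_edges_param is hypothetical, and like the original it only suits files small enough to hold in memory.

```shell
# Hypothetical helper: trim leading and trailing newlines without a regex.
trim_edges_param() (
    # A variable holding a single literal newline.
    nl='
'
    # Command substitution drops all trailing newlines.
    content=$(cat -- "$1")
    # Remove the longest suffix starting at the first non-newline character;
    # what remains is the run of leading newlines (possibly empty).
    leading=${content%%[!$nl]*}
    # Strip that run from the front and print with one final newline.
    printf '%s\n' "${content#"$leading"}"
)
```

The function body runs in a subshell, so its variables do not leak into the caller.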
