I have a file that contains:
something
something else
something else again
I need a bash command (sed, grep, whatever) that will produce the fo
A solution with awk, which replaces several blank lines with a single blank line:
awk 'BEGIN{bl=0}/^$/{bl++;if(bl==1)print;else next}/^..*$/{bl=0;print}' myfile
Use awk:
awk '{ /^\s*$/?b++:b=0; if (b<=1) print }' file
Breakdown:
/^\s*$/?b++:b=0
- ? : is the ternary operator.
- /^\s*$/ matches a blank line (empty, or whitespace only).
- b is a variable that counts consecutive blank lines (b++); if the current line is non-blank, b is reset to 0.
if (b<=1) print
- Print if the current line is non-blank (b==0) or if it is the first blank line of a run (b==1).
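Note that \s is a GNU awk extension. Here is a minimal sketch of the same counting logic using the POSIX class [[:space:]] and an explicit if/else in place of the ternary. The function name squeeze_blank and the sample input are made up for illustration:

```shell
# Hypothetical helper: keep at most one blank line per run of blank lines.
squeeze_blank() {
    awk '{
        if ($0 ~ /^[[:space:]]*$/) b++   # blank line: extend the current run
        else b = 0                       # non-blank line: reset the counter
        if (b <= 1) print                # print non-blank lines and the first blank of a run
    }' "$@"
}

# Sample input modelled on the question: runs of 3 and 2 empty lines.
printf 'something\n\n\n\nsomething else\n\n\nsomething else again\n' | squeeze_blank
```

With that input, each run of empty lines should come out as exactly one empty line between the three text lines.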
By adjusting the regex, you can generalize it to other scenarios, like squeezing multiple blank quoted lines (">") in email: https://stackoverflow.com/a/59189823/12483961
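As a rough sketch of that generalization (not necessarily what the linked answer does), the pattern below treats a line consisting of ">" plus optional whitespace as "blank". The function name squeeze_quoted and the sample mail text are assumptions for illustration:

```shell
# Hypothetical helper: squeeze runs of empty quoted lines (">" followed only
# by optional whitespace) down to a single such line.
squeeze_quoted() {
    awk '{
        if ($0 ~ /^>[[:space:]]*$/) b++   # empty quoted line: extend the run
        else b = 0                        # anything else resets the counter
        if (b <= 1) print
    }' "$@"
}

printf '> hello\n>\n>\n> \n> world\n' | squeeze_quoted
```

Only the first empty ">" line of the run is kept. Nested quoting such as ">>" is not handled by this pattern.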
Usually, if I find that sed can't do something I need, I turn to awk:
awk '
BEGIN {
    blank = 0;
}
/^[[:blank:]]*$/ {
    if (!blank) {
        print;
    }
    blank = 1;
    next;
}
{
    print;
    blank = 0;
}' file
Piping it through uniq may be a solution (provided no non-empty line is immediately followed by an identical copy of itself, since uniq would collapse those too).
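A quick sketch of the uniq approach and its caveat, using made-up sample input (the variable names are only for illustration):

```shell
# Consecutive empty lines are identical, so uniq keeps only the first of each run.
squeezed=$(printf 'a\n\n\n\nb\n\nc\n' | uniq)
printf '%s\n' "$squeezed"

# Caveat: adjacent duplicate non-empty lines are collapsed as well.
collapsed=$(printf 'a\na\n\n\nb\n' | uniq)
printf '%s\n' "$collapsed"
```

Since uniq compares lines literally, whitespace-only lines that differ in their whitespace are not merged with each other or with truly empty lines.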
I take it that you'll probably want to remove lines that only have whitespace.
That can be done with:
sed '/^[[:space:]]*$/d' FILE
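A small sketch of that sed call on sample input, with the character class written inside a bracket expression and the script quoted so the shell leaves it alone. The function name delete_blank is hypothetical:

```shell
# Hypothetical helper: delete every line that is empty or whitespace-only.
delete_blank() {
    sed '/^[[:space:]]*$/d' "$@"
}

# The second line is empty, the third contains two spaces and a tab.
printf 'something\n\n  \t\nsomething else\n' | delete_blank
```

Keep in mind this removes blank lines entirely instead of squeezing each run down to one.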
If someone wants to use Perl:
perl -00pe0 < file
will do the same as cat -s
:)