here's my situation: I had a big text file that I wanted to pull certain information from. I used sed to pull all the relevant information based on regexps, but each "piece" of information ended up spread across several lines (the continuation lines start with a comma), and now I need to recombine those lines.
$ perl -0pe 's/\n,/,/g' < test.dat
92831,499,000,0644321
79217,999,000,5417178,PK91622,PK90755
Translation: read the whole input in one go with no line separation, then replace each newline that is immediately followed by a comma with just the comma.
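For concreteness, here's a sketch of the kind of input that command assumes (the question's actual file isn't shown, so the data below is made up): continuation lines begin with a comma, and the substitution splices each of them back onto the previous line to give the output above.

$ cat test.dat
92831,499,000,0644321
79217,999,000,5417178
,PK91622
,PK90755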
Shortest code here!
sed -n ':t;/^,/!x;H;n;/^,/{x;$!bt;x;H};x;s/\n//g;p;${x;/^,/!p}' filename
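That one-liner is dense, so here is the same script spread over several lines with my reading of each step as comments (a sketch; behaviour should be identical under GNU sed):

sed -n '
:t
# if the pattern space does not start with ",", swap it into the hold space
/^,/!x
# append the pattern space to the hold space, accumulating the record
H
# fetch the next input line
n
# a continuation line: swap and loop back to :t, unless this is the last
# input line, in which case swap back and stash this final piece too
/^,/{x;$!bt;x;H}
# pull the accumulated record in, remove the embedded newlines, print it
x;s/\n//g;p
# at end of input, print whatever is left unless it is a bare continuation
${x;/^,/!p}
' filename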
This might work for you:
# sed ':a;N;s/\n,/,/;ta;P;D' test.dat | sed 's/,/\n/5;s/\(.*,\).*\n/&\1/'
92831,499,000,0644321
79217,999,000,5417178,PK91622
79217,999,000,5417178,PK90755
Explanation:
This comes in two parts:
The first sed appends the next line and then, if the appended line begins with a comma, deletes the embedded newline (\n) and starts again. If it doesn't, it prints up to the newline, then deletes up to the newline and repeats.
The second sed replaces the 5th comma with a newline, then inserts the first four fields in between the embedded newline and the sixth field.
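To see what each half contributes, with the same made-up test.dat as under the perl answer, the first sed on its own already joins the continuation lines (output shown for GNU sed):

$ sed ':a;N;s/\n,/,/;ta;P;D' test.dat
92831,499,000,0644321
79217,999,000,5417178,PK91622,PK90755

The second sed then fans that joined record back out into the per-PK lines shown above.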
Without special-casing field 3, it's easy:
awk '
!/^,/ { if (NR > 1) print x ; x = $0 }   # new record starts: print the previous one, start collecting
/^,/  { x = x OFS $0 }                   # continuation line: append it to the current record
END   { if (NR) print x }                # print the final record
'
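Fed the same made-up test.dat as above, this version just reassembles each record; note the pieces are joined with the default OFS, a single space, so set OFS="" if you want them run together (join.awk is a hypothetical file holding the script above):

$ awk -f join.awk test.dat
92831,499,000,0644321
79217,999,000,5417178 ,PK91622 ,PK90755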
With it, it's more complex but still not too hard:
awk '
!/^,/ { if (n && n < 3) print x ; x = $0 ; n = 1 }             # new record: print a short previous record, reset the piece count
/^,/  { if (++n > 2) { print x, $0 } else { x = x OFS $0 } }   # pieces 3 and up each get their own line, prefixed by the first two
END   { if (n && n < 3) print x }                              # print a trailing short record
'
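With the special-casing, every piece after the second gets its own output line, prefixed by the first two pieces. For example, with a made-up record that has two such pieces (split.awk being a hypothetical file holding the second script):

$ cat longer.dat
79217,999,000,5417178
,PK91622
,PK90755
,PK88102
$ awk -f split.awk longer.dat
79217,999,000,5417178 ,PK91622 ,PK90755
79217,999,000,5417178 ,PK91622 ,PK88102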
Well, I guess I should have taken a closer look at using records in awk when I was trying to figure this out last night... ten minutes after looking at them I had it working. For anyone interested, here's how I did it: in my original sed script I put an extra newline in front of the beginning of each record, so there's now a blank line separating each one. I then use the following awk command:
awk 'BEGIN {RS = ""; FS = "\n"}          # blank-line-separated records, one field per line
{
    if (NF >= 3)
        for (i = 3; i <= NF; i++)
            print $1,$2,$i               # repeat the first two fields for each remaining field
}'
and it works like a charm, outputting exactly what I wanted!
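For illustration, assuming the reshaped file ends up looking something like this (one field per line, a blank line between records; the exact layout here is my guess), the command repeats the first two fields for every remaining one:

$ cat records.dat
79217
5417178
PK91622
PK90755
$ awk 'BEGIN {RS = ""; FS = "\n"} { if (NF >= 3) for (i = 3; i <= NF; i++) print $1, $2, $i }' records.dat
79217 5417178 PK91622
79217 5417178 PK90755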