How to insert two lines for every data frame using awk?

随声附和 submitted on 2019-12-11 16:16:23

Question


I have repeating data as follows

....
 4 4 4 66 79 169 150 0  40928  40938  40923  40921  40789  40000  40498
 5 4 3 16 22 247 0  40168  40911  40944  40205  40000  40562
 6 4 4 17 154 93 309 0  40930  40919  40903  40917  40852  40000  40419
 7 3 2 233 311 0  40936  40932  40874  40000  40807
....

The data consists of 115 data blocks, and each block has 4000 lines in that format. I would like to put two new lines (the number of lines per block, 4000, followed by an empty line) at the beginning of each data block, so that it looks like this:

4000

 1 4 4 244 263 704 952 0  40936  40930  40934  40921  40820  40000  40570
 2 4 4 215 172 305 33 0  40945  40942  40937  40580  40687  40000  40410
 3 4 4 344 279 377 1945 0  40933  40915  40907  40921  40839  40000  40437
 4 4 4 66 79 169 150 0  40928  40938  40923  40921  40789  40000  40498
...
 3999 2 2 4079 4081 0  40873  40873  40746  40000  40634
 4000 1 1 4080 0  40873  40923  40000  40345
4000

 1 4 4 244 263 704 952 0  40936  40930  40934  40921  40820  40000  40570
 2 4 4 215 172 305 33 0  40945  40942  40937  40580  40687  40000  40410
 3 4 4 344 279 377 1945 0  40933  40915  40907  40921  40839  40000  40437
 4 4 4 66 79 169 150 0  40928  40938  40923  40921  40789  40000  40498
... 

Can I do this with awk or any other unix command?


Answer 1:


A simple one-liner using awk will do the job.

awk 'NR%4000==1{print "4000\n"} {print$0}' file

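What it does:

{print $0} prints every line. NR%4000==1 is true on the first line of each 4000-line block (lines 1, 4001, 8001, ...). When it matches, print "4000\n" outputs 4000 followed by a newline, and print itself appends one more newline, which gives the two new lines: 4000 and an empty line.

NR is the number of records, which is effectively the number of lines read so far.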
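
The same idea can be parameterised so that the block size is not hard-coded. The following is only a minimal sketch, assuming all blocks have the same length; the function name add_block_headers and the variable n are illustrative and not part of the original answer. It tests (NR - 1) % n == 0 instead of NR % n == 1, so that it also behaves for n = 1.

```shell
#!/bin/sh
# add_block_headers N [FILE...]
# Hypothetical helper: print N and an empty line before every block of N lines.
add_block_headers() {
    n=$1
    shift
    # (NR - 1) % n == 0 holds on lines 1, n+1, 2n+1, ...,
    # i.e. on the first line of each block of n lines
    awk -v n="$n" '(NR - 1) % n == 0 { print n; print "" } { print }' "$@"
}

# Example: blocks of 3 lines read from standard input
printf '%s\n' 1 2 3 4 5 6 7 | add_block_headers 3
```

Note that a trailing partial block (the single line 7 in the example) still gets the header n, because the header is emitted when a block starts, before its real length is known.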

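A simple test:

This inserts 4000 and an empty line before every block of 5 lines (here the input is the numbers 1, 2, 3, ... one per line):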

awk 'NR%5==1{print "4000\n"} {print$0}'

output:

4000

1
2
3
4
5
4000

6
7
8
9
10
4000

11
12
13
14
15
4000

16
17
18
19
20
4000



Answer 2:


My solution is more general, since the blocks can be of unequal length, as long as the counter in the first field restarts to mark the beginning of a new block.

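% cat mark_blocks
# a first field smaller than the previous one marks a new block:
# print the buffered block, preceded by its line count and an empty line
$1<count { print count; print "";
           for(i=1;i<=count;i++) print l[i]; }
# executed for each line: buffer it under its index and remember the index
         { l[$1] = $0; count=$1}
# print the last block at end of input
END      { print count; print "";
           for(i=1;i<=count;i++) print l[i]; }
% awk -f mark_blocks your_data > marked_data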

How it works is simple: awk accumulates the lines of a block in memory, and when it reaches a new block or EOF it prints the header lines followed by the accumulated data.

The (modest) trick is that the output action must come before the rule that runs for every line, because that rule overwrites count and the first stored line of the block.
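
To see the buffering approach on blocks of unequal length, here is a small self-contained sketch. It is a variation on the script above, not the author's exact code: the wrapper name mark_blocks and the awk function emit_block are illustrative, the comparison uses <= rather than < so that a one-line block followed by another block is still detected, and empty input produces no output. It assumes every line starts with its 1-based index within its block.

```shell
#!/bin/sh
# mark_blocks [FILE...]  (hypothetical wrapper name)
mark_blocks() {
    awk '
    # print the buffered block: its line count, an empty line, then its lines
    function emit_block(   i) {
        print count
        print ""
        for (i = 1; i <= count; i++) print l[i]
    }
    # the counter did not increase, so the previous block is complete
    $1 + 0 <= count { emit_block() }
    # executed for each line: remember the index and buffer the line under it
    { count = $1 + 0; l[count] = $0 }
    # emit the last block, unless the input was empty
    END { if (NR > 0) emit_block() }
    ' "$@"
}

# Example: a block of 3 lines followed by a block of 2 lines
printf ' %s\n' '1 a' '2 b' '3 c' '1 d' '2 e' | mark_blocks
```

Because a block is only printed once its end has been seen, the header carries the real length of each block (3 and then 2 in the example), at the cost of holding one block in memory.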




Answer 3:


You can do it all in bash:

cat "$FILE" | ( countmax=4000; count=$countmax; while IFS= read -r lin; do if [ "$count" -eq "$countmax" ]; then count=0; printf '%s\n\n' "$countmax"; fi; printf '%s\n' "$lin"; count=$((count+1)); done )

Here we assume you are reading the data from $FILE. All we do is read the file and pipe it into our little bash script.

The script reads lines one by one (with while IFS= read -r lin, which keeps leading whitespace and backslashes intact) and increments the counter count for each line. At the start, and whenever count reaches countmax (set to 4000), it prints the 2 lines you asked for and resets the counter.
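
For the fixed-size variants, the result can be sanity-checked by confirming that every group of n + 2 output lines starts with n and an empty line. The following is only a sketch under that assumption; the function name check_headers is hypothetical, and it checks the header positions only, not the data lines themselves.

```shell
#!/bin/sh
# check_headers N [FILE...]
# Hypothetical checker: exit status 0 if every group of N + 2 lines
# begins with a line containing exactly N followed by an empty line.
check_headers() {
    n=$1
    shift
    awk -v n="$n" '
    # position of the current line inside its group of n + 2 lines
    { pos = (NR - 1) % (n + 2) }
    # first line of a group must be exactly the block size (string comparison)
    pos == 0 && $0 != (n "") { bad = 1 }
    # second line of a group must be empty
    pos == 1 && $0 != ""     { bad = 1 }
    END { exit bad + 0 }
    ' "$@"
}

# Example: two well-formed groups for n = 2
printf '%s\n' 2 '' a b 2 '' c d | check_headers 2 && echo "headers OK"
```

For the data in the question, 115 blocks of 4000 lines, the marked file should also contain 115 * 4002 = 460230 lines in total.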



Source: https://stackoverflow.com/questions/26320609/how-to-insert-two-lines-for-every-data-frame-using-awk
