Reorder lines near the beginning of a huge text file (>20G)

狂风中的少年 posted on 2019-12-05 00:57:00

Question


I am a vim user and can use some basic awk or bash commands. I have a text (VCF) file larger than 20G, and I want to move line #69 to just below line #66:

$ less huge.vcf
...
    66 ##contig=<ID=9,length=124595110>
    67 ##contig=<ID=X,length=171031299>
    68 ##contig=<ID=Y,length=91744698>
    69 ##contig=<ID=MT,length=16299>
...

What I want is:

...
    66 ##contig=<ID=9,length=124595110>
    67 ##contig=<ID=MT,length=16299>
    68 ##contig=<ID=X,length=171031299>
    69 ##contig=<ID=Y,length=91744698>
...

I tried to open and edit it using vim (with the LargeFile plugin installed), but that still did not work very well.


Answer 1:


The easy approach is to copy the section you want to edit out of your file, modify it in-place, then copy it back in.

# extract the first hundred lines
head -n 100 huge.txt >start.txt

# modify that extracted subset
vim start.txt

# copy that section back into the beginning of larger file
dd if=start.txt of=huge.txt conv=notrunc

Note that this only works if your edits don't change the size of the section being modified. That is to say, make sure that start.txt has exactly the same size in bytes after being modified as it had before. Reordering whole lines, as in the question, keeps the size unchanged.




Answer 2:


Here's an awk version:

$ awk 'NR>=3 && NR<=4{b=b (b==""?"":ORS) $0;next}1;NR==5 {print b}' file
...
    66 ##contig=<ID=9,length=124595110>
    69 ##contig=<ID=MT,length=16299>
    67 ##contig=<ID=X,length=171031299>
    68 ##contig=<ID=Y,length=91744698>
...

You need to change the line numbers in the code, though: 3 -> 67, 4 -> 68 and 5 -> 69, and redirect the output to a new file. If you'd like it to edit the file in place, use -i inplace with GNU awk. Either way, awk reads and rewrites the entire file.
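With the line numbers from the question substituted, the command might look like the sketch below. The wrapper function reorder_to_new_file and the output file name are hypothetical, chosen only for illustration.

```shell
# Sketch only: reorder_to_new_file is a hypothetical wrapper around the awk one-liner.
reorder_to_new_file() {
    local src=$1 dst=$2
    awk '
        NR >= 67 && NR <= 68 { buf = buf $0 ORS; next }  # hold lines 67-68 back
        { print }                                        # pass every other line through
        NR == 69 { printf "%s", buf }                    # release the held lines right after line 69
    ' "$src" > "$dst"
}

# Example call (file names are placeholders):
# reorder_to_new_file huge.vcf huge.reordered.vcf
```

This needs enough free disk space for a second copy of the file, since the output is written in full before the original can be replaced.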



Source: https://stackoverflow.com/questions/43987897/reorder-lines-near-the-beginning-of-a-huge-text-file-20g
