Split a large text (xyz) database into x equal parts

旧时难觅i 2021-01-22 21:04

I want to split a large text database (~10 million lines). I can use a command like

$ sed -i -e '4 s/(dB)//' -e '4 s/Best\ unit/Best_Unit/' -e '1,3 d' '/


        
1 Answer
  • 2021-01-22 21:47

    If you want to just prepend the first line of the original file to all but the first of the splits, you can do something like:

    $ cat > a
    h
    1
    2
    3
    4
    5
    6
    7
    ^D
    $ split -l 3 a 1
    $ ls
    1aa 1ab 1ac a
    $ mv 1aa 21aa
    $ for i in 1*; do head -n1 21aa | cat - "$i" > "2$i"; done
    $ for i in 21*; do echo ---- $i; cat $i; done
    ---- 21aa
    h
    1
    2
    ---- 21ab
    h
    3
    4
    5
    ---- 21ac
    h
    6
    7
    

    Obviously, the first file will have one line fewer than the middle parts, and the last part might be shorter too, but if that's not a problem, this should work just fine. If your header has more lines, change head -n1 to head -nX, where X is the number of header lines.
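
If the uneven first part is a problem, one possible variation is to split only the data lines and then prepend the header to every piece. The following is a minimal sketch, not a tested production script: the function name `split_with_header`, the `.tmp_` and `.part_` suffixes, and the choice of 3 parts are all made up for illustration, and it assumes the file has at least one data line after the header.

```shell
#!/bin/sh
# split_with_header FILE PARTS [HEADER_LINES]
# Hypothetical helper: writes FILE.part_aa, FILE.part_ab, ... next to FILE,
# each starting with the first HEADER_LINES lines of FILE (default: 1).
split_with_header() {
    file=$1
    parts=$2
    header_lines=${3:-1}

    # Count the data lines (everything after the header).
    total=$(( $(wc -l < "$file") ))
    data=$(( total - header_lines ))
    # Round up so that no more than $parts pieces are produced.
    per_part=$(( (data + parts - 1) / parts ))

    # Split only the data lines into temporary pieces.
    tail -n +"$(( header_lines + 1 ))" "$file" | split -l "$per_part" - "$file.tmp_"

    # Prepend the header to every piece, then drop the temporary piece.
    for piece in "$file".tmp_*; do
        { head -n "$header_lines" "$file"; cat "$piece"; } > "$file.part_${piece##*.tmp_}"
        rm "$piece"
    done
}

# Demo with the same data as in the session above, in a scratch directory.
cd "$(mktemp -d)"
printf '%s\n' h 1 2 3 4 5 6 7 > a
split_with_header a 3
for f in a.part_*; do echo "---- $f"; cat "$f"; done
```

With the sample data, every part gets the header plus up to three data lines, and only the last part (`a.part_ac`) is shorter. Note that the parts are equal by line count, not by byte size.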

    Hope this helps.
