Split big file in unix based on size and pattern

情深已故 2021-01-25 19:02

I have a huge file, 45 GB. I want to split it into 4 parts. I can do this by: split --bytes=12G inputfile.

The problem is that this disturbs the pattern of the file. T

1 Answer
  • 2021-01-25 19:26

    Doing it with sed would be pretty difficult, since you have no easy way of keeping track of the characters read so far. It would be easier with awk:

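    BEGIN {
        fileno = 1
        limit = 100000              # approximate size of each output file
    }
    # Switch to the next output file at the first "Inspecting" line seen
    # after the current file has grown past the limit.
    size > limit && /Inspecting/ {
        close("out" fileno)         # release the finished file
        fileno++
        size = 0
    }
    {
        size += length() + 1        # +1 for the newline that length() omits
        print > ("out" fileno)      # parentheses keep the redirection portable
    }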
    

    Adjust the size according to your needs. Some awk implementations might have problems handling very large numbers. If yours does, it might be better to keep track of the number of lines read so far instead of the number of characters.
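
The line-counting variant could look roughly like the sketch below. The `split_by_lines` function, its argument order, and the `maxlines` and `prefix` variables are hypothetical names chosen for illustration. Only the `Inspecting` pattern comes from the script above. It assumes every record starts with a line matching that pattern.

```shell
# Hypothetical helper: split FILE into PREFIX1, PREFIX2, ...
# A new part starts at the first line matching "Inspecting" once the
# current part already holds more than MAXLINES lines.
# Usage: split_by_lines FILE MAXLINES PREFIX
split_by_lines() {
    awk -v maxlines="$2" -v prefix="$3" '
        BEGIN { fileno = 1 }
        # Roll over only on a pattern line, so records stay intact.
        count > maxlines && /Inspecting/ {
            close(prefix fileno)
            fileno++
            count = 0
        }
        {
            count++
            print > (prefix fileno)
        }
    ' "$1"
}
```

For example, `split_by_lines inputfile 1000000 out` would write `out1`, `out2`, and so on. The parts will not be equal in size, because each one only ends where the next `Inspecting` line begins.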
