I have a huge file, 45 GB, that I want to split into 4 parts. I can do this with:

split --bytes=12G inputfile

The problem is that this disturbs the pattern of the file.
Doing it with sed would be pretty difficult, since there is no easy way to keep track of the number of characters read so far. It is easier with awk:
BEGIN {
    fileno = 1
}
{
    # length($0) does not count the trailing newline, hence the + 1
    size += length($0) + 1
}
size > 100000 && /Inspecting/ {
    # start a new output file at the next record boundary
    fileno++
    size = 0
}
{
    # parentheses avoid ambiguous parsing of the redirection target
    print $0 > ("out" fileno)
}
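To try it out, the script can be run on a small sample; in this sketch the hardcoded 100000 is replaced by an assumed maxsize variable passed via -v, and the tiny threshold and sample data are only for demonstration:

```shell
# Demo input: records are introduced by lines containing "Inspecting"
printf 'Inspecting a\ndata 1\nInspecting b\ndata 2\nInspecting c\ndata 3\n' > sample.txt

# Same logic as the script above, with the threshold passed in via -v
awk -v maxsize=20 '
    BEGIN { fileno = 1 }
    { size += length($0) + 1 }                          # +1 for the newline
    size > maxsize && /Inspecting/ { fileno++; size = 0 }
    { print $0 > ("out" fileno) }
' sample.txt
```

With this input, out1 ends up holding the first record and out2 the rest, because the rollover only happens on an "Inspecting" line.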
Adjust the size according to your needs. Note that some awk implementations may have problems handling very large numbers (awk typically stores numbers as double-precision floats, which are only exact up to 2^53). For this reason it might be better to keep track of the number of lines read so far instead.
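A sketch of that line-counting variant, under the same assumptions (records introduced by "Inspecting" lines; the maxlines value and demo input are placeholders to adapt):

```shell
# Demo input with the same record structure
printf 'Inspecting a\nx\nx\nInspecting b\ny\nInspecting c\nz\n' > inputfile

# Start a new output file at the next "Inspecting" line once
# maxlines lines have gone into the current one
awk -v maxlines=3 '
    BEGIN { fileno = 1 }
    lines >= maxlines && /Inspecting/ { fileno++; lines = 0 }
    { lines++; print $0 > ("out" fileno) }
' inputfile
```

Counting lines keeps the numbers small regardless of how wide the lines are, at the cost of the split points being less evenly sized in bytes.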