Merging multiple log files by date including multilines

误落风尘 2021-02-07 13:15

I have several logs containing lines all starting with a timestamp, so that the following works as expected to merge them:

cat myLog1.txt myLog2.txt | sort -n
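To see why a plain sort breaks multi-line entries, here is a small hypothetical example (sample file contents invented for illustration) where one entry continues onto a line without a timestamp:

```shell
# Hypothetical sample logs: the stack-trace continuation line has no timestamp.
printf '12:00:01. start\n12:00:03. error\n  at Foo.bar()\n' > myLog1.txt
printf '12:00:02. other\n' > myLog2.txt

# sort treats every line as an independent record, so the continuation line
# "  at Foo.bar()" (numeric key 0 under -n) is torn away from its entry:
cat myLog1.txt myLog2.txt | sort -n
```

The continuation line no longer follows its `12:00:03.` entry, which is exactly the problem when multi-line entries are involved.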


        
5 Answers
  •  野的像风
    2021-02-07 14:20

    Here's one way to do it in a bash shell, with a simple merge of the files rather than an expensive full re-sort (the log files are already sorted individually). This matters for huge files in the hundreds of megabytes or more, as is often the case with real-world log files.

    This solution assumes that there are no NUL bytes in your logs, which is true for every log file that I've come across, with various character sets.

    The basic idea:

    1. Concatenate each multiline entry into a single line by replacing its inner newlines with NUL, in each input file
    2. Merge the preprocessed files with sort -m
    3. Replace the NULs back to newlines in the merged result

    As the first step is done multiple times, I've given it an alias:

    alias a="awk '{ if (\$0 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\\./)\
    { if (NR == 1) printf \"%s\", \$0; else printf \"\\n%s\", \$0 }\
    else printf \"\\0%s\", \$0 } END { print \"\" }'"
    

    Here's the command that performs all 3 steps:

    sort -m <(a myLog1.txt) <(a myLog2.txt) | tr '\0' '\n'
    

    For more, see https://superuser.com/a/838446/125379
