Calculate time-based metrics (hourly)

感动是毒 2021-01-03 00:24

How would I calculate time-based metrics (hourly average) based on log file data?

To make this clearer, consider a log file in which every entry carries a timestamp and a UId (the sample entries are reproduced in the answers below).

2 answers
  • 2021-01-03 00:50

    Extending Ed Morton's solution:

    Contents of script.awk:

    # Convert "2013-04-03" and "08:54:19,989" into the "YYYY MM DD HH MM SS"
    # string that mktime() expects; the milliseconds are dropped.
    function parse_time(date, time) {
        gsub(/-/, " ", date)
        gsub(/:/, " ", time)
        gsub(/,.*/, "", time)
        return date " " time
    }

    {
        # The GUID is the text between ";gt;" and the next "&".
        guid = gensub(/.*;gt;([^&]+).*/, "\\1", 1)
        if (guid in starttime)
            endtime[guid] = parse_time($1, $2)      # seen before: this is the end
        else
            starttime[guid] = parse_time($1, $2)    # first sighting: this is the start
    }

    END {
        # Report only the GUIDs for which both a start and an end were seen.
        for (guid in endtime) {
            diff = mktime(endtime[guid]) - mktime(starttime[guid])
            printf "%s %dh:%dm:%ds\n", guid, diff/(60*60), diff%(60*60)/60, diff%60
        }
    }
    

    Test (the log lines are reordered so that the two GUIDs interleave):

    $ cat log.file 
    2013-04-03 08:54:19,989 INFO [LOGGER] <?xml version="1.0" encoding="UTF-8" standalone="yes"?><event><body>&amp;lt;UId&amp;gt;904c-be-4e-bbda-3e62&amp;lt;/UId&amp;gt;&amp;lt;</body></event>
    2013-04-03 08:54:34,979 INFO [LOGGER] <?xml version="1.0" encoding="UTF-8" standalone="yes"?><event><body>&amp;lt;UId&amp;gt;edfc-fr-5e-bced-3443&amp;lt;/UId&amp;gt;&amp;lt;</body></event>
    2013-04-03 08:54:39,389 INFO [LOGGER] <?xml version="1.0" encoding="UTF-8" standalone="yes"?><event><body>&amp;lt;UId&amp;gt;904c-be-4e-bbda-3e62&amp;lt;/UId&amp;gt;&amp;lt;</body></event>
    2013-04-03 08:55:19,569 INFO [LOGGER] <?xml version="1.0" encoding="UTF-8" standalone="yes"?><event><body>&amp;lt;UId&amp;gt;edfc-fr-5e-bced-3443&amp;lt;/UId&amp;gt;&amp;lt;</body></event>
    $ awk -f script.awk log.file 
    904c-be-4e-bbda-3e62 0h:0m:20s
    edfc-fr-5e-bced-3443 0h:0m:45s
    
  • 2021-01-03 01:08

    Given your posted input file:

    $ cat file
    2013-04-03 08:54:19,989 INFO [LOGGER] <?xml version="1.0" encoding="UTF-8" standalone="yes"?><event><body>&amp;lt;UId&amp;gt;904c-be-4e-bbda-3e62&amp;lt;/UId&amp;gt;&amp;lt;</body></event>
    2013-04-03 08:54:39,389 INFO [LOGGER] <?xml version="1.0" encoding="UTF-8" standalone="yes"?><event><body>&amp;lt;UId&amp;gt;904c-be-4e-bbda-3e62&amp;lt;/UId&amp;gt;&amp;lt;</body></event>
    2013-04-03 08:54:34,979 INFO [LOGGER] <?xml version="1.0" encoding="UTF-8" standalone="yes"?><event><body>&amp;lt;UId&amp;gt;edfc-fr-5e-bced-3443&amp;lt;/UId&amp;gt;&amp;lt;</body></event>
    2013-04-03 08:55:19,569 INFO [LOGGER] <?xml version="1.0" encoding="UTF-8" standalone="yes"?><event><body>&amp;lt;UId&amp;gt;edfc-fr-5e-bced-3443&amp;lt;/UId&amp;gt;&amp;lt;</body></event>
    

    This GNU awk script (you are using GNU awk since you set RS to a multi-character string in the script you posted in your question)

    $ cat tst.awk
    {
        date = $1
        time = $2
        guid = gensub(/.*;gt;([^&]+).*/, "\\1", 1)   # text between ";gt;" and the next "&"
    
        print guid, date, time
    }
    

    will pull out what I THINK is the information you care about:

    $ gawk -f tst.awk file
    904c-be-4e-bbda-3e62 2013-04-03 08:54:19,989
    904c-be-4e-bbda-3e62 2013-04-03 08:54:39,389
    edfc-fr-5e-bced-3443 2013-04-03 08:54:34,979
    edfc-fr-5e-bced-3443 2013-04-03 08:55:19,569
    

    The rest is simple math, right? And do it in this awk script - don't go piping the awk output to some goofy shell loop!
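
A minimal sketch of that "simple math", assuming the metric wanted is the mean time between the first and second log entry of each UId, bucketed by the hour of the first entry (the question does not spell this out). The names hourly_avg, to_seconds, start, start_hour, total and count are illustrative and do not come from the original posts. It needs an awk that provides mktime(), such as GNU awk, and it is wrapped in a shell function only so that it can be called by name; all of the arithmetic stays inside awk.

```sh
# hourly_avg FILE...: mean start-to-end duration per start hour.
# Assumption: the first line seen for a UId is its start, the second its end.
hourly_avg() {
    awk '
    # "2013-04-03", "08:54:19,989" -> epoch seconds with a millisecond fraction.
    # mktime() is not POSIX; GNU awk provides it.
    function to_seconds(date, time,    t, n) {
        n = split(time, t, ",")            # t[1] = "08:54:19", t[2] = "989"
        gsub(/-/, " ", date)
        gsub(/:/, " ", t[1])
        return mktime(date " " t[1]) + (n > 1 ? t[2] / 1000 : 0)
    }

    # Skip lines that carry no UId.
    !match($0, /;gt;[^&]+/) { next }

    {
        guid = substr($0, RSTART + 4, RLENGTH - 4)   # text after ";gt;"
        now  = to_seconds($1, $2)
        if (guid in start) {
            # Second sighting: close the interval and credit the start hour.
            hour = start_hour[guid]
            total[hour] += now - start[guid]
            count[hour]++
            delete start[guid]
            delete start_hour[guid]
        } else {
            # First sighting: remember when, and in which hour, it began.
            start[guid] = now
            start_hour[guid] = $1 " " substr($2, 1, 2) ":00"
        }
    }

    END {
        # One line per hour: hour, completed pairs, mean duration in seconds.
        for (hour in total)
            printf "%s %d %.3f\n", hour, count[hour], total[hour] / count[hour]
    }
    ' "$@" | sort
}
```

For the four sample lines above, `hourly_avg file` should print `2013-04-03 08:00 2 31.995`: two completed pairs that started in the 08:00 hour, lasting 19.400 s and 44.590 s. Keeping the millisecond fraction is a design choice of this sketch; a whole-second version would simply drop the `t[2] / 1000` term.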
