How would I calculate time-based metrics (hourly average) based on log file data?
To make this clearer, consider a log file that contains entries like the following:
Given your posted input file:
$ cat file
2013-04-03 08:54:19,989 INFO [LOGGER] &amp;lt;UId&amp;gt;904c-be-4e-bbda-3e62&amp;lt;/UId&amp;gt;&amp;lt;
2013-04-03 08:54:39,389 INFO [LOGGER] &amp;lt;UId&amp;gt;904c-be-4e-bbda-3e62&amp;lt;/UId&amp;gt;&amp;lt;
2013-04-03 08:54:34,979 INFO [LOGGER] &amp;lt;UId&amp;gt;edfc-fr-5e-bced-3443&amp;lt;/UId&amp;gt;&amp;lt;
2013-04-03 08:55:19,569 INFO [LOGGER] &amp;lt;UId&amp;gt;edfc-fr-5e-bced-3443&amp;lt;/UId&amp;gt;&amp;lt;
This GNU awk script (you are using GNU awk since you set RS to a multi-character string in the script you posted in your question)
$ cat tst.awk
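{
    date = $1    # e.g. 2013-04-03
    time = $2    # e.g. 08:54:19,989
    # The UId value is the run of non-"&" characters right after the first ";gt;"
    guid = gensub(/.*;gt;([^&]+).*/,"\\1","")
    print guid, date, time
}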
will pull out what I THINK is the information you care about:
$ gawk -f tst.awk file
904c-be-4e-bbda-3e62 2013-04-03 08:54:19,989
904c-be-4e-bbda-3e62 2013-04-03 08:54:39,389
edfc-fr-5e-bced-3443 2013-04-03 08:54:34,979
edfc-fr-5e-bced-3443 2013-04-03 08:55:19,569
The rest is simple math, right? And do it in this awk script - don't go piping the awk output to some goofy shell loop!
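In case it helps, here is one way that "simple math" might look. This is only a sketch under a couple of assumptions that are NOT stated in the question: that each UId shows up exactly twice (a start entry, then an end entry), that the metric you want is the elapsed time between those two entries, and that a pair never spans more than 24 hours. The function name hourly_avg is made up for illustration. It sticks to POSIX awk features, so it should run under gawk as well, and it splits on the tag delimiters whether they appear literally or entity-escaped.

```shell
# Sketch: mean elapsed time per UId pair, grouped by the hour of the first entry.
# ASSUMPTIONS (not from the original post): every UId appears exactly twice
# (start, then end) and the two entries are less than 24 hours apart.
hourly_avg() {
    awk '
    # Convert "HH:MM:SS,mmm" to seconds since midnight.
    function secs(t,    p) {
        split(t, p, /[:,]/)
        return p[1] * 3600 + p[2] * 60 + p[3] + p[4] / 1000
    }
    {
        # Split on tag delimiters, either literal (< >) or escaped
        # (&amp;lt; &amp;gt;); the UId value is then the third piece.
        split($0, part, /&amp;[lg]t;|[<>]/)
        guid = part[3]
        now  = secs($2)

        if (guid in start) {
            # Second sighting of this UId: close the pair.
            elapsed = now - start[guid]
            if (elapsed < 0) elapsed += 86400    # pair crossed midnight
            hour = hourOf[guid]
            sum[hour] += elapsed
            cnt[hour]++
            delete start[guid]
            delete hourOf[guid]
        } else {
            # First sighting: remember the start time and its hour bucket.
            start[guid]  = now
            hourOf[guid] = $1 " " substr($2, 1, 2)
        }
    }
    END {
        # One line per hour: date, hour, completed pairs, mean seconds.
        for (hour in sum)
            printf "%s %d %.3f\n", hour, cnt[hour], sum[hour] / cnt[hour]
    }
    ' "$@"
}
```

Run against the sample file it reports one bucket, since both pairs start in the 08 hour (19.400 s and 44.590 s, so a mean of 31.995 s):

$ hourly_avg file
2013-04-03 08 2 31.995

The order of the hour lines is whatever awk's "for (hour in sum)" yields, so sort the output afterwards if the order matters. If your metric is something other than elapsed time per UId, only the block that closes the pair should need to change.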