I have ELK configured for collecting data offline. The log files look something like this:
Info 2015-08-15 09:33:37,522 User 3 connected
Info 2015-08-15 10:
If you're loading Elasticsearch through Logstash, you can use the aggregate filter to assemble discrete log lines that are correlated. The idea is to notice when a long-lasting event starts (e.g. "User connected") and to end it when the disconnected event for the same user flies by (your grok pattern might differ, but the principle is the same):
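filter {
  # Written for the legacy event['field'] API; on Logstash 5 and later
  # use event.get('field') and event.set('field', value) instead.
  grok {
    match => [ "message", "%{LOGLEVEL:loglevel} %{TIMESTAMP_ISO8601:timestamp} %{WORD:entity} %{INT:userid} %{WORD:status}" ]
  }
  # grok yields a string, so parse it into @timestamp before doing arithmetic.
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS" ]
  }
  if [status] == "connected" {
    # Remember when this user's session started (epoch seconds as a float).
    aggregate {
      task_id => "%{userid}"
      code => "map['started'] = event['@timestamp'].to_f"
      map_action => "create"
    }
  }
  if [status] == "disconnected" {
    # Compute the session length in milliseconds and close the task.
    aggregate {
      task_id => "%{userid}"
      code => "event['duration'] = ((event['@timestamp'].to_f - map['started']) * 1000).round"
      map_action => "update"
      end_of_task => true
      # The timeout is expressed in seconds: 86400 is one day.
      timeout => 86400
    }
  }
}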
You'll end up with an additional field called duration (in milliseconds), which you can then plot in Kibana to show the average connection time.
Also note that I'm giving an arbitrary timeout of one day, which might or might not suit your case. Feel free to play around.
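If you want to check the average outside Kibana, the same number can be requested from Elasticsearch with an avg aggregation. The following is only a minimal sketch that builds the search body: the aggregation name avg_duration is a made-up label, and it assumes the duration field was indexed as a number.

```python
import json


def average_duration_query(field="duration"):
    """Build a search body that returns only the average of `field`."""
    return {
        "size": 0,  # skip the individual hits, keep only the aggregation result
        "aggs": {
            # "avg_duration" is an arbitrary name chosen for this example
            "avg_duration": {"avg": {"field": field}},
        },
    }


# Print the JSON body that would be sent to the _search endpoint of your index.
print(json.dumps(average_duration_query(), indent=2))
```

The printed JSON can be sent to the _search endpoint of whichever index holds the events.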
One of the downsides of Elasticsearch is that it isn't a relational database, so cross-referencing is much more limited. There's a nice blog post about it: Managing Relations inside Elasticsearch
The long and short of it is that there isn't a way to query this sort of thing directly. Each event is a discrete document within your index, and there isn't any sort of cross-reference between documents.
So you have to do it the hard way. At the simplest level, that means querying all the connect events, querying all the disconnect events, and correlating them yourself with a scripting language.
You can make this slightly easier by pre-filtering your logs with a grok filter that adds fields to each indexed document:
if [type] == "syslog" and [message] =~ /connected/ {
  grok {
    match => [ "message", "User %{POSINT:userid} %{WORD:conn}" ]
  }
}
This adds a userid field and a conn field (the latter containing "connected" or "disconnected").
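To see what that pattern pulls out of a line, here is a rough Python equivalent. It is only an approximation of grok's POSINT and WORD patterns using a plain regular expression, and the helper name extract_fields is made up for this sketch.

```python
import re

# Approximation of the grok pattern "User %{POSINT:userid} %{WORD:conn}":
# POSINT is treated as a positive integer and WORD as a run of word characters.
USER_EVENT = re.compile(r"User (?P<userid>[1-9][0-9]*) (?P<conn>\w+)")


def extract_fields(message):
    """Return the userid and conn fields of a log line, or None if it does not match."""
    match = USER_EVENT.search(message)  # unanchored search, like grok
    if match is None:
        return None
    # grok captures are strings unless a type is requested, so keep them as strings
    return {"userid": match.group("userid"), "conn": match.group("conn")}


print(extract_fields("Info 2015-08-15 09:33:37,522 User 3 connected"))
```

Lines that do not contain a "User <number> <word>" sequence simply yield None, which mirrors grok leaving the fields unset.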
Even with these fields, you will still have to fetch the events and correlate them yourself using your favourite scripting language (at which point you could just as well do the search and filter in the script too).
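As a rough illustration of that correlation step, the sketch below pairs connect and disconnect events per user and averages the durations. It assumes the events have already been fetched into a list of dicts with timestamp, userid and conn keys. The function names and all sample events are hypothetical (only the first timestamp comes from the log excerpt in the question).

```python
from datetime import datetime


def parse_ts(text):
    # Layout taken from the sample log line, e.g. "2015-08-15 09:33:37,522"
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def connection_durations(events):
    """Pair each 'connected' event with the next 'disconnected' event of the same user.

    Returns a list of (userid, duration_in_ms) tuples in order of disconnection.
    """
    open_sessions = {}  # userid -> time of the pending 'connected' event
    durations = []
    for event in sorted(events, key=lambda e: parse_ts(e["timestamp"])):
        when = parse_ts(event["timestamp"])
        user = event["userid"]
        if event["conn"] == "connected":
            open_sessions[user] = when
        elif event["conn"] == "disconnected" and user in open_sessions:
            started = open_sessions.pop(user)
            durations.append((user, round((when - started).total_seconds() * 1000)))
    return durations


# Hypothetical events standing in for the documents fetched from the index.
events = [
    {"timestamp": "2015-08-15 09:33:37,522", "userid": "3", "conn": "connected"},
    {"timestamp": "2015-08-15 09:34:07,522", "userid": "3", "conn": "disconnected"},
    {"timestamp": "2015-08-15 09:35:00,000", "userid": "7", "conn": "connected"},
    {"timestamp": "2015-08-15 09:35:10,500", "userid": "7", "conn": "disconnected"},
]
durations = connection_durations(events)
average_ms = sum(ms for _, ms in durations) / len(durations)
print(durations, average_ms)
```

A disconnect without a matching connect is ignored here, and a connect that is never closed stays pending. Whether that is the right behaviour depends on your logs.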