How to process multiline log entries with a Logstash filter?

终归单人心 2020-12-14 01:26

Background:

I have a custom-generated log file with the following pattern:

[2014-03-02 17:34:20] - 127.0.0.1|ERROR| E:\\xampp\\htdocs\\test.ph         


        
4 answers
  • 2020-12-14 01:50

    I went through the source code and found that:

    • The multiline filter cancels every event it considers a continuation of a pending event and appends that line to the pending event's message field. Any filters placed after the multiline filter therefore never see those continuation lines.
    • The only event that ever passes the filter is one considered to start a new entry (in my case, a line that starts with [).

    Here is the working code:

    input {
      stdin {}
    }

    filter {
      # Only the first line of a multi-line entry contains "|ERROR|"
      if "|ERROR|" in [message] {
        grok {
          match => ['message', "\[.+\] - %{IP:ip}\|%{LOGLEVEL:loglevel}\| %{PATH:file}\|%{NUMBER:line}\|%{WORD:tag}\|%{GREEDYDATA:content}"]
        }

        mutate {
          # Replace the message field with the content field (so later lines are appended to it)
          replace => [ "message", "%{content}" ]
          # We no longer need this field
          remove_field => ["content"]
        }
      }

      # Nothing passes this filter unless it is a new event ( new [2014-03-02 1.... )
      multiline {
        pattern => "^\["
        what => "previous"
        negate => true
      }

      if "|DEBUG| flush_multi_line" in [message] {
        # We don't need the dummy line, so drop it
        drop {}
      }
    }

    output {
      stdout { debug => true }
    }
    

    Cheers,

    Abdou

  • 2020-12-14 01:54

    Isn't the issue simply the ordering of the filters? Order is very important in Logstash. You don't need an extra line to signal that the multiline entry has finished. Just make sure the multiline filter comes before the grok filter (see below).

    P.S. I've managed to parse a multiline log entry where XML was appended to the end of the log line and spanned multiple lines, and I still got a clean XML object in my equivalent of the content field (named xmlrequest below). Before you say anything about logging XML in logs: I know, it's not ideal, but that's for another debate :)

    filter {
      multiline {
        pattern => "^\["
        what => "previous"
        negate => true
      }

      mutate {
        gsub => ['message', "\n", " "]
      }

      mutate {
        gsub => ['message', "\r", " "]
      }

      grok {
        match => ['message', "\[%{WORD:ONE}\] \[%{WORD:TWO}\] \[%{WORD:THREE}\] %{GREEDYDATA:xmlrequest}"]
      }

      xml {
        source => "xmlrequest"
        remove_field => ["xmlrequest"]
        target => "request"
      }
    }
    
  • 2020-12-14 02:00

    Grok and multiline handling are discussed in this issue: https://logstash.jira.com/browse/LOGSTASH-509

    Simply add "(?m)" at the start of your grok regex and you won't need the mutate step. With "(?m)", "." (and therefore GREEDYDATA) also matches newlines. Example from the issue:

    pattern => "(?m)<%{POSINT:syslog_pri}>(?:%{SPACE})%{GREEDYDATA:message_remainder}"
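
    Applied to the log format from the question, this might look like the sketch below. It is untested and rests on two assumptions: the multiline filter runs first (as the earlier answer suggests), and the grok pattern is the one from the first answer with only "(?m)" added.

    filter {
      # Join continuation lines onto the entry that starts with "["
      multiline {
        pattern => "^\["
        what => "previous"
        negate => true
      }

      grok {
        # (?m) lets GREEDYDATA run past the "\n" characters that multiline inserts
        match => ['message', "(?m)\[.+\] - %{IP:ip}\|%{LOGLEVEL:loglevel}\| %{PATH:file}\|%{NUMBER:line}\|%{WORD:tag}\|%{GREEDYDATA:content}"]
      }
    }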
    
  • 2020-12-14 02:04

    The multiline filter joins the lines with "\n" in the message field. For example:

    "[2014-03-02 17:34:20] - 127.0.0.1|ERROR| E:\\xampp\\htdocs\\test.php|123|subject|The error message goes here ; array (\n  'create' => \n  array (\n    'key1' => 'value1',\n    'key2' => 'value2',\n    'key3' => 'value3'\n  ),\n)"
    

    However, the grok pattern can't match across the "\n" (by default, GREEDYDATA stops at a line break). You therefore need to replace each \n with another character, say, a blank space.

    mutate {
        gsub => ['message', "\n", " "]
    }
    

    Then the grok pattern can parse the message. For example:

     "content" => "The error message goes here ; array (   'create' =>    array (     'key1' => 'value1',     'key2' => 'value2',     'key3' => 'value3'   ), )"
    
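
    Put together for the log format in the question, the full filter order might look like the sketch below. It is untested. The grok pattern is borrowed from the first answer, and handling "\r" and "\n" in one gsub with the character class "[\r\n]+" is my own assumption, not something taken from the answers on this page.

    filter {
      # 1. Join continuation lines onto the entry that starts with "["
      multiline {
        pattern => "^\["
        what => "previous"
        negate => true
      }

      # 2. Replace each run of line breaks with a single space
      mutate {
        gsub => ['message', "[\r\n]+", " "]
      }

      # 3. Parse the now single-line message
      grok {
        match => ['message', "\[.+\] - %{IP:ip}\|%{LOGLEVEL:loglevel}\| %{PATH:file}\|%{NUMBER:line}\|%{WORD:tag}\|%{GREEDYDATA:content}"]
      }
    }

    The trade-off against the "(?m)" approach is that the original line breaks are lost from the content field.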