Correct regular expression for the input log

半世苍凉 提交于 2019-12-11 04:38:42

问题


Input log looks like this, which contains data which are "|" sperated. The data contains id | type | request | response

110000|read|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:web="http://webservices.lookup.sdp.bharti.ibm.com">
<soapenv:Header/>
<soapenv:Bod<web:getLookUpServiceDetails>
<getLookUpService>
<serviceRequester>iOBD</serviceRequester>
<lineOfBusiness>mobility</lineOfBusiness>
<lookupAttribute>
<searchAttrValue>911425152231426</searchAttrValue>
</lookupAttribute>
</getLookUpService>
</web:getLookUpServiceDetails>
</soapenv:Body>
</soapenv:Envelope>|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns:getLookUpServiceDetailsResponse xmlns:ns="http://webservices.lookup.sdp.bharti.ibm.com">
<getLookUpServiceReturn>
<errorInfo>
<ErrorCode/>
<ErrorMessage/>
</errorInfo>
<lookupResponseList>
<mapEntry>
<attributeName>region</attributeName>
<attributeValue>["Micromax"]</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>msisdn</attributeName>
<attributeValue>"Maharashtra"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imei</attributeName>
<attributeValue>"917756870222"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imsi</attributeName>
<attributeValue>"911425152231426"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_vendor</attributeName>
<attributeValue>"404909092353805"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_type</attributeName>
<attributeValue>"E311"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_version</attributeName>
<attributeValue>"1"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>g3</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>mms</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>gprs</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>streaming</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>ota</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>wap</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>os</attributeName>
<attributeValue>"Google"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>version</attributeName>
<attributeValue>"4.4.2"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>camera</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>apn</attributeName>
<attributeValue>""AIRTELGPRS.COM,AIRTELMMS.COM""</attributeValue>
</mapEntry>
</lookupResponseList>
</getLookUpServiceReturn>
</ns:getLookUpServiceDetailsResponse>
</soapenv:Body>
</soapenv:Envelope>
210000|read|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:web="http://webservices.lookup.sdp.bharti.ibm.com">
<soapenv:Header/>
<soapenv:Bod<web:getLookUpServiceDetails>
<getLookUpService>
<serviceRequester>iOBD</serviceRequester>
<lineOfBusiness>mobility</lineOfBusiness>
<lookupAttribute>
<searchAttrValue>911425152231426</searchAttrValue>
</lookupAttribute>
</getLookUpService>
</web:getLookUpServiceDetails>
</soapenv:Body>
</soapenv:Envelope>|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns:getLookUpServiceDetailsResponse xmlns:ns="http://webservices.lookup.sdp.bharti.ibm.com">
<getLookUpServiceReturn>
<errorInfo>
<ErrorCode/>
<ErrorMessage/>
</errorInfo>
<lookupResponseList>
<mapEntry>
<attributeName>region</attributeName>
<attributeValue>["Micromax"]</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>msisdn</attributeName>
<attributeValue>"Maharashtra"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imei</attributeName>
<attributeValue>"917756870222"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imsi</attributeName>
<attributeValue>"911425152231426"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_vendor</attributeName>
<attributeValue>"404909092353805"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_type</attributeName>
<attributeValue>"E311"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_version</attributeName>
<attributeValue>"1"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>g3</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>mms</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>gprs</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>streaming</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>ota</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>wap</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>os</attributeName>
<attributeValue>"Google"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>version</attributeName>
<attributeValue>"4.4.2"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>camera</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>apn</attributeName>
<attributeValue>""AIRTELGPRS.COM,AIRTELMMS.COM""</attributeValue>
</mapEntry>
</lookupResponseList>
</getLookUpServiceReturn>
</ns:getLookUpServiceDetailsResponse>
</soapenv:Body>
</soapenv:Envelope>
340000|read|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:web="http://webservices.lookup.sdp.bharti.ibm.com">
<soapenv:Header/>
<soapenv:Bod<web:getLookUpServiceDetails>
<getLookUpService>
<serviceRequester>iOBD</serviceRequester>
<lineOfBusiness>mobility</lineOfBusiness>
<lookupAttribute>
<searchAttrValue>911425152231426</searchAttrValue>
</lookupAttribute>
</getLookUpService>
</web:getLookUpServiceDetails>
</soapenv:Body>
</soapenv:Envelope>|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns:getLookUpServiceDetailsResponse xmlns:ns="http://webservices.lookup.sdp.bharti.ibm.com">
<getLookUpServiceReturn>
<errorInfo>
<ErrorCode/>
<ErrorMessage/>
</errorInfo>
<lookupResponseList>
<mapEntry>
<attributeName>region</attributeName>
<attributeValue>["Micromax"]</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>msisdn</attributeName>
<attributeValue>"Maharashtra"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imei</attributeName>
<attributeValue>"917756870222"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imsi</attributeName>
<attributeValue>"911425152231426"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_vendor</attributeName>
<attributeValue>"404909092353805"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_type</attributeName>
<attributeValue>"E311"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_version</attributeName>
<attributeValue>"1"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>g3</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>mms</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>gprs</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>streaming</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>ota</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>wap</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>os</attributeName>
<attributeValue>"Google"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>version</attributeName>
<attributeValue>"4.4.2"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>camera</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>apn</attributeName>
<attributeValue>""AIRTELGPRS.COM,AIRTELMMS.COM""</attributeValue>
</mapEntry>
</lookupResponseList>
</getLookUpServiceReturn>
</ns:getLookUpServiceDetailsResponse>
</soapenv:Body>
</soapenv:Envelope>
450000|read|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:web="http://webservices.lookup.sdp.bharti.ibm.com">
<soapenv:Header/>
<soapenv:Bod<web:getLookUpServiceDetails>
<getLookUpService>
<serviceRequester>iOBD</serviceRequester>
<lineOfBusiness>mobility</lineOfBusiness>
<lookupAttribute>
<searchAttrValue>911425152231426</searchAttrValue>
</lookupAttribute>
</getLookUpService>
</web:getLookUpServiceDetails>
</soapenv:Body>
</soapenv:Envelope>|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns:getLookUpServiceDetailsResponse xmlns:ns="http://webservices.lookup.sdp.bharti.ibm.com">
<getLookUpServiceReturn>
<errorInfo>
<ErrorCode/>
<ErrorMessage/>
</errorInfo>
<lookupResponseList>
<mapEntry>
<attributeName>region</attributeName>
<attributeValue>["Micromax"]</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>msisdn</attributeName>
<attributeValue>"Maharashtra"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imei</attributeName>
<attributeValue>"917756870222"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imsi</attributeName>
<attributeValue>"911425152231426"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_vendor</attributeName>
<attributeValue>"404909092353805"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_type</attributeName>
<attributeValue>"E311"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_version</attributeName>
<attributeValue>"1"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>g3</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>mms</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>gprs</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>streaming</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>ota</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>wap</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>os</attributeName>
<attributeValue>"Google"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>version</attributeName>
<attributeValue>"4.4.2"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>camera</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>apn</attributeName>
<attributeValue>""AIRTELGPRS.COM,AIRTELMMS.COM""</attributeValue>
</mapEntry>
</lookupResponseList>
</getLookUpServiceReturn>
</ns:getLookUpServiceDetailsResponse>
</soapenv:Body>
</soapenv:Envelope>
590000|read|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:web="http://webservices.lookup.sdp.bharti.ibm.com">
<soapenv:Header/>
<soapenv:Bod<web:getLookUpServiceDetails>
<getLookUpService>
<serviceRequester>iOBD</serviceRequester>
<lineOfBusiness>mobility</lineOfBusiness>
<lookupAttribute>
<searchAttrValue>911425152231426</searchAttrValue>
</lookupAttribute>
</getLookUpService>
</web:getLookUpServiceDetails>
</soapenv:Body>
</soapenv:Envelope>|<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns:getLookUpServiceDetailsResponse xmlns:ns="http://webservices.lookup.sdp.bharti.ibm.com">
<getLookUpServiceReturn>
<errorInfo>
<ErrorCode/>
<ErrorMessage/>
</errorInfo>
<lookupResponseList>
<mapEntry>
<attributeName>region</attributeName>
<attributeValue>["Micromax"]</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>msisdn</attributeName>
<attributeValue>"Maharashtra"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imei</attributeName>
<attributeValue>"917756870222"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imsi</attributeName>
<attributeValue>"911425152231426"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_vendor</attributeName>
<attributeValue>"404909092353805"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_type</attributeName>
<attributeValue>"E311"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_version</attributeName>
<attributeValue>"1"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>g3</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>mms</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>gprs</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>streaming</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>ota</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>wap</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>os</attributeName>
<attributeValue>"Google"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>version</attributeName>
<attributeValue>"4.4.2"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>camera</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>apn</attributeName>
<attributeValue>""AIRTELGPRS.COM,AIRTELMMS.COM""</attributeValue>
</mapEntry>
</lookupResponseList>
</getLookUpServiceReturn>
</ns:getLookUpServiceDetailsResponse>
</soapenv:Body>
</soapenv:Envelope>

desired output:

1st log:

id- 110000

type-read

request-<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:web="http://webservices.lookup.sdp.bharti.ibm.com">
<soapenv:Header/>
<soapenv:Bod<web:getLookUpServiceDetails>
<getLookUpService>
<serviceRequester>iOBD</serviceRequester>
<lineOfBusiness>mobility</lineOfBusiness>
<lookupAttribute>
<searchAttrValue>911425152231426</searchAttrValue>
</lookupAttribute>
</getLookUpService>
</web:getLookUpServiceDetails>
</soapenv:Body>
</soapenv:Envelope>

response-<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns:getLookUpServiceDetailsResponse xmlns:ns="http://webservices.lookup.sdp.bharti.ibm.com">
<getLookUpServiceReturn>
<errorInfo>
<ErrorCode/>
<ErrorMessage/>
</errorInfo>
<lookupResponseList>
<mapEntry>
<attributeName>region</attributeName>
<attributeValue>["Micromax"]</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>msisdn</attributeName>
<attributeValue>"Maharashtra"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imei</attributeName>
<attributeValue>"917756870222"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>imsi</attributeName>
<attributeValue>"911425152231426"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_vendor</attributeName>
<attributeValue>"404909092353805"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_type</attributeName>
<attributeValue>"E311"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>device_version</attributeName>
<attributeValue>"1"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>g3</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>mms</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>gprs</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>streaming</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>ota</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>wap</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>os</attributeName>
<attributeValue>"Google"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>version</attributeName>
<attributeValue>"4.4.2"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>camera</attributeName>
<attributeValue>"Yes"</attributeValue>
</mapEntry>
<mapEntry>
<attributeName>apn</attributeName>
<attributeValue>""AIRTELGPRS.COM,AIRTELMMS.COM""</attributeValue>
</mapEntry>
</lookupResponseList>
</getLookUpServiceReturn>
</ns:getLookUpServiceDetailsResponse>
</soapenv:Body>
</soapenv:Envelope>

for 2nd log :

id - 210000

type - read

request -

response - 

Similarly for the "n" no. of logs

configuration file used:

input {
  file {
    path => "/opt/test5/practice_new/final_xml.dat"
    start_position => "beginning"
    codec => multiline {
            pattern => "^%{NUMBER:method_id}\|%{DATA:method_type}\|<soapenv:Envelope>"
            negate => true
            what => previous
        }
  }
}
filter {
  grok {
    match => [ "message", "(?m)^(?<method_id>\d+)\|(?<method_type>\w+)\|(?<request><soapenv:Envelope>.*?</soapenv:Envelope>)\|(?<response><soapenv:Envelope>.*?</soapenv:Envelope>)" ]
  }
}

output {
   elasticsearch {
     hosts => "http://localhost:9200"
     index => "final"
  }
stdout {}
}

I tried using the regular expression in Grok but the current one is not working for the input logs.

please help me with the regular expression.


回答1:


The regex you currently are using is (?m)^(?<method_id>\d+)\|(?<method_type>\w+)\|(?<request><soapenv:Envelope>.*?</soapenv:Envelope>)\|(?<response><soapenv:Envelope>.*?</soapenv:Envelope>), and it can only parse out the 3rd and 4th columns if they start with <soapenv:Envelope> and end with </soapenv:Envelope> having | in between.

It seems you need a regex that will identify the 3rd column as a sequence of any chars other than | and the 4th column should gran any number of chars other than | up to the newline followed with 1 or more digits and then |.

Use

(?m)^(?<method_id>\d+)\|(?<method_type>\w+)\|(?<request>[^|]*)\|(?<response>[^|\n]*(?:\n(?!\d+\|)[^|\n]*)*)

See the regex demo.

Details

  • (?m) - the Ruby modifier that makes . match line break chars
  • ^ - start of a line
  • (?<method_id>\d+) - Group "method": one or more digits
  • \| - a pipe char
  • (?<method_type>\w+) - Group "method_type": one or more letters, digits or _
  • \| - a pipe
  • (?<request>[^|]*) - Group "request": any 0+ chars other than |
  • \| - a pipe
  • (?<response>[^|\n]*(?:\n(?!\d+\|)[^|\n]*)*) - Group "response":
    • [^|\n]* - any 0+ chars other than | and LF (newlines)
    • (?:\n(?!\d+\|)[^|\n]*)* - 0+ occurrences of:
      • \n - a newline
      • (?!\d+\|) - not followed with 1+ digits + |
      • [^|\n]* - any 0+ chars other than | and LF (newlines)


来源:https://stackoverflow.com/questions/45861957/correct-regular-expression-for-the-input-log

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!