Filter/grok method on logstash

问题

Supposed I have this log file:

Jan 1 22:54:17 drop   %LOGSOURCE% >eth1 rule: 7; rule_uid: {C1336766-9489-4049-9817-50584D83A245}; src: 70.77.116.190; dst: %DSTIP%; proto: tcp; product: VPN-1 & FireWall-1; service: 445; s_port: 2612;
Jan 1 22:54:22 drop   %LOGSOURCE% >eth1 rule: 7; rule_uid: {C1336766-9489-4049-9817-50584D83A245}; src: 61.164.41.144; dst: %DSTIP%; proto: udp; product: VPN-1 & FireWall-1; service: 5060; s_port: 5069;
Jan 1 22:54:23 drop   %LOGSOURCE% >eth1 rule: 7; rule_uid: {C1336766-9489-4049-9817-50584D83A245}; src: 69.55.245.136; dst: %DSTIP%; proto: tcp; product: VPN-1 & FireWall-1; service: 445; s_port: 2970;
Jan 1 22:54:41 drop   %LOGSOURCE% >eth1 rule: 7; rule_uid: {C1336766-9489-4049-9817-50584D83A245}; src: 95.104.65.30; dst: %DSTIP%; proto: tcp; product: VPN-1 & FireWall-1; service: 445; s_port: 2565;
Jan 1 22:54:43 drop   %LOGSOURCE% >eth1 rule: 7; rule_uid: {C1336766-9489-4049-9817-50584D83A245}; src: 222.186.24.11; dst: %DSTIP%; proto: tcp; product: VPN-1 & FireWall-1; service: 2967; s_port: 6000;
Jan 1 22:54:54 drop   %LOGSOURCE% >eth1 rule: 7; rule_uid: {C1336766-9489-4049-9817-50584D83A245}; src: 74.204.108.202; dst: %DSTIP%; proto: udp; product: VPN-1 & FireWall-1; service: 137; s_port: 53038;
Jan 1 22:55:10 drop   %LOGSOURCE% >eth1 rule: 7; rule_uid: {C1336766-9489-4049-9817-50584D83A245}; src: 71.111.186.26; dst: %DSTIP%; proto: tcp; product: VPN-1 & FireWall-1; service: 445; s_port: 38548;
Jan 1 23:02:56 accept %LOGSOURCE% >eth1 inzone: External; outzone: Local; rule: 3; rule_uid: {723F81EF-75C9-4CBB-8913-0EBB3686E0F7}; service_id: icmp-proto; ICMP: Echo Request; src: 24.188.22.101; dst: %DSTIP%; proto:

What filters/grok method can I implement for them to be separated into different fields? If I were to use semi colon as the separator, it would be different for the last row of data as there are more semi colons than other rows. Should I use a If else statement for it to separate?

回答1:

Looks like a typical use case for grok and kv filter.

First use the grok filter to separate your fields. Put the last part (key value pairs) into one field. Use the grok debugger to find the correct pattern. This might be an approach:

%{CISCOTIMESTAMP:timestamp} %{WORD:action}%{SPACE}%{DATA:logsource} %{DATA:interface} %{GREEDYDATA:kvpairs}

In logstash's config:

grok {
    match => [ 'message', '%{CISCOTIMESTAMP:timestamp} %{WORD:action}%{SPACE}%{DATA:logsource} %{DATA:interface} %{GREEDYDATA:kvpairs}' ]
}

Afterwards use the kv filter to split the key value pairs. Something like this might work:

kv {
    source => "kvpairs" # new field generated by grok before
    field_split => "; " # split fields by semicolon
}

Try it and maybe adjust it a little bit and you should be able to parse all log lines correctly.

来源：https://stackoverflow.com/questions/32042099/filter-grok-method-on-logstash

标签

csv

logstash-grok

logstash-configuration