Question
I'm running the latest ELK stack (6.6) on the deviantony/docker-elk image. I have the following XML file, which I am trying to parse into an ES JSON object:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <ChainId>7290027600007</ChainId>
  <SubChainId>001</SubChainId>
  <StoreId>001</StoreId>
  <BikoretNo>9</BikoretNo>
  <DllVerNo>8.0.1.3</DllVerNo>
</root>
My conf file is:
input {
  file {
    path => "/usr/share/logstash/logs/example1.xml"
    type => "xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "<?xml version"
      negate => true
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    store_xml => false
    xpath => [ "/root/ChainId/text()", "ChainId" ]
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "xml_index"
    manage_template => false
  }
}
My Logstash output is:
logstash_1 | {
logstash_1 |     "@timestamp" => 2019-03-26T06:45:27.941Z,
logstash_1 |     "tags" => [
logstash_1 |         [0] "multiline"
logstash_1 |     ],
logstash_1 |     "host" => "751b3a8bf341",
logstash_1 |     "ChainId" => [],
logstash_1 |     "message" => "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<root>\r\n <ChainId>7290027600007</ChainId>\r\n <SubChainId>001</SubChainId>\r\n <StoreId>001</StoreId>\r\n <BikoretNo>9</BikoretNo>\r\n <DllVerNo>8.0.1.3</DllVerNo>\r\n</root>\r",
logstash_1 |     "path" => "/usr/share/logstash/logs/example1.xml",
logstash_1 |     "@version" => "1",
logstash_1 |     "type" => "xml"
logstash_1 | }
The XML body under message appears as an escaped string containing \r\n, and the ChainId field populated by XPath is an empty array. I tried other XML files as well, with the same results.
Update:
After removing \r\n, I am still not getting the fields parsed by XPath. My output is:
logstash_1 |     "message" => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root> <ChainId>7290027600007</ChainId> <SubChainId>001</SubChainId> <StoreId>001</StoreId> <BikoretNo>9</BikoretNo> <DllVerNo>8.0.1.3</DllVerNo>",
logstash_1 |     "StoreId" => [],
logstash_1 |     "BikoretNo" => [],
logstash_1 |     "ChainId" => [],
logstash_1 |     "type" => "xml",
logstash_1 |     "tags" => [
logstash_1 |         [0] "multiline"
logstash_1 |     ],
logstash_1 |     "@timestamp" => 2019-03-27T20:51:09.575Z,
logstash_1 |     "DllVerNo" => [],
logstash_1 |     "path" => "/usr/share/logstash/logs/example1.xml",
logstash_1 |     "host" => "751b3a8bf341",
logstash_1 |     "SubChainId" => [],
logstash_1 |     "@version" => "1"
logstash_1 | }
Answer 1:
Use the gsub option of the mutate filter to remove the special characters from message.
mutate {
  gsub => [ "message", "[\r\n]", "" ]
}
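To see what this gsub pattern does to the event, here is a minimal Python sketch (not Logstash code) that applies the same character class, [\r\n], to a shortened copy of the message from the question. The function name strip_line_breaks is hypothetical, and Python's re module only stands in for the regular expression engine Logstash uses.

```python
import re


def strip_line_breaks(message: str) -> str:
    """Remove every carriage return and line feed, like gsub with "[\\r\\n]"."""
    return re.sub(r"[\r\n]", "", message)


if __name__ == "__main__":
    # Shortened copy of the multiline event from the question.
    raw = "<root>\r\n <ChainId>7290027600007</ChainId>\r\n</root>\r"
    print(strip_line_breaks(raw))
```

Only the line-break characters are removed, so the leading space before each element is kept and the XML stays well formed.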
Add the target setting to the xml filter to specify where the parsed data is placed.
filter {
  xml {
    source => "message"
    store_xml => false
    target => "root"
  }
}
Here is the complete working Logstash conf file.
input {
  file {
    path => "C:\Users\KZAPAGOL\Desktop\CSV\XMLFile.xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    exclude => "*.gz"
    type => "xml"
    codec => multiline {
      pattern => "<?xml "
      negate => "true"
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    store_xml => false
    target => "root"
    xpath => [
      "/root/ChainId/text()", "ChainId",
      "/root/SubChainId/text()", "SubChainId",
      "/root/StoreId/text()", "StoreId",
      "/root/BikoretNo/text()", "BikoretNo",
      "/root/DllVerNo/text()", "DllVerNo"
    ]
  }
  mutate {
    gsub => [ "message", "[\r\n]", "" ]
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200/"]
    index => "parse_xml"
  }
  stdout {
    codec => rubydebug
  }
}
Output
{
  "_index": "parse_xml",
  "_type": "doc",
  "_id": "vNj4v2kBZ2Q_C9FO94eF",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2019-03-27T16:25:58.379Z",
    "path": "filePath",
    "tags": [
      "multiline"
    ],
    "ChainId": [
      "7290027600007"
    ],
    "BikoretNo": [
      "9"
    ],
    "DllVerNo": [
      "8.0.1.3"
    ],
    "host": "xxxx",
    "@version": "1",
    "SubChainId": [
      "001"
    ],
    "message": "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root> <ChainId>7290027600007</ChainId> <SubChainId>001</SubChainId> <StoreId>001</StoreId> <BikoretNo>9</BikoretNo> <DllVerNo>8.0.1.3</DllVerNo></root>",
    "type": "xml",
    "StoreId": [
      "001"
    ]
  },
  "fields": {
    "@timestamp": [
      "2019-03-27T16:25:58.379Z"
    ]
  },
  "sort": [
    1553703958379
  ]
}
Answer 2:
I tried your configuration and it works in a Windows environment. The same thing happened to me once, and I fixed it by changing the XPath expression.
Try changing the xpath expression to one of the following:
xpath => [ "//*[local-name() = 'ChainId']/text()", "ChainId" ]
OR
xpath => [ "//ChainId/text()", "ChainId" ]
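The local-name() form matters mainly when the document declares a default namespace, because a plain element name then no longer matches. The following Python sketch illustrates that difference under stated assumptions: it uses the standard library's xml.etree.ElementTree rather than the parser Logstash uses, the helper names are hypothetical, and the namespace URI urn:example is made up for the demonstration (the XML in the question declares no namespace).

```python
import xml.etree.ElementTree as ET
from typing import List


def local_name(tag: str) -> str:
    """Drop the "{namespace}" prefix that ElementTree puts in front of a tag."""
    return tag.rsplit("}", 1)[-1]


def find_text_by_plain_name(xml_text: str, name: str) -> List[str]:
    """Match elements by their exact tag, roughly like //ChainId/text()."""
    root = ET.fromstring(xml_text)
    return [el.text or "" for el in root.iter(name)]


def find_text_by_local_name(xml_text: str, name: str) -> List[str]:
    """Match elements while ignoring namespaces, roughly like local-name()."""
    root = ET.fromstring(xml_text)
    return [el.text or "" for el in root.iter() if local_name(el.tag) == name]


if __name__ == "__main__":
    plain = "<root><ChainId>7290027600007</ChainId></root>"
    # Hypothetical variant of the file with a default namespace declared.
    namespaced = '<root xmlns="urn:example"><ChainId>7290027600007</ChainId></root>'
    print(find_text_by_plain_name(plain, "ChainId"))
    print(find_text_by_plain_name(namespaced, "ChainId"))
    print(find_text_by_local_name(namespaced, "ChainId"))
```

If the plain lookup comes back empty while the namespace-agnostic one finds the value, a namespace declaration in the file is a likely cause.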
Answer 3:
My XML files were encoded as UTF-8 with a BOM instead of plain UTF-8. Problem solved!
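A UTF-8 BOM is the three bytes EF BB BF at the very start of a file, so the event text begins with an invisible character before <?xml. As a minimal sketch, assuming Python is available where the files are prepared and that rewriting them before Logstash reads them is acceptable, the following checks for the BOM and removes it. The function names are hypothetical and the script is not part of Logstash.

```python
import codecs
import sys


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


def strip_bom_in_place(path: str) -> bool:
    """Rewrite the file without its BOM; return True if the file was changed."""
    with open(path, "rb") as f:
        data = f.read()
    if not has_utf8_bom(data):
        return False
    with open(path, "wb") as f:
        f.write(strip_utf8_bom(data))
    return True


if __name__ == "__main__":
    # Usage: python strip_bom.py file1.xml file2.xml ...
    for path in sys.argv[1:]:
        changed = strip_bom_in_place(path)
        print(f"{path}: {'BOM removed' if changed else 'no BOM found'}")
```

Saving the file from an editor with the encoding set to UTF-8 without BOM achieves the same result.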
Source: https://stackoverflow.com/questions/55365566/logstash-xml-parse-failed