Question
The text file contains hundreds of entries like this one (the format is an MT940 bank statement):
{1:F01AHHBCH110XXX0000000000}{2:I940X N2}{3:{108:XBS/091502}}{4:
:20:XBS/091202/0001
:25:5887/507004-50
:28C:140/1
:60F:C0914CHF7789,
:61:0912021202D36,80NTRFNONREF//0887-1202-29-941
04392579-0 LUTHY + xxx, ZUR
:86:6034?60LUTHY + xxxx, ZUR vom 01.12.09 um 16:28 Karten-Nr. 2232
2579-0
:62F:C091202CHF52,2
:64:C091302CHF52,2
-}
This should go into an array of hashes like
[{"1"=>"F01AHHBCH110XXX0000000000",
"2"=>"I940X N2",
"3"=>{"108"=>"XBS/091502"},
# etc.
}]
I tried Treetop, but it did not seem to be the right tool: it is geared toward input you want to evaluate or do calculations on, and I just want to extract the information.
grammar Mt940
  rule document
    part1:string spaces [:|/] spaces part2:document
    {
      def eval(env={})
        return part1.eval, part2.eval
      end
    }
    / string
    / '{' spaces document spaces '}' spaces
    {
      def eval(env={})
        return [document.eval]
      end
    }
  end
end
I also tried a regular expression:
matches = str.scan(/\A[{]?([0-9]+)[:]?([^}]*)[}]?\Z/i)
but the recursion makes this difficult.
How can I solve this problem?
Answer 1:
There are several open-source MT940 parsers available in Java and PHP. You can look at their source code and port it to Ruby. If you are on JRuby, you can use the Java parser directly from your Ruby code.
Another option is to use the OFX gem, which parses OFX files. Since your file is in MT940 format, you would first have to convert it to OFX using one of the free converters available. This approach is practical if you are importing in a batch job, for example.
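If neither route fits, the brace nesting in the sample is small enough to read by hand. The following is only a minimal sketch of a recursive-descent reader built on Ruby's standard StringScanner, not a full MT940 parser. The method names parse_blocks, parse_fields and parse_mt940 are made up for illustration. The sketch assumes that values never contain braces, that a repeated block tag marks the start of the next message, and that block 4 consists of :tag:value fields that may run over several lines.

```ruby
require "strscan"

# Reads consecutive "{tag:value}" blocks into a Hash. A value is plain text,
# or a nested Hash when it starts with another "{".
def parse_blocks(scanner)
  blocks = {}
  while scanner.check(/\{(\d+):/)
    tag = scanner[1]
    break if blocks.key?(tag) # assumption: a repeated tag starts the next message
    scanner.scan(/\{\d+:/)
    blocks[tag] = scanner.check(/\{/) ? parse_blocks(scanner) : scanner.scan(/[^{}]*/)
    scanner.scan(/\}/) or raise ArgumentError, "expected '}' at offset #{scanner.pos}"
    scanner.skip(/\s*/)
  end
  blocks
end

# Splits the text of block 4 into [tag, value] pairs. Pairs are used instead
# of a Hash because a tag can occur more than once within one statement.
def parse_fields(text)
  body = text.sub(/\s*-\s*\z/, "") # drop the "-" that closes the block
  body.scan(/^:(\w+):(.*?)(?=^:\w+:|\z)/m).map { |tag, value| [tag, value.strip] }
end

# Returns one Hash per message found in the source string.
def parse_mt940(source)
  scanner = StringScanner.new(source)
  scanner.skip(/\s*/)
  messages = []
  until scanner.eos?
    blocks = parse_blocks(scanner)
    raise ArgumentError, "unexpected input at offset #{scanner.pos}" if blocks.empty?
    blocks["4"] = parse_fields(blocks["4"]) if blocks["4"].is_a?(String)
    messages << blocks
  end
  messages
end
```

Calling parse_mt940(File.read(path)), where path is wherever the statement file lives, would give an array with one hash per message. Blocks 1 and 2 come back as strings, block 3 as a nested hash, and block 4 as an ordered list of [tag, value] pairs, so repeated fields keep their order.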
Reference
MT940 Java parser.
MT940 to OFX Converter 1
MT940 to OFX Converter 2
Source: https://stackoverflow.com/questions/2459292/best-way-to-parse-plain-text-file-with-a-nested-information-structure