How do I convert an XML body to a hash in Ruby?
I have an XML body which I'd like to parse into a hash.
The original question was asked some time ago, but I found a simpler solution than using Nokogiri and searching for specific names in the XML.
Nori.parse(your_xml)
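will parse the XML into a hash, and the keys will have the same names as your XML items. Note that in newer Nori releases (2.0 and later, as far as I can tell) the parser is instance-based, so the call becomes Nori.new.parse(your_xml).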
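To make the "keys match the XML items" idea concrete, here is a minimal sketch of what such a converter does, written with REXML so it needs no extra gem on most Ruby installs. This is not Nori's implementation; `element_to_hash` is a hypothetical helper, the sample XML is trimmed from the one in this thread, and attributes, namespaces and mixed content are ignored.

```ruby
require 'rexml/document'

# Hypothetical helper: recursively turn a REXML element into nested hashes.
# Leaf elements become strings; repeated sibling names are collected into arrays.
def element_to_hash(element)
  children = element.elements.to_a
  return element.text.to_s.strip if children.empty?

  children.each_with_object({}) do |child, result|
    value = element_to_hash(child)
    if result.key?(child.name)
      existing = result[child.name]
      result[child.name] = existing.is_a?(Array) ? existing << value : [existing, value]
    else
      result[child.name] = value
    end
  end
end

sample = <<~XML
  <TimesInMyDAY>
    <TIME_DATA>
      <StartTime>2010-11-10T09:00:00</StartTime>
      <EndTime>2010-11-10T09:20:00</EndTime>
    </TIME_DATA>
    <TIME_DATA>
      <StartTime>2010-11-10T09:20:00</StartTime>
      <EndTime>2010-11-10T09:40:00</EndTime>
    </TIME_DATA>
  </TimesInMyDAY>
XML

root = REXML::Document.new(sample).root
converted = { root.name => element_to_hash(root) }
p converted
```

The values stay as strings here, which is also what you should expect to post-process when a gem does the conversion for you.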
If you don't mind using a gem, crack does a pretty good job at this.
Crack does the XML to hash processing, then you can loop over the resulting hash to normalize the datetimes.
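The normalization loop could look roughly like the sketch below. The `parsed` hash is written out by hand and only assumes the general shape a converter would return for this XML (string keys, an array for the repeated element); `normalize_times` is a hypothetical helper, not part of Crack.

```ruby
require 'time'

# Hypothetical helper: walk nested hashes and arrays, parsing the string
# values stored under the given keys into Time objects.
def normalize_times(node, keys)
  case node
  when Hash
    node.each_with_object({}) do |(key, value), out|
      out[key] =
        if keys.include?(key) && value.is_a?(String)
          Time.parse(value) # no zone in the string, so the local zone is used
        else
          normalize_times(value, keys)
        end
    end
  when Array
    node.map { |item| normalize_times(item, keys) }
  else
    node
  end
end

# Assumed shape of the converted XML (hand-written for illustration).
parsed = {
  "TimesInMyDAY" => {
    "TIME_DATA" => [
      { "StartTime" => "2010-11-10T09:00:00", "EndTime" => "2010-11-10T09:20:00" },
      { "StartTime" => "2010-11-10T09:20:00", "EndTime" => "2010-11-10T09:40:00" }
    ]
  }
}

normalized = normalize_times(parsed, %w[StartTime EndTime])
```

In a Rails app you would likely follow `Time.parse` with `in_time_zone` to move each value into the user's zone.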
Edit: Using REXML, you could try the following (it should be close to working, but I do not have access to a terminal, so it may need some tweaking):
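require 'rexml/document'
require 'time' # provides Time.parse
# in_time_zone comes from ActiveSupport, and current_user from your application.
arr = []
doc = REXML::Document.new(xml)
REXML::XPath.each(doc, "//TIME_DATA") do |el|
  # Paths are relative to el; a leading // would search the whole document again.
  # The variables cannot be called `end`, which is a reserved word in Ruby.
  start_time = REXML::XPath.first(el, "StartTime").text
  end_time = REXML::XPath.first(el, "EndTime").text
  arr.push({:start_time => Time.parse(start_time).in_time_zone(current_user.time_zone), :end_time => Time.parse(end_time).in_time_zone(current_user.time_zone)})
end
hash = { :times_in_my_day => { :time_data => arr } }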
Of course, this assumes the structure is ALWAYS the same, and that the example you posted was not contrived for simplicity's sake (as examples often are).
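If the structure cannot be trusted, one option is to skip entries that are missing a field instead of raising on nil. The following is only a sketch under that assumption: `extract_time_data` is a hypothetical name, the sample XML is made up to include one incomplete entry, and the time zone conversion is left out so it runs without ActiveSupport (it needs Ruby 2.7+ for `filter_map`).

```ruby
require 'rexml/document'
require 'time'

# Hypothetical helper: collect start/end pairs, skipping incomplete entries.
def extract_time_data(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//TIME_DATA").filter_map do |el|
    start_text = el.elements["StartTime"]&.text
    end_text = el.elements["EndTime"]&.text
    next if start_text.nil? || end_text.nil? # incomplete entry, drop it

    { :start_time => Time.parse(start_text), :end_time => Time.parse(end_text) }
  end
end

sample = <<~XML
  <TimesInMyDAY>
    <TIME_DATA>
      <StartTime>2010-11-10T09:00:00</StartTime>
      <EndTime>2010-11-10T09:20:00</EndTime>
    </TIME_DATA>
    <TIME_DATA>
      <StartTime>2010-11-10T09:20:00</StartTime>
    </TIME_DATA>
  </TimesInMyDAY>
XML

times = extract_time_data(sample)
```

Whether to skip, log, or raise on a malformed entry depends on how much you trust the service producing the XML.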
I used to use XML::Simple in Perl because parsing XML using Perl was a PITA.
When I switched to Ruby I ended up using Nokogiri, and found it to be very easy to use for parsing HTML and XML. It's so easy that I think in terms of CSS or XPath selectors and don't miss an XML-to-hash converter.
require 'ap'
require 'date'
require 'time'
require 'nokogiri'
xml = %{
<soap:Body>
  <TimesInMyDAY>
    <TIME_DATA>
      <StartTime>2010-11-10T09:00:00</StartTime>
      <EndTime>2010-11-10T09:20:00</EndTime>
    </TIME_DATA>
    <TIME_DATA>
      <StartTime>2010-11-10T09:20:00</StartTime>
      <EndTime>2010-11-10T09:40:00</EndTime>
    </TIME_DATA>
    <TIME_DATA>
      <StartTime>2010-11-10T09:40:00</StartTime>
      <EndTime>2010-11-10T10:00:00</EndTime>
    </TIME_DATA>
    <TIME_DATA>
      <StartTime>2010-11-10T10:00:00</StartTime>
      <EndTime>2010-11-10T10:20:00</EndTime>
    </TIME_DATA>
    <TIME_DATA>
      <StartTime>2010-11-10T10:40:00</StartTime>
      <EndTime>2010-11-10T11:00:00</EndTime>
    </TIME_DATA>
  </TimesInMyDAY>
</soap:Body>
}
time_data = []
doc = Nokogiri::XML(xml)
doc.search('//TIME_DATA').each do |t|
  start_time = t.at('StartTime').inner_text
  end_time = t.at('EndTime').inner_text
  time_data << {
    :start_time => DateTime.parse(start_time),
    :end_time => Time.parse(end_time)
  }
end
puts time_data.first[:start_time].class
puts time_data.first[:end_time].class
ap time_data[0, 2]
with the output looking like:
DateTime
Time
[
[0] {
:start_time => #<DateTime: 2010-11-10T09:00:00+00:00 (19644087/8,0/1,2299161)>,
:end_time => 2010-11-10 09:20:00 -0700
},
[1] {
:start_time => #<DateTime: 2010-11-10T09:20:00+00:00 (22099598/9,0/1,2299161)>,
:end_time => 2010-11-10 09:40:00 -0700
}
]
The time values are deliberately parsed into DateTime and Time objects to show that either could be used.
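The differing offsets in the output above (+00:00 versus -0700) come from how the two classes treat a timestamp without a zone. A small sketch using only the standard library, with a timestamp taken from the sample XML:

```ruby
require 'date'
require 'time'

stamp = "2010-11-10T09:00:00"

as_datetime = DateTime.parse(stamp) # no zone in the string, so the offset is +00:00
as_time = Time.parse(stamp)         # no zone in the string, so the local zone is assumed

# The two classes convert into each other when the rest of the code needs one type.
time_from_datetime = as_datetime.to_time
datetime_from_time = as_time.to_datetime

puts as_datetime.offset       # offset as a fraction of a day
puts as_time.utc_offset       # offset in seconds, depends on the machine's zone
puts time_from_datetime.class
puts datetime_from_time.class
```

If the XML timestamps are really in a known zone, it is safer to attach that zone explicitly than to rely on either default.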
Hash.from_xml(xml)
is a simple way to solve this. It is an ActiveSupport method.
ActiveSupport adds a Hash.from_xml method, which does the conversion in a single call. It is described in another question: https://stackoverflow.com/a/7488299/937595
Example:
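require 'open-uri'
require 'active_support'
require 'active_support/core_ext/hash/conversions' # defines Hash.from_xml outside Rails
remote_xml_file = "https://www.example.com/some_file.xml"
# Kernel#open no longer opens URLs as of Ruby 3.0, so call URI.open instead.
data = Hash.from_xml(URI.open(remote_xml_file))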