I'm trying to parse information from an XML file that contains a lot of elements with repeating names.
Here is an example of the type of file I am trying to parse, conta
I stored OP's XML in a file but duplicated the single record that was provided!
This could be slicker using some additional add-on packages (I would use dplyr and the %>%), but I held back. I do advise using xml2 instead of XML. You can use XPATH expressions to target the nodes of interest.
library(xml2)
x <- read_xml("so.xml")
(elements <- xml_find_all(x, ".//dict/dict/array/dict"))
#> {xml_nodeset (2)}
#> [1] <dict>\n  <key>IE_KEY_80211D_FIRST_CHANNEL</key>\n ...
#> [2] <dict>\n  <key>IE_KEY_80211D_FIRST_CHANNEL</key>\n ...
## isolate the key nodes ... will become variable names
keys <- lapply(elements, xml_find_all, "key")
keys <- lapply(keys, xml_text)
## I advise checking that keys are uniform across the records here!
(keys <- keys[[1]])
#> [1] "IE_KEY_80211D_FIRST_CHANNEL" "IE_KEY_80211D_MAX_POWER"
#> [3] "IE_KEY_80211D_NUM_CHANNELS"
## isolate integer data
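## search within each record found above (`elements`)
integers <- lapply(elements, xml_find_all, "integer")
integers <- lapply(integers, xml_text)
## as.is = TRUE keeps type.convert() from warning on character input in recent R
integers <- lapply(integers, type.convert, as.is = TRUE)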
yay <- as.data.frame(do.call(rbind, integers))
names(yay) <- keys
yay
#>   IE_KEY_80211D_FIRST_CHANNEL IE_KEY_80211D_MAX_POWER
#> 1                           1                      27
#> 2                           1                      27
#>   IE_KEY_80211D_NUM_CHANNELS
#> 1                         11
#> 2                         11
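
The comment above about checking that the keys are uniform across records could be acted on with something like the following. This is only a sketch: the inline XML is a hypothetical stand-in for so.xml (the nesting is assumed from the XPath expression used above, and the `plist` root is a guess), and `keys_uniform` is a name introduced here for illustration.

```r
library(xml2)

# Hypothetical stand-in for so.xml: two identical records, nested to match
# the XPath ".//dict/dict/array/dict". The <plist> root is an assumption.
record <- paste0(
  "<dict>",
  "<key>IE_KEY_80211D_FIRST_CHANNEL</key><integer>1</integer>",
  "<key>IE_KEY_80211D_MAX_POWER</key><integer>27</integer>",
  "<key>IE_KEY_80211D_NUM_CHANNELS</key><integer>11</integer>",
  "</dict>"
)
doc <- read_xml(paste0(
  "<plist><dict><dict><array>", record, record, "</array></dict></dict></plist>"
))

elements <- xml_find_all(doc, ".//dict/dict/array/dict")

# One character vector of key names per record
keys <- lapply(elements, function(el) xml_text(xml_find_all(el, "key")))

# TRUE only if every record has the same keys, in the same order, as the first
keys_uniform <- all(vapply(keys, identical, logical(1), keys[[1]]))

# Stop early rather than silently mislabel columns
stopifnot(keys_uniform)
```

If the check fails, taking the names from the first record alone would mislabel columns, so the records would need to be matched key by key instead of being bound together with rbind.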