Parsing XML file with known structure and repeating elements

孤街浪徒 2021-01-29 03:11

I'm trying to parse information from an XML file that contains many elements with repeating names.

Here is an example of the type of file I am trying to parse, conta

2 answers

说谎 (OP)

2021-01-29 03:47

I stored OP's XML in a file but duplicated the single record that was provided!

This could be slicker with some add-on packages (I would use dplyr and the %>% pipe), but I held back. I do advise using xml2 instead of XML. You can use XPath expressions to target the nodes of interest.

library(xml2)

x <- read_xml("so.xml")
(elements <- xml_find_all(x, ".//dict/dict/array/dict"))
#> {xml_nodeset (2)}
#> [1] <dict>\n                    <key>IE_KEY_80211D_FIRST_CHANNEL</key>\n ...
#> [2] <dict>\n                    <key>IE_KEY_80211D_FIRST_CHANNEL</key>\n ...

## isolate the key nodes ... will become variable names
keys <- lapply(elements, xml_find_all, "key")
keys <- lapply(keys, xml_text)
## I advise checking that keys are uniform across the records here!
(keys <- keys[[1]])
#> [1] "IE_KEY_80211D_FIRST_CHANNEL" "IE_KEY_80211D_MAX_POWER"    
#> [3] "IE_KEY_80211D_NUM_CHANNELS"

## isolate integer data
integers <- lapply(elements, xml_find_all, "integer")
integers <- lapply(integers, xml_text)
integers <- lapply(integers, type.convert, as.is = TRUE)
yay <- as.data.frame(do.call(rbind, integers))
names(yay) <- keys
yay
#>   IE_KEY_80211D_FIRST_CHANNEL IE_KEY_80211D_MAX_POWER
#> 1                           1                      27
#> 2                           1                      27
#>   IE_KEY_80211D_NUM_CHANNELS
#> 1                         11
#> 2                         11
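The advice above to check that the keys are uniform across records can be sketched as follows. This is only a minimal illustration, not part of the original answer: the inline XML is a hypothetical stand-in for the OP's file (whose full structure is not shown), and the names `record`, `doc`, `keys_uniform`, `values` and `records` are made up for the example.

```r
library(xml2)

# Hypothetical stand-in for the OP's file: two identical records under
# dict/dict/array, each a <dict> of alternating <key> and <integer> children.
record <- "<dict>
  <key>IE_KEY_80211D_FIRST_CHANNEL</key><integer>1</integer>
  <key>IE_KEY_80211D_MAX_POWER</key><integer>27</integer>
  <key>IE_KEY_80211D_NUM_CHANNELS</key><integer>11</integer>
</dict>"
doc <- paste0(
  "<plist><dict><key>outer</key><dict><key>inner</key><array>",
  record, record,
  "</array></dict></dict></plist>"
)

x <- read_xml(doc)
elements <- xml_find_all(x, ".//dict/dict/array/dict")

# Collect the key names of every record as a character vector.
keys <- lapply(elements, function(el) xml_text(xml_find_all(el, "key")))

# TRUE only if every record has the same keys in the same order.
keys_uniform <- all(vapply(keys, identical, logical(1), keys[[1]]))
stopifnot(keys_uniform)

# Only after the check passes is it safe to reuse the first record's keys
# as column names for all rows.
values <- lapply(elements, function(el) {
  type.convert(xml_text(xml_find_all(el, "integer")), as.is = TRUE)
})
records <- as.data.frame(do.call(rbind, values))
names(records) <- keys[[1]]
```

If a record were missing a key or listed its keys in a different order, `stopifnot()` would halt before the column names are assigned, rather than silently mislabeling the values.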
