Write xml-object to disk

烈酒焚心 submitted on 2019-11-28 10:32:18

xml2 objects hold external pointers, which become invalid when you serialize the objects naively. The package provides the functions xml_serialize() and xml_unserialize() to handle this for you. Unfortunately, the API is slightly cumbersome because base::serialize() and base::unserialize() assume an open connection.


library(xml2)

x <- read_xml("<foo>
              <bar>text <baz id = 'a' /></bar>
              <bar>2</bar>
              <baz id = 'b' />
              </foo>")

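# Helper: write an xml2 object to a temporary file, then read it back
roundtrip <- function(obj) {
  tf <- tempfile()
  con <- file(tf, "wb")            # open a binary connection for writing
  on.exit(unlink(tf))              # delete the temporary file on exit

  xml_serialize(obj, con)
  close(con)                       # close so the data is flushed to disk
  con <- file(tf, "rb")            # reopen the same file for reading
  on.exit(close(con), add = TRUE)  # close the read connection on exit too
  xml_unserialize(con)
}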
x
#> {xml_document}
#> <foo>
#> [1] <bar>text <baz id="a"/></bar>
#> [2] <bar>2</bar>
#> [3] <baz id="b"/>
(y <- roundtrip(x))
#> {xml_document}
#> <foo>
#> [1] <bar>text <baz id="a"/></bar>
#> [2] <bar>2</bar>
#> [3] <baz id="b"/>

identical(x, y)
#> [1] FALSE
all.equal(x, y)
#> [1] TRUE
xml_children(y)
#> {xml_nodeset (3)}
#> [1] <bar>text <baz id="a"/></bar>
#> [2] <bar>2</bar>
#> [3] <baz id="b"/>
as_list(y)
#> $bar
#> $bar[[1]]
#> [1] "text "
#> 
#> $bar$baz
#> list()
#> attr(,"id")
#> [1] "a"
#> 
#> 
#> $bar
#> $bar[[1]]
#> [1] "2"
#> 
#> 
#> $baz
#> list()
#> attr(,"id")
#> [1] "b"
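
To keep the object on disk between sessions rather than in a temporary file, the round trip above can be split into two steps. The following is only a minimal sketch: save_xml() and load_xml() are hypothetical helper names, not part of xml2, and the file path is an arbitrary choice.

```r
library(xml2)

# Hypothetical helper (not part of xml2): serialize an xml2 object to `path`
save_xml <- function(obj, path) {
  con <- file(path, "wb")
  on.exit(close(con))     # always close the connection, even on error
  xml_serialize(obj, con)
  invisible(path)
}

# Hypothetical helper (not part of xml2): read the object back from `path`
load_xml <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  xml_unserialize(con)
}

doc <- read_xml("<foo><bar>2</bar><baz id='b'/></foo>")
path <- tempfile(fileext = ".rds")  # replace with a permanent path as needed
save_xml(doc, path)
doc2 <- load_xml(path)
```

Here doc2 is a fresh xml_document with valid pointers, so functions such as xml_children() and as_list() work on it as usual.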

As for the second part of your question, I would seriously consider using XPath expressions to extract the desired data, even if you have to rewrite code.
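
As a rough illustration of the XPath approach, here is a sketch against the sample document from above. The original question's data is not shown here, so the expressions are assumptions chosen to fit this sample, not the exact queries you would need.

```r
library(xml2)

x <- read_xml("<foo>
              <bar>text <baz id = 'a' /></bar>
              <bar>2</bar>
              <baz id = 'b' />
              </foo>")

# Every <baz> element anywhere in the document, in document order
baz_nodes <- xml_find_all(x, "//baz")
ids <- xml_attr(baz_nodes, "id")

# Text of the second <bar> child of the root
second_bar <- xml_text(xml_find_first(x, "/foo/bar[2]"))

# Only the <baz> elements that are direct children of <foo>
top_ids <- xml_attr(xml_find_all(x, "/foo/baz"), "id")
```

Compared with walking the nested list returned by as_list(), each query states directly which nodes it wants, so it does not depend on the position of elements in the list structure.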
