Write xml-object to disk

南笙 2020-12-06 08:36

I have a large number of XML files that I need to process. To do that, I want to read the files and save the resulting list of objects to disk. I

1 answer
  • 2020-12-06 08:52

    xml2 objects hold external pointers that become invalid when you serialize them naively. The package provides the functions xml_serialize() and xml_unserialize() to handle this for you. Unfortunately, the API is slightly cumbersome because the underlying base::serialize() and base::unserialize() assume an open connection.


    library(xml2)
    
    x <- read_xml("<foo>
                  <bar>text <baz id = 'a' /></bar>
                  <bar>2</bar>
                  <baz id = 'b' />
                  </foo>")
    
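    # Serialize an xml2 object to a temporary file, then read it back.
    roundtrip <- function(obj) {
      tf <- tempfile()
      con <- file(tf, "wb")                # open a binary connection for writing
      on.exit(unlink(tf))                  # delete the temporary file on exit
    
      xml_serialize(obj, con)
      close(con)
      con <- file(tf, "rb")                # reopen the same file for reading
      on.exit(close(con), add = TRUE)
      xml_unserialize(con)
    }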
    x
    #> {xml_document}
    #> <foo>
    #> [1] <bar>text <baz id="a"/></bar>
    #> [2] <bar>2</bar>
    #> [3] <baz id="b"/>
    (y <- roundtrip(x))
    #> {xml_document}
    #> <foo>
    #> [1] <bar>text <baz id="a"/></bar>
    #> [2] <bar>2</bar>
    #> [3] <baz id="b"/>
    
    identical(x, y)
    #> [1] FALSE
    all.equal(x, y)
    #> [1] TRUE
    xml_children(y)
    #> {xml_nodeset (3)}
    #> [1] <bar>text <baz id="a"/></bar>
    #> [2] <bar>2</bar>
    #> [3] <baz id="b"/>
    as_list(y)
    #> $bar
    #> $bar[[1]]
    #> [1] "text "
    #> 
    #> $bar$baz
    #> list()
    #> attr(,"id")
    #> [1] "a"
    #> 
    #> 
    #> $bar
    #> $bar[[1]]
    #> [1] "2"
    #> 
    #> 
    #> $baz
    #> list()
    #> attr(,"id")
    #> [1] "b"
    
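For a whole list of documents, as in the question, one alternative that avoids connections is to store the XML text rather than the xml2 objects. The following is only a sketch under that assumption: `docs`, `doc_strings`, and `rds_path` are hypothetical names, and the two small documents stand in for the real files. It relies on `as.character()` turning a document into a string and on `read_xml()` accepting literal XML.

```r
library(xml2)

# Hypothetical stand-ins for documents read from many files.
docs <- list(
  read_xml("<foo><bar>1</bar></foo>"),
  read_xml("<foo><bar>2</bar></foo>")
)

# Convert each document to its XML text. Character vectors contain no
# external pointers, so they survive saveRDS()/readRDS().
doc_strings <- vapply(docs, as.character, character(1))

rds_path <- tempfile(fileext = ".rds")  # hypothetical output path
saveRDS(doc_strings, rds_path)

# Later (possibly in a new session): re-parse the stored strings.
restored <- lapply(readRDS(rds_path), read_xml)
xml_text(xml_find_first(restored[[2]], "//bar"))
```

The cost of this approach is that every document is parsed again on load, so it trades some speed for simplicity.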

    As for the second part of your question, I would seriously consider using XPath expressions to extract the desired data, even if that means rewriting some code.
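As a rough illustration of the XPath route, the sketch below pulls plain values out of the example document from above. The variable names `bar_text` and `baz_ids` are made up for this example, and the expressions would need adapting to the real files.

```r
library(xml2)

x <- read_xml("<foo>
  <bar>text <baz id = 'a' /></bar>
  <bar>2</bar>
  <baz id = 'b' />
</foo>")

# Text content of every <bar> element, in document order.
bar_text <- xml_text(xml_find_all(x, "//bar"))

# The id attribute of every <baz> element, wherever it appears.
baz_ids <- xml_attr(xml_find_all(x, "//baz"), "id")

bar_text
baz_ids
```

Because the results are ordinary character vectors, they can be saved with `saveRDS()` directly, without any special serialization of xml2 objects.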
