In R, how to extract two values from an XML file, loop over 5603 files and write to a table

轻奢々 2021-01-06 13:21

As I am rather new to R, I am trying to learn how to extract two values from an XML file and then loop over the 5603 other (small, < 2 KB) XML files in my working directory.

2 Answers
  •  生来不讨喜
    2021-01-06 13:40

    This might work for you. I got rid of the for loop and went with sapply.

    # pattern is a regular expression, not a glob
    xmlfiles <- list.files(pattern = "\\.xml$")
    # swap only the trailing .xml extension for .txt
    txtfiles <- sub("\\.xml$", ".txt", xmlfiles)
    

    txtfiles holds the output file name that corresponds to each input file.

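    library(XML)

    # invisible() suppresses the list of NULLs that sapply would otherwise print
    invisible(sapply(seq_along(xmlfiles), function(i) {

      doc <- xmlParse(xmlfiles[i])
      # take the first node matching each XPath expression
      zipcode <- xmlValue(getNodeSet(doc, "//ZipCode")[[1]])
      amount <- xmlValue(getNodeSet(doc, "//AwardAmount")[[1]])
      DF <- data.frame(zip = zipcode, amount = amount)
      write.table(DF, quote = FALSE, row.names = FALSE, file = txtfiles[i])

    }))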
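
    Since the question asks for a table, it may be more convenient to collect everything into one data frame and write a single file rather than one .txt per input. The following is only a sketch under assumptions: the two sample files, their layout, the helper names first_value and extract_row, and the output name awards.txt are all made up for illustration; only the element names ZipCode and AwardAmount come from the code above.

    ```r
    library(XML)  # assumes the XML package is installed

    # Hypothetical sample files so the sketch runs on its own.
    dir <- tempfile("xml_demo")
    dir.create(dir)
    writeLines(
      "<Award><AwardAmount>1000</AwardAmount><Institution><ZipCode>12345</ZipCode></Institution></Award>",
      file.path(dir, "a.xml"))
    writeLines(
      "<Award><AwardAmount>2500</AwardAmount><Institution><ZipCode>67890</ZipCode></Institution></Award>",
      file.path(dir, "b.xml"))

    # Text of the first node matching an XPath, or NA if nothing matches.
    first_value <- function(doc, xpath) {
      nodes <- getNodeSet(doc, xpath)
      if (length(nodes) == 0) NA_character_ else xmlValue(nodes[[1]])
    }

    # Build one row (file, zip, amount) from a single XML file.
    extract_row <- function(path) {
      doc <- xmlParse(path)
      data.frame(file = basename(path),
                 zip = first_value(doc, "//ZipCode"),
                 amount = first_value(doc, "//AwardAmount"),
                 stringsAsFactors = FALSE)
    }

    xmlfiles <- list.files(dir, pattern = "\\.xml$", full.names = TRUE)
    result <- do.call(rbind, lapply(xmlfiles, extract_row))

    # One combined, tab-separated table instead of one file per input.
    outfile <- file.path(dir, "awards.txt")
    write.table(result, file = outfile, sep = "\t", quote = FALSE, row.names = FALSE)
    ```

    Returning NA for a missing element keeps the loop from stopping on the first incomplete file, which seems worthwhile with several thousand inputs. For your real data, point list.files at your working directory instead of the temporary one.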
    

    Please let me know if you run into any issues when you run it.
