In R, how can I loop over repeated XML nodes, and save text values in a list?

独厮守ぢ 2021-02-10 23:43

I'm working with XML files from clinicaltrials.gov, which have a structure like this:


  ...
  
  ...
  

        
3 Answers
  • 2021-02-11 00:17

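    Here is an example:

    library(XML)
    # Select every <method> node under the analysis lists
    ns <- getNodeSet(xml, '//clinical_results/outcome_list/outcome/analysis_list/analysis/method')
    element_cnt <- length(ns)
    # Extract each node's text and join the values with "|"
    strings <- paste(sapply(ns, xmlValue), collapse = "|")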
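
    If a list is wanted rather than a single "|"-delimited string, one option is to loop over the repeated <outcome> nodes and collect the <method> values for each. This is only a sketch: the inline XML is a hypothetical, cut-down stand-in for a real clinicaltrials.gov file, and the names xml_text and methods_by_outcome are made up for illustration.

```r
library(XML)

# Hypothetical, cut-down document: two outcomes with 2 and 1 analyses
xml_text <- '<clinical_study><clinical_results><outcome_list>
  <outcome><analysis_list>
    <analysis><method>ANOVA</method></analysis>
    <analysis><method>t-test</method></analysis>
  </analysis_list></outcome>
  <outcome><analysis_list>
    <analysis><method>Chi-squared</method></analysis>
  </analysis_list></outcome>
</outcome_list></clinical_results></clinical_study>'

xml <- xmlParse(xml_text, asText = TRUE)

# One node per repeated <outcome> element
outcomes <- getNodeSet(xml, "//clinical_results/outcome_list/outcome")

# For each outcome, evaluate a relative XPath and keep the text values;
# as.character() turns an empty result into character(0)
methods_by_outcome <- lapply(outcomes, function(outcome) {
  as.character(xpathSApply(outcome, "./analysis_list/analysis/method", xmlValue))
})
```

    Each element of methods_by_outcome is a character vector, so outcomes with different numbers of analyses stay separate instead of being pasted together.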
    
  • 2021-02-11 00:20

    This code will put a subset of nodes that correspond to <location> from a clinical trial into a data frame:

    library(XML)
    clinicalTrialUrl <- "http://clinicaltrials.gov/ct2/show/NCT01480479?resultsxml=true"
    xmlDoc <- xmlParse(clinicalTrialUrl, useInternalNodes = TRUE)
    locations <- xmlToDataFrame(getNodeSet(xmlDoc,"//location"))
    

    In this case there are 221 locations. However, xmlToDataFrame assumes a fairly flat structure and lumps subnodes together: for example, everything under <facility> gets concatenated into a single string. To keep those fields separate, the subnodes can be visited individually and put into the data frame one by one.
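
    Going into the subnodes one by one might look roughly like the following. It is a sketch under assumptions: the <name>, <city> and <country> layout below is a simplified, hypothetical version of a <location> record, and first_value is a helper invented for this example, not part of the XML package.

```r
library(XML)

# Hypothetical, simplified <location> records; real files may differ
xml_text <- '<clinical_study>
  <location><facility>
    <name>Site A</name>
    <address><city>Boston</city><country>United States</country></address>
  </facility></location>
  <location><facility>
    <name>Site B</name>
    <address><city>Lyon</city><country>France</country></address>
  </facility></location>
</clinical_study>'

doc <- xmlParse(xml_text, asText = TRUE)
location_nodes <- getNodeSet(doc, "//location")

# Hypothetical helper: text of the first node matching a relative XPath,
# or NA when the subnode is missing
first_value <- function(node, path) {
  hits <- xpathSApply(node, path, xmlValue)
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# One column per subnode instead of one concatenated string
locations <- data.frame(
  name    = sapply(location_nodes, first_value, path = "./facility/name"),
  city    = sapply(location_nodes, first_value, path = "./facility/address/city"),
  country = sapply(location_nodes, first_value, path = "./facility/address/country"),
  stringsAsFactors = FALSE
)
```

    Returning NA for a missing subnode keeps every column the same length, so the rows still line up with the <location> nodes.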

  • 2021-02-11 00:28

    Why not use xpathSApply again to retrieve the locations, as you already did for the titles?

    xpathSApply(xml_doc, "//clinical_study/location" , xmlValue)
    