How to create an R data frame from an XML file

悲&欢浪女 2020-12-09 20:42

I have an XML document. Part of the file looks like this:

-  
     COUNTY  
     County a         


        
1 Answer
  • 2020-12-09 21:32

    Assuming this is the correct taxlots.shp.xml file:

    <attr>  
         <attrlabl>COUNTY</attrlabl>  
         <attrdef>County abbreviation</attrdef>  
         <attrtype>Text</attrtype>  
         <attwidth>1</attwidth>  
         <atnumdec>0</atnumdec>  
        <attrdomv>  
            <edom>  
                <edomv>C</edomv>  
                <edomvd>Clackamas County</edomvd>  
                <edomvds/>  
             </edom>  
            <edom>  
                <edomv>M</edomv>  
                <edomvd>Multnomah County</edomvd>  
                <edomvds/>  
             </edom>  
            <edom>  
                <edomv>W</edomv>  
                <edomvd>Washington County</edomvd>  
                <edomvds/>  
             </edom>  
         </attrdomv>  
     </attr>
    

    You were almost there:

    library(XML)
    doc1 <- xmlParse("taxlots.shp.xml")
    xmlToDataFrame(nodes=getNodeSet(doc1,"//attr"))[c("attrlabl","attrdef","attrtype","attrdomv")]
      attrlabl             attrdef attrtype                                             attrdomv
    1   COUNTY County abbreviation     Text CClackamas CountyMMultnomah CountyWWashington County
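
The run-together value in the last column appears to come from how the text of a node with children is extracted: as far as I know, `xmlValue()` concatenates the text of all descendant nodes by default. A minimal sketch, assuming the `XML` package is installed and using a shortened inline copy of the element (the string below is made up for illustration and is not read from the real file):

```r
library(XML)

# Shortened inline copy of the <attr> element (illustrative, not the real file).
# Written without whitespace between tags so only element text is present.
xml_txt <- paste0(
  "<attr><attrlabl>COUNTY</attrlabl><attrdomv>",
  "<edom><edomv>C</edomv><edomvd>Clackamas County</edomvd></edom>",
  "<edom><edomv>M</edomv><edomvd>Multnomah County</edomvd></edom>",
  "</attrdomv></attr>"
)
demo <- xmlParse(xml_txt, asText = TRUE)

# Text of the whole <attrdomv> node: the text of every descendant, glued together
domv_text <- xmlValue(getNodeSet(demo, "//attrdomv")[[1]])
domv_text
```

Each `<edom>` pair loses its boundaries this way, which is why the codes and descriptions have to be pulled out separately.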
    

    But the last field is not in the format you wanted. Getting there requires a few additional steps:

    step1 <- xmlToDataFrame(nodes=getNodeSet(doc1,"//attrdomv/edom"))
    step1
      edomv            edomvd edomvds
    1     C  Clackamas County        
    2     M  Multnomah County        
    3     W Washington County  
    
    step2 <- paste(paste(step1$edomv, step1$edomvd, sep=" "), collapse="; ")
    step2
    [1] "C Clackamas County; M Multnomah County; W Washington County"
    
    cbind(xmlToDataFrame(nodes= getNodeSet(doc1, "//attr"))[c("attrlabl", "attrdef", "attrtype")],
          attrdomv= step2)
      attrlabl             attrdef attrtype                                                      attrdomv
    1   COUNTY County abbreviation     Text C Clackamas County; M Multnomah County; W Washington County
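
One caveat: `"//attrdomv/edom"` matches every `<edom>` in the document, so if the file contains more than one `<attr>`, the steps above would merge all their domain values into a single string. A possible sketch that builds the string per `<attr>` instead is shown here. It assumes the `XML` package; the second `<attr>` (STATE) and the helper names `first_value` and `collapse_domain` are invented for illustration and do not come from the original file.

```r
library(XML)

# Inline sample with two <attr> elements; the STATE entry is made up
# purely to show why values must be collected per <attr>.
xml_txt <- paste0(
  "<root>",
  "<attr><attrlabl>COUNTY</attrlabl><attrdef>County abbreviation</attrdef>",
  "<attrtype>Text</attrtype><attrdomv>",
  "<edom><edomv>C</edomv><edomvd>Clackamas County</edomvd></edom>",
  "<edom><edomv>M</edomv><edomvd>Multnomah County</edomvd></edom>",
  "</attrdomv></attr>",
  "<attr><attrlabl>STATE</attrlabl><attrdef>State abbreviation</attrdef>",
  "<attrtype>Text</attrtype><attrdomv>",
  "<edom><edomv>OR</edomv><edomvd>Oregon</edomvd></edom>",
  "</attrdomv></attr>",
  "</root>"
)
doc_demo <- xmlParse(xml_txt, asText = TRUE)
attrs <- getNodeSet(doc_demo, "//attr")

# Hypothetical helper: text of the first match of a relative XPath, or NA
first_value <- function(node, path) {
  hits <- xpathSApply(node, path, xmlValue)
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# Hypothetical helper: "code description" pairs of one <attr>, joined by "; "
# (assumes every <edom> has both an <edomv> and an <edomvd>)
collapse_domain <- function(node) {
  codes <- xpathSApply(node, "./attrdomv/edom/edomv", xmlValue)
  descs <- xpathSApply(node, "./attrdomv/edom/edomvd", xmlValue)
  paste(paste(codes, descs), collapse = "; ")
}

# One row per <attr>, with its own collapsed domain string
result <- data.frame(
  attrlabl = sapply(attrs, first_value, path = "./attrlabl"),
  attrdef  = sapply(attrs, first_value, path = "./attrdef"),
  attrtype = sapply(attrs, first_value, path = "./attrtype"),
  attrdomv = sapply(attrs, collapse_domain),
  stringsAsFactors = FALSE
)
result
```

The relative paths (`./...`) matter here: a path starting with `//` would search the whole document again even when called on a single node.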
    