Is there a better way to do this XML scraping task in R?

鱼传尺愫 2020-12-12 01:26

I have some XML that looks like:




        
1 Answer
  • 2020-12-12 01:47

    After fixing the XML...

    The way to traverse this with XPath is:

    > library(XML)
    > plist = xmlParse('data.xml')
    > xpathSApply(plist, '/plist/array/dict/dict/string', xmlValue)
    [1] "-27.45433"                           "153.01474"                          
    [3] "-27.45706"                           "153.01239"                          
    [5] "university"                          "Queensland University of Technology"
    [7] "way"                                 "26303436" 
    

    The result is a character vector, so you can index it as usual.
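
    As a rough sketch of what that indexing might look like, the snippet below copies the eight values from the output above into a vector by hand (the name vals is only illustrative) and pulls out single elements and the numeric coordinates:

    ```r
    # Values copied by hand from the xpathSApply() output shown above;
    # in practice this would be the object returned by xpathSApply().
    vals <- c("-27.45433", "153.01474", "-27.45706", "153.01239",
              "university", "Queensland University of Technology",
              "way", "26303436")

    # Pick one element by position.
    uni_name <- vals[6]

    # The coordinates come back as strings, so convert them before doing maths.
    coords <- as.numeric(vals[1:4])

    print(uni_name)
    print(coords)
    ```

    Indexing by position only works while the document keeps the same order of elements, which is one reason a key-based lookup can be more robust.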

    However, if the nodes had attributes, for example <string type="uniname">...</string>, then you could have used the "@" attribute syntax:

    > xpathSApply(plist, '/plist/array/dict/dict/string[@type="uniname"]', xmlValue)
    

    Another way, which might be better for this plist formatting, is:

    > sapply(getNodeSet(plist, '//key[text() = "name"]'), function(x) xmlValue(getSibling(x)))
    [1] "Queensland University of Technology"
    
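
    A self-contained variant of the same idea, for trying it without data.xml. The original XML is not shown here, so the plist fragment below is a made-up stand-in that only mimics the key/string layout implied by the output above. Instead of getSibling(), this sketch uses the XPath following-sibling axis, which does the sibling lookup inside the query itself:

    ```r
    library(XML)

    # Hypothetical plist fragment (an assumption, NOT the asker's real data).
    xml_text <- paste0(
      "<plist><array><dict>",
      "<key>tags</key>",
      "<dict>",
      "<key>amenity</key><string>university</string>",
      "<key>name</key><string>Queensland University of Technology</string>",
      "</dict>",
      "</dict></array></plist>"
    )

    # Parse from a string rather than from a file.
    doc <- xmlParse(xml_text, asText = TRUE)

    # Find the <key> whose text is "name", then take the first <string>
    # element that follows it at the same level.
    uni <- xpathSApply(
      doc,
      '//key[text() = "name"]/following-sibling::string[1]',
      xmlValue
    )

    print(uni)
    ```

    Because the axis selects element nodes only, this form should not be affected by whitespace between a <key> and its <string>.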