xml2

Parsing XML in R: Incorrect namespaces

天涯浪子 submitted on 2019-12-05 05:12:58
I have a bunch of XML files and an R script that reads their content into a data frame. However, I now have some files that I wanted to parse as usual, but something in their namespace definition prevents me from selecting their values with the usual XPath expressions. The XML files look like this:

xml_nons.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <XML>
      <Node>
        <Name>Name 1</Name>
        <Title>Title 1</Title>
        <Date>2015</Date>
      </Node>
    </XML>

And the other:

xml_ns.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <XML xmlns="http://www.nonexistingsite.com">
      <Node>
        <Name>Name 2</Name>
        <Title>Title 2</Title
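A minimal sketch of two ways this kind of default-namespace problem is commonly handled with xml2, assuming the package is installed. The inline document is an illustrative stand-in modeled on xml_ns.xml: it contains only the elements visible in the excerpt, closed off so that it parses, and is not the asker's actual file.

```r
library(xml2)

# Illustrative stand-in for xml_ns.xml (assumption: not the original file).
doc <- read_xml('<XML xmlns="http://www.nonexistingsite.com">
  <Node>
    <Name>Name 2</Name>
    <Title>Title 2</Title>
  </Node>
</XML>')

# An unprefixed XPath matches nothing, because every element
# belongs to the document's default namespace.
plain <- xml_find_all(doc, "//Name")

# Option 1: query with the prefix that xml2 assigns to the
# default namespace (reported by xml_ns() as "d1").
ns <- xml_ns(doc)
name_ns <- xml_text(xml_find_all(doc, "//d1:Name", ns))

# Option 2: strip the default namespace so the same XPath used for
# the namespace-free files works. This modifies doc in place.
xml_ns_strip(doc)
name_stripped <- xml_text(xml_find_all(doc, "//Name"))
```

Option 2 is convenient when the same script has to handle both kinds of file, at the cost of discarding the namespace information.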

Python/R: generate dataframe from XML when not all nodes contain all variables?

荒凉一梦 submitted on 2019-12-04 20:30:02
Question: Consider the following XML example:

    library(xml2)
    myxml <- read_xml('
    <data>
      <obs ID="a">
        <name> John </name>
        <hobby> tennis </hobby>
        <hobby> golf </hobby>
        <skill> python </skill>
      </obs>
      <obs ID="b">
        <name> Robert </name>
        <skill> R </skill>
      </obs>
    </data>
    ')

Here I would like to get an (R or Pandas) dataframe from this XML that contains the columns name and hobby. However, as you can see, there is an alignment problem because hobby is missing in the second node and John has two hobbies. In R, I
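One possible way to sidestep the alignment problem, sketched under the assumption that one row per hobby (with NA where an obs has none) is an acceptable layout; the excerpt is cut off before the asker states the desired shape. The idea is to build a small data frame per obs node and bind them afterwards, so a missing or repeated hobby never has to line up with another node.

```r
library(xml2)

myxml <- read_xml('<data>
  <obs ID="a">
    <name> John </name>
    <hobby> tennis </hobby>
    <hobby> golf </hobby>
    <skill> python </skill>
  </obs>
  <obs ID="b">
    <name> Robert </name>
    <skill> R </skill>
  </obs>
</data>')

obs <- xml_find_all(myxml, "//obs")

# Build one small data frame per <obs> node.
rows <- lapply(obs, function(node) {
  # All hobbies of this node; may be zero, one, or several.
  hobbies <- trimws(xml_text(xml_find_all(node, "./hobby")))
  # Use NA when the node has no <hobby> child at all.
  if (length(hobbies) == 0) hobbies <- NA_character_
  data.frame(
    ID = xml_attr(node, "ID"),
    name = trimws(xml_text(xml_find_first(node, "./name"))),
    hobby = hobbies,  # ID and name are recycled to match
    stringsAsFactors = FALSE
  )
})

df <- do.call(rbind, rows)
```

If a single row per obs is preferred instead, the hobbies could be collapsed into one string with paste(hobbies, collapse = ", ") before building the data frame.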

Parsing large and complicated XML file to data.frame

一笑奈何 submitted on 2019-12-04 19:27:21
So, I have a large XML file with lots of reports. I created the example data below to show the approximate size and structure of the XML:

    x <- "<Report><Agreements><AgreementList /></Agreements><CIP><RecordList><Record><Date>2017-05-26T00:00:00</Date><Grade>2</Grade><ReasonsList><Reason><Code>R</Code><Description>local</Description></Reason></ReasonsList><Score>xxx</Score></Record><Record><Date>2017-04-30T00:00:00</Date><Grade>2</Grade><ReasonsList><Reason><Code>R</Code><Description/></Reason></ReasonsList><Score>xyx</Score></Record></RecordList></CIP><Individual><Contact><Email/></Contact>

R rvest: could not find function “xpath_element”

╄→尐↘猪︶ㄣ submitted on 2019-12-01 04:40:59
I am simply trying to replicate the example of rvest::html_nodes(), yet I encounter an error:

    library(rvest)
    ateam <- read_html("http://www.boxofficemojo.com/movies/?id=ateam.htm")
    html_nodes(ateam, "center")

    Error in do.call(method, list(parsed_selector)) : could not find function "xpath_element"

The same happens if I load packages such as httr, xml2, selectr. I seem to have the latest version of these packages too... In which packages are functions such as xpath_element, xpath_combinedselector located? How do I get it to work? Note that I am running on Ubuntu 16.04, so that code might

R rvest: could not find function “xpath_element”

孤街醉人 submitted on 2019-12-01 02:51:04
Question: I am simply trying to replicate the example of rvest::html_nodes(), yet I encounter an error:

    library(rvest)
    ateam <- read_html("http://www.boxofficemojo.com/movies/?id=ateam.htm")
    html_nodes(ateam, "center")

    Error in do.call(method, list(parsed_selector)) : could not find function "xpath_element"

The same happens if I load packages such as httr, xml2, selectr. I seem to have the latest version of these packages too... In which packages are functions such as xpath_element, xpath