Scraping HTML tables into R data frames using the XML package

野的像风 2020-11-22 07:17

How do I scrape HTML tables using the XML package?

Take, for example, this Wikipedia page on the Brazilian soccer team. I would like to read it in R and get the "li

4 answers
  •  囚心锁ツ
    2020-11-22 08:03

    rvest, together with xml2, is another popular package for parsing HTML web pages.

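    library(rvest)

    theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team"
    # Download and parse the page
    page <- read_html(theurl)
    # Collect every <table> node on the page
    tables <- html_nodes(page, "table")
    # [[4]] extracts a single node, so html_table() returns one data frame
    table1 <- html_table(tables[[4]], fill = TRUE)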
    

    The syntax is easier to use than that of the XML package, and for most web pages rvest provides all of the options one needs.
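
    To try the same steps without a network connection, here is a minimal sketch that parses a small inline HTML string instead of the Wikipedia page. The HTML snippet, the column names, and the variable names (html, page, tables, df) are made up for illustration and are not from the original page.

```r
library(rvest)

# Hypothetical HTML standing in for a downloaded page
html <- '<html><body>
<table>
  <tr><th>Player</th><th>Goals</th></tr>
  <tr><td>A</td><td>10</td></tr>
  <tr><td>B</td><td>7</td></tr>
</table>
</body></html>'

# read_html() accepts literal markup as well as a URL or file path
page <- read_html(html)

# Select all <table> nodes, then convert the first one to a data frame
tables <- html_nodes(page, "table")
df <- html_table(tables[[1]])

# Inspect the column names and types of the result
str(df)
```

    Indexing with [[1]] passes a single node to html_table(), so the result is one data frame rather than a list of data frames. On a real page you would inspect length(tables) first, since the position of the table you want may change when the page is edited.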
