Scraping tables on multiple web pages with rvest in R

耶瑟儿~ 2021-01-03 09:43

I am new to web scraping and am trying to scrape tables on multiple web pages. Here is the site: http://www.baseball-reference.com/teams/MIL/2016.shtml

I am able to

1 answer
  • 2021-01-03 10:17

    One way would be to build a vector of all the URLs you are interested in and then apply a scraping function to each of them with sapply:

    library(rvest)
    
    years <- 1970:2016
    urls <- paste0("http://www.baseball-reference.com/teams/MIL/", years, ".shtml")
    # head(urls)
    
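    # Download one page and extract the team batting table
    get_table <- function(url) {
      url %>%
        read_html() %>%
        html_nodes(xpath = '//*[@id="div_team_batting"]/table[1]') %>%
        html_table()
    }
    
    # Apply the scraper to every URL; element names are taken from the urls vector
    results <- sapply(urls, get_table)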
    

    results should be a list of 47 data.frame objects, each named with the URL (and therefore the year) it represents. Use double brackets to pull out a single table: results[[1]] is the data frame for 1970, and results[[47]] is the one for 2016.
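If you want a single table rather than a list, one option is to rename the list elements by year and stack the data frames. The sketch below is only an illustration: `fake_results` is a hypothetical stand-in for the scraped `results` (with made-up columns and values) so that it runs without network access, and it assumes every yearly table has the same columns.

```r
# Hypothetical stand-in for the scraped list: two tiny data frames named by URL.
# Column names and values are invented for illustration only.
years <- 1970:1971
urls <- paste0("http://www.baseball-reference.com/teams/MIL/", years, ".shtml")
fake_results <- list(
  data.frame(Name = c("A", "B"), G = c(10L, 20L), stringsAsFactors = FALSE),
  data.frame(Name = "C", G = 30L, stringsAsFactors = FALSE)
)
names(fake_results) <- urls

# Rename the elements by year so a table can be looked up as x[["1971"]].
names(fake_results) <- years  # integer years are coerced to character names

# Add a Year column to each table, then stack all tables row-wise.
with_year <- Map(function(df, yr) cbind(Year = yr, df), fake_results, years)
combined <- do.call(rbind, with_year)
rownames(combined) <- NULL
```

With the real data you would replace `fake_results` with `results`. If the column sets differ between years, `rbind` will stop with an error and the tables would need to be aligned first.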
