Using R to scrape the link address of a downloadable file from a web page?

孤独总比滥情好 2021-02-08 00:33

I'm trying to automate a process that involves downloading .zip files from a couple of web pages and extracting the .csvs they contain. The challenge is that the .zip file name

1 Answer
  • 2021-02-08 01:24

    I think you're trying to do too much in a single XPath expression; I'd attack the problem in a sequence of smaller steps:

    library(rvest)
    library(stringr)
    page <- read_html("http://www.acleddata.com/data/realtime-data-2015/")  # read_html() replaces the deprecated html()
    
    page %>%
      html_nodes("a") %>%       # find all links
      html_attr("href") %>%     # get the url
      str_subset("\\.xlsx") %>% # keep links that point to .xlsx files
      .[[1]]                    # look at the first one
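
    Once that gives you the link address, the download-and-extract part of the question can be handled with base R. This is a minimal sketch, assuming the URL found above is stored in `zip_url` and that the archive actually contains .csv files; the object names here are illustrative, not from the original answer:

    # zip_url is assumed to hold the href selected above
    tmp <- tempfile(fileext = ".zip")
    download.file(zip_url, tmp, mode = "wb")    # binary mode so the archive isn't corrupted
    csv_files <- unzip(tmp, exdir = tempdir())  # extract and return the paths of the extracted files
    data <- read.csv(csv_files[[1]])            # read the first extracted .csv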
    