archive.org Wayback Machine API: multiple URLs in one request

柔情痞子 posted on 2021-01-28 04:42:08

Question


Does anybody know how to pass multiple URLs in one request to the Wayback Machine API? Or is that even possible?

I looked all over the internet but couldn't find anything about how to do it.


Answer 1:


One URL, one request, one answer. A list of URLs can still be checked in a loop, though; for example, in R:

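library(jsonlite)

urls <- c("http://onet.pl", "http://wired.com", "http://geocities.com")

# Query the availability endpoint once per URL and stack the
# flattened JSON responses into a single data frame.
ask_wm_api <- function(urls) {
  df <- data.frame()
  for (u in urls) {
    # One HTTP request per URL; fromJSON() downloads and parses the response
    x <- fromJSON(paste0("http://archive.org/wayback/available?url=", u))
    # Nested fields become columns such as archived_snapshots.closest.url
    df <- rbind(df, as.data.frame(x))
  }
  df
}

r <- ask_wm_api(urls)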

The result is a data frame that can easily be exported to CSV or processed further in R:

r$archived_snapshots.closest.url
[1] http://web.archive.org/web/20180511050915/https://www.onet.pl/  
[2] http://web.archive.org/web/20180510143400/https://www.wired.com/
[3] http://web.archive.org/web/20180511013018/http://@geocities.com/
3 Levels: http://web.archive.org/web/20180511050915/https://www.onet.pl/ ...
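
The loop above relies on every response having the same shape. As a hedged sketch (not part of the original answer), the variant below picks out a fixed set of fields so that a URL with no archived snapshot yields a row of NA values instead of a mismatched data frame. The helper names wm_request_url, wm_row and ask_wm_api_safe, the output column names, and the file name snapshots.csv are all made up for illustration; only the endpoint and the jsonlite call come from the answer.

```r
# Build the availability-API request for one target URL (hypothetical helper).
wm_request_url <- function(u) {
  paste0("http://archive.org/wayback/available?url=", u)
}

# Turn one parsed response (the list returned by fromJSON) into a one-row
# data frame with fixed columns. `closest` is NULL when the response holds
# no snapshot, in which case the snapshot fields become NA.
wm_row <- function(x) {
  closest <- x$archived_snapshots$closest
  pick <- function(field) {
    value <- closest[[field]]
    if (is.null(value)) NA_character_ else as.character(value)
  }
  data.frame(
    url = x$url,
    snapshot_url = pick("url"),
    timestamp = pick("timestamp"),
    stringsAsFactors = FALSE
  )
}

# Still one request per URL; only the row-building step differs.
ask_wm_api_safe <- function(urls) {
  rows <- lapply(urls, function(u) wm_row(jsonlite::fromJSON(wm_request_url(u))))
  do.call(rbind, rows)
}

# Usage (requires network access and the jsonlite package):
# r <- ask_wm_api_safe(c("http://onet.pl", "http://wired.com"))
# write.csv(r, "snapshots.csv", row.names = FALSE)
```

Because each row has the same three columns, rbind() does not depend on what a particular response happens to contain, and write.csv() from base R covers the CSV export.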

Want more data? Try the Wayback CDX Server API.
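
As a rough sketch of what that could look like (again, not from the original answer): the CDX server accepts a url parameter, and with output=json it returns an array of rows whose first row holds the column names. The helper names cdx_request_url and cdx_to_df and the default limit of 5 are assumptions chosen for illustration. This is still one request per URL.

```r
# Build a CDX query for one target URL (hypothetical helper).
# `limit` caps the number of capture rows returned.
cdx_request_url <- function(u, limit = 5) {
  paste0("http://web.archive.org/cdx/search/cdx?url=", u,
         "&output=json&limit=", limit)
}

# Convert the parsed JSON (a character matrix whose first row is the
# header) into a data frame. An empty result gives an empty data frame.
cdx_to_df <- function(m) {
  if (length(m) == 0) return(data.frame())
  df <- as.data.frame(m[-1, , drop = FALSE], stringsAsFactors = FALSE)
  names(df) <- m[1, ]
  df
}

# Usage (requires network access and the jsonlite package):
# captures <- cdx_to_df(jsonlite::fromJSON(cdx_request_url("wired.com")))
# captures$timestamp
```

Unlike the availability endpoint, which reports only the closest snapshot, each CDX row describes one capture, so a single URL can return many rows.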



Source: https://stackoverflow.com/questions/31286096/archive-org-wayback-machine-api-multiple-urls-at-one-request
