R rvest retrieve empty table

喜欢而已 提交于 2021-01-27 21:53:27

问题


I'm trying two strategies to get data from a web table:

library(tidyverse)  
library(rvest)                        

webpage <- read_html('https://markets.cboe.com/us/equities/market_statistics/book/')
data <- html_table(webpage, fill=TRUE)
data[[2]]

''

library("httr")
library("XML")

URL <- 'https://markets.cboe.com/us/equities/market_statistics/book/'
temp <- tempfile(fileext = ".html")
GET(url = URL, user_agent("Mozilla/5.0"), write_disk(temp))

df <- readHTMLTable(temp)
df <- df[[2]]

Both of them are returning an empty table.


回答1:


Values are retrieved dynamically from another endpoint you can find in the network tab when refreshing your url. You need to add a referer header for the server to return json containing the table data.

library(httr)

headers = c('Referer'='https://markets.cboe.com/us/equities/market_statistics/book/')
d <- content(httr::GET('https://markets.cboe.com/json/bzx/book/FIT', httr::add_headers(.headers=headers)))
print(d$data)


来源:https://stackoverflow.com/questions/58684947/r-rvest-retrieve-empty-table

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!