Question
I am trying to scrape the table from this URL: "https://hutdb.net/17/players". I have spent a lot of time learning rvest and using SelectorGadget, but whenever I run my code I get the same empty result, character(0), instead of the table contents.
library(rvest)
library(magrittr)
url <- read_html("https://hutdb.net/17/players")
table <- url %>%
  html_nodes("td") %>%
  html_text()
Any help would be appreciated.
Answer 1:
The data is loaded dynamically, so it is not present in the HTML that read_html() downloads, which is why html_nodes("td") finds nothing. However, looking at the "Network" tab in Chrome DevTools, for instance, we can find a nicely formatted JSON response at https://hutdb.net/ajax/stats.php?year=17&page=0&selected=OVR&sort=DESC
library(jsonlite)
dat <- fromJSON("https://hutdb.net/ajax/stats.php?year=17&page=0&selected=OVR&sort=DESC")
Output looks like:
#    results aOVR    id League Year Card Team            Player Position Type Shoots  HGT
#  1    6308 6308  <NA>   <NA> <NA> <NA> <NA>              <NA>     <NA> <NA>   <NA> <NA>
#  2    <NA> 2030 11782    NHL   17  MOV  OTT     Erik Karlsson       RD  OFD  Right  6'0
#  3    <NA> 2060 11785    NHL   17  MOV  TBL     Victor Hedman       LD  TWD   Left  6'6
#  4    <NA> 2008 11791    NHL   17  MOV  CHI      Patrick Kane       RW  SNP   Left 5'11
#  5    <NA> 2058 13845    NHL   17  SCE  ANA      Ryan Getzlaf        C  PWF  Right  6'4
#  6    <NA> 2074 11824    NHL   17  MOV  BOS     Brad Marchand       LW  TWF   Left  5'9
#  7    <NA> 2008 11829    NHL   17  MOV  EDM    Connor McDavid        C  PLY   Left  6'2
#  8    <NA> 2048 11840    NHL   17  MOV  WSH Nicklas Backstrom        C  PLY   Left  6'1
#  9    <NA> 2058 11841    NHL   17  MOV  PIT     Sidney Crosby        C  PLY   Left 5'11
# 10    <NA> 2065 13644    NHL   17 TOTY  WPG      Patrik Laine       RW  TWF  Right  6'3
# 11    <NA> 2008 13645    NHL   17 TOTY  EDM    Connor McDavid        C  PLY   Left  6'2
# 12    <NA> 2039 13680    NHL   17 TOTY  LAK      Drew Doughty       RD  TWD  Right  6'1
# 13    <NA> 2063 13689    NHL   17 TOTY  BOS  Patrice Bergeron        C  TWF  Right  6'2
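In the output above, the first row carries only the results count and no player, and the URL contains a page parameter. A minimal sketch of how one might drop that row and collect several pages follows. The helper names (stats_url, drop_summary_rows, fetch_pages) are hypothetical, and it is only an assumption that page counts up from 0 and that every page comes back as a data frame with a Player column.

```r
# Build the endpoint URL for one page. The parameter names are copied from
# the URL in the answer; that `page` counts up from 0 is an assumption.
stats_url <- function(page, year = 17, selected = "OVR", sort = "DESC") {
  sprintf(
    "https://hutdb.net/ajax/stats.php?year=%d&page=%d&selected=%s&sort=%s",
    as.integer(year), as.integer(page), selected, sort
  )
}

# Keep only rows that name a player; this removes the row that holds
# nothing but the `results` count.
drop_summary_rows <- function(df) {
  df[!is.na(df$Player), , drop = FALSE]
}

# Hypothetical helper: download the given pages and stack them into one
# data frame. Needs network access and the jsonlite package.
fetch_pages <- function(pages) {
  page_list <- lapply(pages, function(p) {
    drop_summary_rows(jsonlite::fromJSON(stats_url(p)))
  })
  jsonlite::rbind_pages(page_list)
}

# Example call (not run here):
# players <- fetch_pages(0:2)
```

jsonlite::rbind_pages() is used rather than rbind() so that pages with slightly different columns can still be combined. If the site changes its endpoint or response shape, the filter on Player would need adjusting.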
Source: https://stackoverflow.com/questions/44002324/scraping-a-table-from-a-website-using-r-rvest-or-vba-if-possible