rvest function html_nodes returns {xml_nodeset (0)}

Posted by 余生长醉 on 2021-02-10 06:13:05

Question


I am trying to scrape data from the following website:

http://stats.nba.com/game/0041700404/playbyplay/

I'd like to create a table that includes the date of the game, the scores throughout the game, and the team names

I am using the following code:

library(rvest)

game1 <- read_html("http://stats.nba.com/game/0041700404/playbyplay/")

#Extracts the Date
html_nodes(game1, xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "game-summary-team--vtm", " " ))]//*[contains(concat( " ", @class, " " ), concat( " ", "game-summary-team__lineup", " " ))]')

#Extracts the Score
html_nodes(game1, xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "status", " " ))]//*[contains(concat( " ", @class, " " ), concat( " ", "score", " " ))]')

#Extracts the Team names
html_nodes(game1, xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "game-summary-team__name", " " ))]//a')

Unfortunately, I get the following output:

{xml_nodeset (0)}
{xml_nodeset (0)}
{xml_nodeset (0)}

I have seen a bunch of questions and answers to this problem but none of them seem to help.


Answer 1:


Unfortunately, rvest does not play well with pages whose content is generated dynamically by JavaScript. It only sees the HTML the server sends, so it works best with static HTML pages.
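To see why the node set comes back empty, here is a minimal sketch using a made-up page. The markup, the `app` id and the team name are stand-ins, not the real NBA page. rvest parses the HTML as delivered and never runs the script, so the element the script would create is not there to be matched.

```r
library(rvest)

# Hypothetical stand-in for a JavaScript-rendered page: the <span> only
# exists after a browser has executed the script.
page_source <- '<html><body>
  <div id="app"></div>
  <script>
    var s = document.createElement("span");
    s.className = "team-full";
    s.textContent = "Example Team";
    document.getElementById("app").appendChild(s);
  </script>
</body></html>'

doc <- read_html(page_source)

# The script is never executed, so no <span class="team-full"> is present
nodes <- html_nodes(doc, xpath = "//span[@class='team-full']")
length(nodes)
```

The same selector would match in a real browser, which is why a browser-driving tool is needed for pages like this.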

I would suggest taking a look at RSelenium. I eventually got something out of the page using rsDriver.

Code Sample:

library(RSelenium)
rD <- rsDriver() # starts a Chrome browser; wait for the necessary files to download
remDr <- rD$client
# no need for remDr$open(); the browser should already be open
remDr$navigate("http://stats.nba.com/game/0041700404/playbyplay/")

# findElement() returns the first matching element only
teams <- remDr$findElement(using = "xpath", "//span[@class='team-full']")
teams$getElementText()[[1]]
# and so on...

remDr$close()
# stop the selenium server
rD[["server"]]$stop()
# If you forget to stop the server, it is stopped when the rsDriver
# object is removed and garbage collected:
# rm(rD)
# gc()

and so on...
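Once the browser has rendered the page, one option is to hand the rendered HTML back to rvest and reuse ordinary XPath selectors. The sketch below assumes that approach. `extract_text` is a hypothetical helper, and the inline HTML with its placeholder team names is a stand-in for what `remDr$getPageSource()[[1]]` would return from a live session.

```r
library(rvest)

# Hypothetical helper: parse rendered HTML (a character string) and
# return the trimmed text of every node matching the XPath expression.
extract_text <- function(rendered_html, xpath) {
  doc <- read_html(rendered_html)
  html_text(html_nodes(doc, xpath = xpath), trim = TRUE)
}

# With a live RSelenium session this would be:
# rendered <- remDr$getPageSource()[[1]]

# Offline stand-in so the sketch runs without a browser (placeholder names)
rendered <- '<html><body>
  <span class="team-full"> Example Team A </span>
  <span class="team-full"> Example Team B </span>
</body></html>'

team_names <- extract_text(rendered, "//span[@class='team-full']")
team_names
```

Unlike `findElement()`, this returns every match at once, which may be more convenient when building a table.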

PS: I had some trouble installing it on Windows with the current version of R. What worked for me is described in the question "How to set up rselenium for R?".




Answer 2:


I had success with the splashr package in R. To use it you need Docker. Installation instructions are given on the websites listed below:

https://cran.r-project.org/web/packages/splashr/vignettes/intro_to_splashr.html

https://docs.docker.com/docker-for-mac/install/#install-and-run-docker-for-mac - how to install and run docker on a mac

https://splash.readthedocs.io/en/stable/install.html - run these commands in a terminal window before using splashr
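For reference, a rough sketch of what a splashr call might look like, assuming a Splash instance is already running in Docker on localhost at its default port. `render_with_splash` is a hypothetical wrapper, and the 5-second wait is a guess at how long the page needs to render, not a tested value.

```r
# Hypothetical wrapper: render a page through a running Splash instance.
# Calls are namespaced, so defining the function does not need splashr loaded.
render_with_splash <- function(url, host = "localhost", wait = 5) {
  sp <- splashr::splash(host)
  if (!splashr::splash_active(sp)) {
    stop("No Splash instance is responding on ", host)
  }
  # render_html() returns the rendered page as an xml2 document
  splashr::render_html(sp, url = url, wait = wait)
}

# Usage (requires Docker and a running Splash container):
# page <- render_with_splash("http://stats.nba.com/game/0041700404/playbyplay/")
# rvest::html_nodes(page, xpath = "//span[@class='team-full']")
```

Because the result is an ordinary parsed document, the rvest selectors from the question can be tried against it unchanged.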



Source: https://stackoverflow.com/questions/51219793/rvest-function-html-nodes-returns-xml-nodeset-0
