Question
Can someone help me or suggest how to scrape the table from this URL? https://www.promet.si/portal/sl/stevci-prometa.aspx
I tried following instructions for the packages rvest, httr and html, but without any success for this particular site. Thank you.
Answer 1:
This ought to help get you started:
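library(RSelenium)
library(wdman)         # starts and manages the Selenium server
library(seleniumPipes) # remoteDr(), go(), getPageSource()
library(rvest)
library(tidyverse)

# Start a local Selenium server; wdman uses port 4567 by default.
selServ <- selenium(verbose = FALSE)
selServ$log() # find the port

# Open a Chrome session and load the page.
remDr <- remoteDr(browserName = "chrome", port = 4567L)
remDr %>%
  go("https://www.promet.si/portal/sl/stevci-prometa.aspx")

Sys.sleep(5) # give the page time to finish rendering the table

# getPageSource() returns the rendered page as a parsed document.
pg <- getPageSource(remDr)

html_node(pg, xpath=".//div[@id='ctl00_mainContent_ctl00_StvContainer']/table") %>%
  html_table() %>%
  as_tibble() # tbl_df() is deprecated; as_tibble() replaces it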
## # A tibble: 1,239 x 10
## X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
## <lgl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <lgl>
## 1 NA Lokacija Cesta Smer Pas Števil… Hitro… Razm… Stanje NA
## 2 NA Ajdovščina R2-444 vzhod - zahod "" 60 64 81,7 Norma… NA
## 3 NA Ajdovščina R2-444 zahod - vzhod "" 12 62 371,6 Norma… NA
## 4 NA Ajdovščina 2 R2-444 Ajdovščina - Selo "" 36 67 117,8 Norma… NA
## 5 NA Ajdovščina 2 R2-444 Ajdovščina - Selo "" 12 60 787,1 Norma… NA
## 6 NA Ajdovščina AC HC-H4 Nova Gorica - Vipava vozni 96 100 31,5 Norma… NA
## 7 NA Ajdovščina AC HC-H4 Nova Gorica - Vipava prehi… 36 124 120,7 Norma… NA
## 8 NA Ankaran R2-406 Križ. Moretini - Ankaran "" 96 59 29 Norma… NA
## 9 NA Ankaran R2-406 Ankaran - Križ. Moretini "" 12 57 292,1 Norma… NA
## 10 NA Apače R2-438 Trate - Gornja Radgona "" 24 58 110,6 Norma… NA
## # ... with 1,229 more rows
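In the output above, the real column names ended up in the first data row, X1 and X10 are empty, and the numbers use a decimal comma. Here is a rough sketch of one way to tidy that up in base R. The helper `clean_scraped_table()` and its `decimal_cols` argument are names I made up for illustration, and the sample data is a small stand-in built from two rows of the output, not the live table. The header "gap" is a placeholder too, because the real header of that column is cut off in the printout.

```r
# Hypothetical helper (not part of rvest or seleniumPipes): tidy a scraped
# table whose header landed in the first data row.
clean_scraped_table <- function(raw, decimal_cols = character()) {
  # Drop columns that contain only NA (X1 and X10 in the output above).
  keep <- !vapply(raw, function(col) all(is.na(col)), logical(1))
  out <- raw[, keep, drop = FALSE]

  # Promote the first row to column names, then remove it from the data.
  names(out) <- as.character(unlist(out[1, ]))
  out <- out[-1, , drop = FALSE]
  rownames(out) <- NULL

  # Convert decimal-comma strings such as "292,1" to numeric.
  for (col in decimal_cols) {
    out[[col]] <- as.numeric(sub(",", ".", out[[col]], fixed = TRUE))
  }
  out
}

# Stand-in data: a subset of the columns, with the two Ankaran rows.
# "gap" is a placeholder header (assumption), not the site's column name.
raw <- data.frame(
  X1  = c(NA, NA, NA),
  X2  = c("Lokacija", "Ankaran", "Ankaran"),
  X3  = c("Cesta", "R2-406", "R2-406"),
  X8  = c("gap", "29", "292,1"),
  X10 = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

traffic <- clean_scraped_table(raw, decimal_cols = "gap")
print(traffic)
```

With the real table you would pass the scraped result instead of `raw`, and list the actual names of the decimal-comma columns in `decimal_cols`.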
Answer 2:
Here is a translation of the site's right-to-use notice: "Right to use: All information and images contained on the website www.promet.si are subject to copyright protection and other forms of intellectual property protection. The documents published on these web pages may only be reproduced for non-commercial purposes, and they must also retain all the warnings of copyright or other rights. On every reproduction, the "Traffic Information Center for State Roads" should be listed as a source."
I am not sure whether that means scraping for non-commercial purposes is allowed or not.
Anyway, thank you for the warning, @s_t, and special thanks to @hrbrmstr for the answer with nice code.
Source: https://stackoverflow.com/questions/52855989/scrape-aspx-page-with-r