Getting RSelenium Error: “Failed to decode response from marionette”

Anonymous (unverified), submitted 2019-12-03 01:25:01

Question:

I'm relatively new to R (and brand spanking new to scraping with R), so apologies in advance if I'm overlooking something obvious here!

I've been trying to learn how to scrape with RSelenium by following this tutorial: https://rawgit.com/petrkeil/Blog/master/2017_08_15_Web_scraping/web_scraping.html#advanced-scraping-with-rselenium

After running the following in Terminal (docker run -d -p 4445:4444 selenium/standalone-firefox), I tried to run the R code below, pulled with only slight modifications from the tutorial hyperlinked above:

library(RSelenium)

get.tree <- function(genus, species) {
  # navigate to the page
  browser <- remoteDriver(port = 4445L)
  browser$open(silent = TRUE)
  browser$navigate("http://www.bgci.org/global_tree_search.php?sec=globaltreesearch")
  browser$refresh()

  # create R objects from the web search input and button elements
  genusElem <- browser$findElement(using = 'id', value = "genus-field")
  specElem <- browser$findElement(using = 'id', value = "species-field")
  buttonElem <- browser$findElement(using = 'class', value = "btn_ohoDO")

  # tell R to fill in the fields
  genusElem$sendKeysToElement(list(genus))
  specElem$sendKeysToElement(list(species))

  # tell R to click the search button
  buttonElem$clickElement()

  # get output
  out <- browser$findElement(using = "css", value = "td.cell_1O3UaG:nth-child(4)") # the country origin
  out <- out$getElementText()[[1]] # extract actual text string
  out <- strsplit(out, split = "; ")[[1]] # turns into character vector

  # close browser
  browser$close()

  return(out)
}

# Now let's try it:
get.tree("Abies", "alba")

But after doing all that, I get the following error:

Selenium message:Failed to decode response from marionette
Build info: version: '3.6.0', revision: '6fbf3ec767', time: '2017-09-27T16:15:40.131Z'
System info: host: 'd260fa60d69b', ip: '172.17.0.2', os.name: 'Linux', os.arch: 'amd64', os.version: '4.9.49-moby', java.version: '1.8.0_131'
Driver info: driver.version: unknown

Error: Summary: UnknownError
Detail: An unknown server-side error occurred while processing the command.
class: org.openqa.selenium.WebDriverException
Further Details: run errorDetails method

Anyone have any idea what this means and where I went wrong?

Thanks very much for your help!

Answer 1:

Just take advantage of the XHR request the page makes to retrieve the in-line results, and toss RSelenium:

library(httr)
library(tidyverse)

get_tree <- function(genus, species) {

  GET(
    url = sprintf("https://data.bgci.org/treesearch/genus/%s/species/%s", genus, species),
    add_headers(
      Origin = "http://www.bgci.org",
      Referer = "http://www.bgci.org/global_tree_search.php?sec=globaltreesearch"
    )
  ) -> res

  stop_for_status(res)

  matches <- content(res, flatten=TRUE)$results[[1]]

  flatten_df(matches[c("id", "taxon", "family", "author", "source", "problems", "distributionDone", "note", "wcsp")]) %>%
    mutate(geo = list(map_chr(matches$TSGeolinks, "country"))) %>%
    mutate(taxas = list(map_chr(matches$TSTaxas, "checkTaxon")))

}

xdf <- get_tree("Abies", "alba")

xdf
## # A tibble: 1 x 8
##      id      taxon   family author     source distributionDone        geo      taxas
##   <int>      <chr>    <chr>  <chr>      <chr>            <chr>     <list>     <list>
## 1 58373 Abies alba Pinaceae  Mill. WCSP Phans              yes <chr [21]> <chr [45]>

glimpse(xdf)
## Observations: 1
## Variables: 8
## $ id               <int> 58373
## $ taxon            <chr> "Abies alba"
## $ family           <chr> "Pinaceae"
## $ author           <chr> "Mill."
## $ source           <chr> "WCSP Phans"
## $ distributionDone <chr> "yes"
## $ geo              <list> [<"Albania", "Andorra", "Austria", "Bulgaria", "Croatia", "Czech Republic", "Fr...
## $ taxas            <list> [<"Abies abies", "Abies alba f. columnaris", "Abies alba f. compacta", "Abies a...

It's highly likely you'll need to modify get_tree() at some point but it's better than having Selenium or Splash or phantomjs or Headless Chrome as a dependency.
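
If all you want is what the original get.tree() returned (the countries of origin as a character vector), something like the following might be a starting point. It is only a sketch: extract_countries() is a hypothetical helper, not part of httr or of the code above, and it assumes the parsed response keeps the shape used in get_tree() (a results list whose first element has a TSGeolinks list with a country field). The fake object is made-up data of that shape so the helper can be tried without hitting the API.

```r
# Hypothetical helper (an assumption, not from the original answer):
# pull the country names out of an already-parsed response.
# Field names (results, TSGeolinks, country) are copied from get_tree();
# the real API may change them at any time.
extract_countries <- function(parsed) {
  results <- parsed$results
  if (length(results) == 0) {
    return(character(0))  # no matching taxon
  }

  links <- results[[1]]$TSGeolinks

  # one string per link; entries without a country become NA
  countries <- vapply(
    links,
    function(link) {
      if (is.null(link$country)) NA_character_ else as.character(link$country)
    },
    character(1)
  )

  countries[!is.na(countries)]  # drop links that had no country
}

# Made-up response with the assumed shape, to exercise the helper offline
fake <- list(results = list(list(
  taxon = "Abies alba",
  TSGeolinks = list(list(country = "Albania"), list(country = "Andorra"))
)))

extract_countries(fake)
```

With a live response you would presumably call it on the parsed content of res after stop_for_status(res). Returning an empty vector for an empty results list avoids the subscript error that results[[1]] would otherwise raise when the search finds nothing.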


