Getting RSelenium Error: “Failed to decode response from marionette”

Posted by 此生再无相见时 on 2019-12-11 16:07:42

Question


I'm relatively new to R (and brand spanking new to scraping with R), so apologies in advance if I'm overlooking something obvious here!

I've been trying to learn how to scrape with RSelenium by following this tutorial: https://rawgit.com/petrkeil/Blog/master/2017_08_15_Web_scraping/web_scraping.html#advanced-scraping-with-rselenium

After running docker run -d -p 4445:4444 selenium/standalone-firefox in Terminal, I tried to run the R code below, taken with only slight modifications from the tutorial linked above:

library(RSelenium)

get.tree <- function(genus, species) 
{
  # connect to the Selenium server exposed by Docker on port 4445, then navigate to the page
  browser <- remoteDriver(port = 4445L)
  browser$open(silent = TRUE)

  browser$navigate("http://www.bgci.org/global_tree_search.php?sec=globaltreesearch")
  browser$refresh()

  # create r objects from the web search input and button elements

  genusElem <- browser$findElement(using = 'id', value = "genus-field")
  specElem <- browser$findElement(using = 'id', value = "species-field")
  buttonElem <- browser$findElement(using = 'class', value = "btn_ohoDO")

  # tell R to fill in the fields

  genusElem$sendKeysToElement(list(genus))
  specElem$sendKeysToElement(list(species))

  # tell R to click the search button

  buttonElem$clickElement()

  # get output

  out <- browser$findElement(using = "css", value = "td.cell_1O3UaG:nth-child(4)") # the country origin
  out <- out$getElementText()[[1]] # extract actual text string
  out <- strsplit(out, split = "; ")[[1]] # turns into character vector

  # close browser

  browser$close()

  return(out)
}

# Now let's try it:

get.tree("Abies", "alba")

But after doing all that, I get the following error:

Selenium message:Failed to decode response from marionette Build info: version: '3.6.0', revision: '6fbf3ec767', time: '2017-09-27T16:15:40.131Z' System info: host: 'd260fa60d69b', ip: '172.17.0.2', os.name: 'Linux', os.arch: 'amd64', os.version: '4.9.49-moby', java.version: '1.8.0_131' Driver info: driver.version: unknown

Error: Summary: UnknownError Detail: An unknown server-side error occurred while processing the command. class: org.openqa.selenium.WebDriverException Further Details: run errorDetails method

Anyone have any idea what this means and where I went wrong?

Thanks very much for your help!


Answer 1:


Just take advantage of the XHR request the page makes to retrieve the in-line results, and toss RSelenium:

library(httr)
library(tidyverse)

get_tree <-  function(genus, species) {

  GET(
    url = sprintf("https://data.bgci.org/treesearch/genus/%s/species/%s", genus, species), 
    add_headers(
      Origin = "http://www.bgci.org", 
      Referer = "http://www.bgci.org/global_tree_search.php?sec=globaltreesearch"
    )
  ) -> res

  stop_for_status(res)

  matches <- content(res, flatten=TRUE)$results[[1]]

  flatten_df(matches[c("id", "taxon", "family", "author", "source", "problems", "distributionDone", "note", "wcsp")]) %>% 
    mutate(geo = list(map_chr(matches$TSGeolinks, "country"))) %>% 
    mutate(taxas = list(map_chr(matches$TSTaxas, "checkTaxon")))

}

xdf <- get_tree("Abies", "alba")

xdf
## # A tibble: 1 x 8
##      id      taxon   family author     source distributionDone        geo      taxas
##   <int>      <chr>    <chr>  <chr>      <chr>            <chr>     <list>     <list>
## 1 58373 Abies alba Pinaceae  Mill. WCSP Phans              yes <chr [21]> <chr [45]>

glimpse(xdf)
## Observations: 1
## Variables: 8
## $ id               <int> 58373
## $ taxon            <chr> "Abies alba"
## $ family           <chr> "Pinaceae"
## $ author           <chr> "Mill."
## $ source           <chr> "WCSP Phans"
## $ distributionDone <chr> "yes"
## $ geo              <list> [<"Albania", "Andorra", "Austria", "Bulgaria", "Croatia", "Czech Republic", "Fr...
## $ taxas            <list> [<"Abies abies", "Abies alba f. columnaris", "Abies alba f. compacta", "Abies a...
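
The question was after the country-of-origin values, which in the result above sit in the geo list-column. Here is a minimal sketch of pulling them out. It uses a hand-built stand-in for xdf so that it runs without a network call; xdf_demo is hypothetical and carries only the first three countries shown in the glimpse() output:

```r
# Hypothetical stand-in for xdf: one row, with a list-column holding countries.
xdf_demo <- data.frame(taxon = "Abies alba", stringsAsFactors = FALSE)
xdf_demo$geo <- list(c("Albania", "Andorra", "Austria"))

# [[1]] unwraps the single list-column cell into a plain character vector.
countries <- xdf_demo$geo[[1]]
countries

# Collapse back into the "; "-separated form that the Selenium version split apart.
paste(countries, collapse = "; ")
```

With the real result, xdf$geo[[1]] should behave the same way, assuming the response still has the shape shown above.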

It's highly likely you'll need to modify get_tree() at some point, but it's better than having Selenium or Splash or PhantomJS or Headless Chrome as a dependency.
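
One modification that may come up: get_tree() pastes genus and species straight into the URL with sprintf(), so a name containing a space would produce a malformed path. The sketch below percent-encodes each segment first. build_tree_url is a hypothetical helper name, and whether the BGCI endpoint actually accepts encoded infraspecific names is an assumption that has not been checked against the service:

```r
# Hypothetical helper (not part of the original answer): builds the request URL
# used by get_tree(), percent-encoding each path segment with base R first.
build_tree_url <- function(genus, species) {
  sprintf(
    "https://data.bgci.org/treesearch/genus/%s/species/%s",
    utils::URLencode(genus, reserved = TRUE),
    utils::URLencode(species, reserved = TRUE)
  )
}

# Plain names pass through unchanged.
build_tree_url("Abies", "alba")

# Spaces are percent-encoded instead of breaking the path.
build_tree_url("Abies", "alba f. compacta")
```

The result could then be passed as the url argument of GET() in place of the inline sprintf() call.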



Source: https://stackoverflow.com/questions/47104635/getting-rselenium-error-failed-to-decode-response-from-marionette
