Question
I'm scraping a page that can contain a variable number of elements and that sometimes renders the same data under different selectors. I'm trying to ignore the errors thrown by RSelenium with some tryCatch code, but the script still stops when the specified element is not on the page:
result <- tryCatch({
  webElem <- remDr$findElement('xpath', "//tr[(((count(preceding-sibling::*) + 1) = 9) and parent::*)]//span[(((count(preceding-sibling::*) + 1) = 1) and parent::*)]")
}, warning = function(cond) {
  rank1_US <- NA
},
error = function(cond) {
  rank1_US <- NA
}, finally = {
  rank1_US <- webElem$getElementText() %>% unlist(.) %>% ifelse(length(.) == 0, NA, .)
})
This still produces the following error when the element can't be found on the page:
Selenium message:Unable to locate element: //tr[(((count(preceding-sibling::*) + 1) = 9) and parent::*)]//span[(((count(preceding-sibling::*) + 1) = 1) and parent::*)]
For documentation on this error, please visit: http://seleniumhq.org/exceptions/no_such_element.html
Build info: version: '3.0.1', revision: '1969d75', time: '2016-10-18 09:48:19 -0700'
System info: host: '462a81a34fb2', ip: '172.17.0.2', os.name: 'Linux', os.arch: 'amd64', os.version: '4.4.39-boot2docker', java.version: '1.8.0_111'
Driver info: org.openqa.selenium.firefox.FirefoxDriver
Capabilities [{rotatable=false, raisesAccessibilityExceptions=false, firefoxOptions={args=[], profile=UEsDBBQACAgIAIc7I0oAAAAAAAAAAAAAAAAHAAAAdXNlci5qc51WTW/bMAy971cMOW3AKqTretlOXdcBA4Z1aFDsKMgSbauRJU0fcfPvR/mjSRNHbndKbJMS+fj4yOjBUeugfLconGnxiXhWQvdf6oo0TLXMAQHNCgVi8eFtyZSH91/exJ2nYAFtrHEhudTAVKj7Z4JGG8ln/DWE1rg1qUOwxNbS19uz9Nky788U6CrU6Pjx8vK52xiwAybwR0AAHkB8l86HK4yFK0C34OJhuKbBvB4pr51pgHrupA3URU2DbJLLxXL6osAKTxAOfauvlfEwnc1oLUyrlWEC79KsSsDWpv1Tg14hWgmpaXeLQdng02W0MYKpGexhE4xRnoBzxnGjvVH7cB+n72WljUbUGmgKcKvu0edz8eC9RKtgkAsOfETcSgyUcsd8nfdVUq+JsaApPAZwmqlUzFczqExlvYt6+rIWCuHkBp8Z54DljBoz90gHysEFP4nEU6Wkt4ptQdycL1e/DDInlfbTtDG+Erf6j9RYX3++JBIvMvd3P9FjwQoTw+dCMb1eHHOuT4gypeiDRzBSnLKH/ji2RysRbrQlbS0DKOkDHvA3SneKCYkGaxnI0E0j6zC5xIUsAIphYbAB8lTzibfRkhq7xuLZtAXFUwdFl0q6OEg5VVt3rCHRYoGBaIS23N6jyat1JNrUSjfZ8IBHJ8MWmaIA/xEfnOSBGicrqak1SvJtnqoaWmw7MuSTqeazQPuTSXq5ikUju1b53b286sj4Mt2cPCab8VN3DoVJRUHLE+q160NMs+34e9yIpizRDs6YtZ4g+0xLiy0VULKowrScjLBzbwf+TEd7zIcs20Y6I/dRqYLbkl4ZO/uPc7bZo0dEbu5/XpELwn... <truncated>
Session ID: c7f538b9-b516-411d-ace9-a980353ae442
*** Element info: {Using=xpath, value=//tr[(((count(preceding-sibling::*) + 1) = 9) and parent::*)]//span[(((count(preceding-sibling::*) + 1) = 1) and parent::*)]}
Does anyone have a way to handle this error?
Answer 1:
You can try something like:
result <- tryCatch({
  suppressMessages({
    webElem <- remDr$findElement('xpath', "//tr[(((count(preceding-sibling::*) + 1) = 9) and parent::*)]//span[(((count(preceding-sibling::*) + 1) = 1) and parent::*)]")
    rank1_US <- webElem$getElementText() %>% unlist(.) %>% ifelse(length(.) == 0, NA, .)
    rank1_US
  })
},
error = function(e) {
  NA_character_
}
)
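The attempt in the question most likely still stops for two reasons. The finally expression runs whether or not an error occurred, so webElem$getElementText() is evaluated even when findElement failed. And an assignment inside a handler function stays local to that handler, so rank1_US is never set in the calling scope. Below is a minimal base-R sketch of both points. It does not need a Selenium server: find_text() and safe_text() are hypothetical stand-ins for the RSelenium lookup, not part of any library.

```r
# Hypothetical stand-in for remDr$findElement(...)$getElementText():
# it signals an error when the element is "missing".
find_text <- function(found) {
  if (!found) stop("Unable to locate element")
  list("42")
}

# Wrap the whole lookup. tryCatch() returns the value of the expression,
# or the value of the error handler if an error was signalled.
safe_text <- function(found) {
  tryCatch({
    txt <- unlist(find_text(found))
    if (length(txt) == 0) NA_character_ else txt
  },
  error = function(e) NA_character_)
}

rank_present <- safe_text(TRUE)   # element found
rank_missing <- safe_text(FALSE)  # element missing, handler value is used

# An assignment inside a handler does not change the outer variable,
# because the handler is an ordinary function with its own scope.
outer_value <- "unchanged"
tryCatch(stop("boom"), error = function(e) { outer_value <- NA })

# The finally expression is evaluated even after an error was handled.
finally_ran <- FALSE
res <- tryCatch(stop("boom"),
                error = function(e) NA,
                finally = { finally_ran <- TRUE })
```

In the real scraper the same shape applies: put both the findElement call and the getElementText call inside the tryCatch expression, and use the value returned by tryCatch rather than assigning inside the handlers.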
Source: https://stackoverflow.com/questions/41449082/handle-rselenium-error-messages