Question
I have two questions regarding loops in R.
1) I'm using the XML package to scrape some tables from a website and combine them using rbind. I'm using the following code, and it works without issues if price data and tables are present on the given web pages.
url.list <- c("www1", "www2", "www3")
for (url_var in url.list)
{
  url <- url_var
  url.parsed <- htmlParse(getURL(url), asText = TRUE)
  tableNodes <- getNodeSet(url.parsed, '//*[@id="table"]/table')
  newdata <- readHTMLTable(tableNodes[[1]], header=F, stringsAsFactors=F)
  big.data <- rbind(newdata, big.data)
  Sys.sleep(30)
}
But sometimes a web page does not have the corresponding table (in this case I'm left with a one-variable table containing the message "No current prices reported."), and my loop stops with the following error message (since the numbers of table columns do not match):
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
I want R to ignore the error and go ahead with the next web page (skipping the one that has a different number of columns).
2) At the end of the loop I have Sys.sleep(30). Does it force R to wait 30 seconds before it tries the next web page?
Thank you
Answer 1:
As @RuiBarradas mentioned in the comments, tryCatch is the way to handle errors (or even warnings) in R. In your case, what you need is to go to the next iteration when an error occurs, so you can do something like:
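library(XML)    # htmlParse(), getNodeSet(), readHTMLTable()
library(RCurl)  # getURL()

big.data <- NULL  # start empty; rbind() ignores NULL arguments
for (url_var in url.list) {
  url <- url_var
  url.parsed <- htmlParse(getURL(url), asText = TRUE)
  combined <- tryCatch({
    # Try to run the code within these braces
    tableNodes <- getNodeSet(url.parsed, '//*[@id="table"]/table')
    newdata <- readHTMLTable(tableNodes[[1]], header = FALSE, stringsAsFactors = FALSE)
    rbind(newdata, big.data)
  },
  # If there is an error, the handler returns NULL instead of a table
  error = function(e) NULL)
  # After an error, go to the next iteration;
  # Sys.sleep(30) won't be executed in that case
  if (is.null(combined)) next
  big.data <- combined
  Sys.sleep(30)
}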
And yes, Sys.sleep(30) makes R sleep for 30 seconds when it is executed. Thus, if you want R to sleep in every iteration regardless of whether the parsing succeeded, consider moving that line in front of tryCatch.
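A minimal, self-contained sketch of sleeping before tryCatch, with no network access. The list pages and its contents are made up for illustration and stand in for the scraped tables, and the pause is shortened to 0.1 seconds. Because the sleep comes first, it also runs for the page that gets skipped.

```r
# Hypothetical stand-ins for the scraped tables; the second one mimics a
# page that only contains the "No current prices reported." message
pages <- list(
  data.frame(V1 = "a", V2 = 1, stringsAsFactors = FALSE),
  data.frame(V1 = "No current prices reported.", stringsAsFactors = FALSE),
  data.frame(V1 = "b", V2 = 2, stringsAsFactors = FALSE)
)

big.data <- NULL  # rbind() ignores NULL, so the first table is kept as is
skipped <- 0      # counts pages whose table could not be combined
pause <- 0.1      # shortened stand-in for the 30 seconds in the question

started <- Sys.time()
for (newdata in pages) {
  Sys.sleep(pause)  # runs on every iteration, including skipped ones
  # The handler turns the column-mismatch error into NULL
  combined <- tryCatch(rbind(newdata, big.data), error = function(e) NULL)
  if (is.null(combined)) {
    skipped <- skipped + 1
    next
  }
  big.data <- combined
}
elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
```

Here the one-column table is skipped, the other two are combined, and the loop still pauses three times.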
See the well-written answer to "How to write trycatch in R" for a more detailed explanation of tryCatch.
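As a brief sketch of the general pattern, tryCatch can take a warning handler and a finally clause in addition to an error handler. The helper safe_log below is hypothetical and exists only to show the three outcomes.

```r
# Hypothetical helper: returns log(x), or NA when log() warns or fails
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) {
      message("warning caught: ", conditionMessage(w))
      NA_real_
    },
    error = function(e) {
      message("error caught: ", conditionMessage(e))
      NA_real_
    },
    # finally runs whether or not a condition was raised
    finally = message("finished processing one value")
  )
}

ok     <- safe_log(1)    # no condition is raised
warned <- safe_log(-1)   # log(-1) raises a warning
failed <- safe_log("a")  # log("a") raises an error
```

Each handler receives the condition object, so conditionMessage() can be used to log what went wrong before returning a fallback value.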
Source: https://stackoverflow.com/questions/49834055/skip-errors-in-r-for-loops-and-also-pause-the-process-in-each-iteration