Skip errors in R for loops and also pause the process in each iteration

假如想象 submitted on 2020-01-06 06:52:26

Question


I have two questions regarding loops in R.

1) I'm using the XML package to scrape some tables from websites and combine them using rbind. I'm using the following code, and it works without issues as long as the price data and tables are present on the given websites.

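library(XML)    # htmlParse, getNodeSet, readHTMLTable
library(RCurl)  # getURL

url.list <- c("www1", "www2", "www3")
big.data <- NULL  # accumulator for the combined tables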

for(url_var in url.list)
{
  url <- url_var
  url.parsed <- htmlParse(getURL(url), asText = TRUE)
  tableNodes <- getNodeSet(url.parsed, '//*[@id="table"]/table')
  newdata <- readHTMLTable(tableNodes[[1]], header=F, stringsAsFactors=F)
  big.data <- rbind(newdata,  big.data)
  Sys.sleep(30)
}

But sometimes the web page does not have the corresponding table (in that case I'm left with a one-variable table containing the message: No current prices reported.) and my loop stops with the following error message (since the numbers of table columns do not match):

 Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match 

I want R to ignore the error and go ahead with the next web page (skipping the one that has a different number of columns).

2) At the end of the loop I have Sys.sleep(30). Does it force R to wait 30 seconds before it tries the next web page?

Thank you


Answer 1:


As @RuiBarradas mentioned in the comments, tryCatch is how we handle errors (or even warnings) in R. In your case, you need to move on to the next iteration when an error occurs, so you can do something like:

for (url_var in url.list) {
    url <- url_var
    url.parsed <- htmlParse(getURL(url), asText = TRUE)
    ok <- tryCatch({
        # Try to run the code within these braces
        tableNodes <- getNodeSet(url.parsed, '//*[@id="table"]/table')
        newdata <- readHTMLTable(tableNodes[[1]], header = FALSE, stringsAsFactors = FALSE)
        big.data <- rbind(newdata, big.data)
        TRUE
    },
        # If there is an error, return FALSE instead of stopping the loop
        error = function(e) FALSE)
    # On error, go to the next iteration;
    # Sys.sleep(30) won't be executed in that case
    if (!ok) next
    Sys.sleep(30)
}
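
To see the skipping behaviour without touching the network, here is a minimal offline sketch. The `pages` list and the `skipped` vector are hypothetical stand-ins (not part of the original question) for the scraped tables; the second entry mimics a page that only has the one-column "No current prices reported." table.

```r
# Hypothetical stand-ins for scraped tables: the second one has a single
# column, like a page that only reports "No current prices reported."
pages <- list(
  data.frame(item = "a", price = 1, stringsAsFactors = FALSE),
  data.frame(msg = "No current prices reported.", stringsAsFactors = FALSE),
  data.frame(item = "b", price = 2, stringsAsFactors = FALSE)
)

big.data <- NULL        # accumulator, as in the question
skipped <- integer(0)   # indices of pages that could not be combined

for (i in seq_along(pages)) {
  ok <- tryCatch({
    # rbind() fails when the column counts differ
    big.data <- rbind(pages[[i]], big.data)
    TRUE
  }, error = function(e) FALSE)
  if (!ok) {
    skipped <- c(skipped, i)
    next
  }
}

print(big.data)
print(skipped)
```

The failed rbind leaves big.data untouched, so only the two well-formed tables end up in the result and the index of the odd page is recorded in `skipped`.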

And yes, Sys.sleep(30) makes R sleep for 30 seconds when it is executed. Thus, if you want R to always sleep in every iteration, whether or not the parsing succeeds, you may consider moving that line in front of tryCatch.
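
Another way to pause on every iteration, sketched here as one possible alternative to moving the line, is the `finally` argument of tryCatch, which is evaluated whether or not an error occurred. The `pause` helper, the `n_pauses` counter, and the toy inputs are assumptions made purely for illustration, and the delay is shortened to 0.01 seconds so the example finishes quickly.

```r
n_pauses <- 0

# Hypothetical helper: counts how often it is called, then sleeps
pause <- function(seconds) {
  n_pauses <<- n_pauses + 1
  Sys.sleep(seconds)
}

results <- character(0)

for (x in list(1, "oops", 3)) {
  tryCatch({
    # sqrt() raises an error for the non-numeric element
    results <- c(results, as.character(sqrt(x)))
  },
  error = function(e) NULL,   # swallow the error and carry on
  finally = pause(0.01))      # runs after success and after failure
}

print(n_pauses)
print(length(results))
```

Here the pause runs three times even though only two of the three inputs are processed successfully.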

See the well-written answer in How to write trycatch in R for more detailed elaboration of tryCatch.



Source: https://stackoverflow.com/questions/49834055/skip-errors-in-r-for-loops-and-also-pause-the-process-in-each-iteration
