failed - “cannot open the connection”

误落风尘 2021-01-27 07:14

I am trying to read several files from URLs. Does anyone know how to first check whether a URL can be opened, and only then do something with it? Sometimes I get an error (failed="cannot open the c

1 answer
  • 2021-01-27 07:54

    You can use tryCatch which returns the value of the expression if it succeeds, and the value of the error argument if there is an error. See ?tryCatch.
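
    As a minimal sketch of that behaviour, with no network access involved: the names ok and bad are purely illustrative, and log("a") is used only because it reliably raises an error.

    ```r
    # Success: tryCatch returns the value of the expression itself.
    ok <- tryCatch(log(10), error = function(e) NA)

    # Failure: log() of a character string raises an error, so tryCatch
    # returns whatever the error handler returns (here NA).
    bad <- tryCatch(log("a"), error = function(e) NA)
    ```

    The handler receives the condition object as e, so it can also inspect the error before choosing a fallback value.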

    This example loops over a bunch of URLs and reads them. tryCatch returns the result of readLines if it succeeds, and NULL if not. If the result is NULL, we just use next to skip to the next iteration of the loop.

    urls <- c('http://google.com', 'http://nonexistent.jfkldasf', 'http://stackoverflow.com')
    
    for (u in urls) {
        # warn = FALSE is only there to avoid the "incomplete final line" warning.
        # Put read.fwf or whatever you need here.
        tmp <- tryCatch(readLines(url(u), warn = FALSE),
                        error = function(e) NULL)
        if (is.null(tmp)) {
            # you might want to put some informative message here.
            next # skip to the next url.
        }
        }
    }
    

    Note that this skips on any error, not just a "404 not found"-type error. If I made a typo and wrote tryCatch(raedlines(url(u), warn=F), ...) (a typo of readLines), it would just skip every URL, because that would also throw an error.
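
    One way to keep such mistakes visible is to report the error text before returning NULL. The sketch below assumes a hypothetical helper named read_or_null and uses a nonexistent local file instead of a URL so that it runs offline; it is an illustration, not the only way to do this.

    ```r
    # Hypothetical helper: read lines from a path or URL and say why a read failed.
    read_or_null <- function(src) {
        tryCatch(
            readLines(src, warn = FALSE),
            error = function(e) {
                # conditionMessage() extracts the error text, so a typo in the
                # code and a genuine connection failure look different in the log.
                message("Skipping ", src, ": ", conditionMessage(e))
                NULL
            }
        )
    }

    # A missing file fails to open, so the handler runs and NULL comes back.
    missing_result <- read_or_null(file.path(tempdir(), "no-such-file.txt"))
    ```

    R may additionally print a warning naming the file that could not be opened; that is separate from the error caught here.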


    Edit re: comments (lapply is being used; where to put the data-processing code): instead of using next, only do your processing if the read succeeds. Put the data-processing code after the data-reading code. Try something like:

    lapply(urls,
           function (u) {
               tmp <- tryCatch(read.fwf(...), error = function (e) NULL)
               if (is.null(tmp)) {
                   # read failed
                   return() # or return whatever you want the failure value to be
               }
               # data processing code goes here.
           })
    

    The above returns out of the function (only affects the current element of lapply) if the read fails.

    Or you could invert it and do something like:

    lapply(urls,
           function (u) {
               tmp <- tryCatch(read.fwf(...), error = function (e) NULL)
               if (!is.null(tmp)) {
                   # read succeeded!
                   # data processing code goes here.
               }           
           })
    

    which will do the same (it only does your data processing code if the read succeeded, and otherwise skips that whole block of code and returns NULL).
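
    Either way, the list returned by lapply keeps a NULL element for every failed read. A runnable sketch of the pattern follows, under some assumptions: temporary local files stand in for the URLs, readLines stands in for read.fwf(...), and length(tmp) stands in for the real data processing.

    ```r
    # Hypothetical inputs: one file that exists and one that does not.
    good <- tempfile(fileext = ".txt")
    writeLines(c("a", "b", "c"), good)
    paths <- c(good, file.path(tempdir(), "does-not-exist.txt"))

    results <- lapply(paths, function(p) {
        tmp <- tryCatch(readLines(p, warn = FALSE), error = function(e) NULL)
        if (is.null(tmp)) {
            return(NULL)  # read failed: this element of the result is NULL
        }
        length(tmp)       # stand-in for the real data processing
    })

    # Drop the failed reads before any further work.
    succeeded <- Filter(Negate(is.null), results)
    ```

    Filtering afterwards keeps the reading and processing code simple, at the cost of losing track of which inputs failed unless you name the list elements first.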
