How to download and save a file from the Internet using Java?

情深已故 2020-11-21 05:06

There is an online file (such as http://www.example.com/information.asp) I need to grab and save to a directory. I know there are several methods for grabbing a

21 answers
  • 2020-11-21 05:38

    Use Apache Commons IO; it takes just one line of code:

    FileUtils.copyURLToFile(URL, File)
    
  • 2020-11-21 05:38

    With Java 7+, use the following method to download a file from the Internet and save it to a directory:

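    // Imports: java.io.IOException, java.io.InputStream, java.net.URL, java.nio.file.*
    private static Path download(String sourceURL, String targetDirectory) throws IOException
    {
        URL url = new URL(sourceURL);
        // Use everything after the last '/' as the file name
        String fileName = sourceURL.substring(sourceURL.lastIndexOf('/') + 1);
        Path targetPath = Paths.get(targetDirectory, fileName);
        // Files.copy does not close the input stream, so close it with try-with-resources
        try (InputStream in = url.openStream()) {
            Files.copy(in, targetPath, StandardCopyOption.REPLACE_EXISTING);
        }

        return targetPath;
    }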
    

    See the Javadoc for Files.copy(InputStream, Path, CopyOption...).
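
    Taking everything after the last '/' as the file name works for simple URLs, but a query string or a trailing slash produces an awkward or empty name. Below is a minimal sketch of a more defensive helper. The class name UrlFileNames, the method fileNameFromUrl, and the fallback parameter are hypothetical and not part of the answer above.

```java
import java.net.URI;

final class UrlFileNames {
    /**
     * Hypothetical helper: derives a file name from the path part of a URL,
     * ignoring any "?query" or "#fragment". Returns the fallback when the
     * path is missing, empty, or ends with '/'.
     * URI.create throws IllegalArgumentException for syntactically invalid input.
     */
    static String fileNameFromUrl(String sourceUrl, String fallback) {
        String path = URI.create(sourceUrl).getPath(); // path only, percent-decoded
        if (path == null || path.isEmpty() || path.endsWith("/")) {
            return fallback;
        }
        // lastIndexOf returns -1 when there is no '/', so this keeps the whole path
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

    The result could then be passed to Paths.get(targetDirectory, fileName) in place of the plain substring call.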

  • 2020-11-21 05:38

    The underscore-java library provides a U.fetch(url) method.

    pom.xml:

      <dependency>
        <groupId>com.github.javadev</groupId>
        <artifactId>underscore</artifactId>
        <version>1.45</version>
      </dependency>
    

    Code example:

    import com.github.underscore.lodash.U;
    
    public class Download {
        public static void main(String ... args) {
            String text = U.fetch("https://stackoverflow.com/questions"
            + "/921262/how-to-download-and-save-a-file-from-internet-using-java").text();
        }
    }
    
  • 2020-11-21 05:39

    Simpler NIO usage:

    URL website = new URL("http://www.website.com/information.asp");
    Path target = Paths.get("information.asp"); // destination path (example value)
    try (InputStream in = website.openStream()) {
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }
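
    For comparison, the same copy can be written with NIO channels instead of Files.copy. This is a sketch of that alternative technique, not something the answer above proposes; the class name ChannelDownload and the 1 MiB chunk size are arbitrary choices.

```java
import java.io.IOException;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelDownload {
    /** Copies the content behind the URL into target and returns the number of bytes written. */
    static long download(URL source, Path target) throws IOException {
        try (ReadableByteChannel in = Channels.newChannel(source.openStream());
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // transferFrom may move fewer bytes than requested, so loop
            // until it reports that nothing more was transferred.
            while ((transferred = out.transferFrom(in, position, 1 << 20)) > 0) {
                position += transferred;
            }
            return position;
        }
    }
}
```

    Files.copy is shorter and is usually the better default; the channel version mainly helps when you need the byte count or want to copy in bounded chunks.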
    
  • 2020-11-21 05:39

    There are many elegant and efficient answers here, but their conciseness can hide useful information. In particular, one often does not want to treat a connection error as an exception, and one might want to handle some kinds of network errors differently, for example to decide whether to retry the download.

    Here's a method that does not throw exceptions for network errors, only for truly exceptional problems such as a malformed URL or a failure to write the file:

    /**
     * Downloads from a (http/https) URL and saves to a file. 
     * Does not consider a connection error an Exception. Instead it returns:
     *  
     *    0=ok  
     *    1=connection interrupted, timeout (but something was read)
     *    2=not found (FileNotFoundException) (404) 
     *    3=server error (500...) 
     *    4=could not connect: connection timeout (no internet?) java.net.SocketTimeoutException
     *    5=could not connect: (server down?) java.net.ConnectException
     *    6=could not resolve host (bad host, or no internet - no dns)
     * 
     * @param file File to write. Parent directory will be created if necessary
     * @param url  http/https url to connect
     * @param secsConnectTimeout Seconds to wait for connection establishment
     * @param secsReadTimeout Read timeout in seconds - transmission will abort if it stalls for longer than this
     * @return See above
     * @throws IOException Only if the URL is malformed or the file could not be created
     */
    public static int saveUrl(final Path file, final URL url, 
      int secsConnectTimeout, int secsReadTimeout) throws IOException {
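        Path parent = file.getParent(); // null when the path has no parent component, e.g. "out.bin"
        if( parent != null ) Files.createDirectories(parent); // make sure the parent dir exists; this can throw
        URLConnection conn = url.openConnection(); // can throw an exception for a bad URL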
        if( secsConnectTimeout > 0 ) conn.setConnectTimeout(secsConnectTimeout * 1000);
        if( secsReadTimeout > 0 ) conn.setReadTimeout(secsReadTimeout * 1000);
        int ret = 0;
        boolean somethingRead = false;
        try (InputStream is = conn.getInputStream()) {
            try (BufferedInputStream in = new BufferedInputStream(is); OutputStream fout = Files
                    .newOutputStream(file)) {
                final byte[] data = new byte[8192];
                int count;
                while((count = in.read(data)) != -1) { // read returns -1 at end of stream
                    somethingRead = true;
                    fout.write(data, 0, count);
                }
            }
        } catch(java.io.IOException e) { 
            int httpcode = 999;
            try {
                httpcode = ((HttpURLConnection) conn).getResponseCode();
            } catch(Exception ee) {} // ignored: keep 999 when no response code is available
            if( somethingRead && e instanceof java.net.SocketTimeoutException ) ret = 1;
            else if( e instanceof FileNotFoundException && httpcode >= 400 && httpcode < 500 ) ret = 2; 
            else if( httpcode >= 400 && httpcode < 600 ) ret = 3; 
            else if( e instanceof java.net.SocketTimeoutException ) ret = 4; 
            else if( e instanceof java.net.ConnectException ) ret = 5; 
            else if( e instanceof java.net.UnknownHostException ) ret = 6;  
            else throw e;
        }
        return ret;
    }
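
    The return codes documented above can drive a simple retry policy. The sketch below is only an illustration: the class DownloadRetry, the Attempt interface, and the choice of which codes count as transient (1, 4 and 5) are assumptions, not part of the method above.

```java
import java.io.IOException;

final class DownloadRetry {
    /** Hypothetical callback wrapping one download attempt, e.g. a call to saveUrl. */
    interface Attempt {
        int run() throws IOException;
    }

    /**
     * Assumption: an interrupted transfer (1), a connect timeout (4) and a
     * refused connection (5) are transient; every other code is final.
     */
    static boolean isWorthRetrying(int code) {
        return code == 1 || code == 4 || code == 5;
    }

    /**
     * Runs the attempt up to maxAttempts times, pausing between tries,
     * and returns the code of the last attempt.
     */
    static int withRetries(Attempt attempt, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        int code = attempt.run();
        for (int tries = 1; tries < maxAttempts && isWorthRetrying(code); tries++) {
            Thread.sleep(pauseMillis);
            code = attempt.run();
        }
        return code;
    }
}
```

    A call might look like DownloadRetry.withRetries(() -> saveUrl(file, url, 10, 30), 3, 2000L). Code 3 is left out of the retry set because it covers both client and server errors in the method above; whether to retry it depends on the server.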
    
  • 2020-11-21 05:42

    It's possible to download the file with Apache's HttpComponents instead of Commons IO. This code downloads a file from its URL and saves it to the specified destination.

    public static boolean saveFile(URL fileURL, String fileSavePath) {

        HttpGet httpGet = new HttpGet(fileURL.toString());
        httpGet.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0");
        httpGet.addHeader("Referer", "https://www.google.com");

        // try-with-resources closes the response first and then the client,
        // which also releases the underlying connection
        try (CloseableHttpClient httpClient = HttpClients.createDefault();
             CloseableHttpResponse httpResponse = httpClient.execute(httpGet)) {

            HttpEntity fileEntity = httpResponse.getEntity();

            if (fileEntity != null) {
                FileUtils.copyInputStreamToFile(fileEntity.getContent(), new File(fileSavePath));
            }

            return true;

        } catch (IOException e) {
            return false;
        }
    }
    

    In contrast to the single line of code:

    FileUtils.copyURLToFile(fileURL, new File(fileSavePath),
                            URLS_FETCH_TIMEOUT, URLS_FETCH_TIMEOUT);
    

    this code gives you more control over the process and lets you specify not only timeouts but also the User-Agent and Referer headers, which are critical for many websites.
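
    If adding HttpComponents as a dependency is not an option, timeouts and request headers can also be set on the JDK's own URLConnection. The following is a minimal sketch under that assumption; the class name HeaderDownload and its parameters are illustrative and do not come from the answer above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;

final class HeaderDownload {
    /**
     * Downloads source to target using one timeout for both connect and read,
     * sending the given request headers (for example User-Agent and Referer).
     * Returns the number of bytes copied.
     */
    static long download(URL source, Path target, int timeoutMillis,
                         Map<String, String> headers) throws IOException {
        URLConnection conn = source.openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        // Request properties must be set before the connection is opened
        for (Map.Entry<String, String> header : headers.entrySet()) {
            conn.setRequestProperty(header.getKey(), header.getValue());
        }
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

    Unlike the HttpComponents version, this sketch reports failures by throwing IOException instead of returning a boolean.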
