How to download and save a file from the Internet using Java?

情深已故 2020-11-21 05:06

There is an online file (such as http://www.example.com/information.asp) I need to grab and save to a directory. I know there are several methods for grabbing a

21 answers
  • 2020-11-21 05:46

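    This answer is almost exactly like the selected answer, with two enhancements: it is wrapped in a method, and it reliably closes the FileOutputStream (and the channel), even when the transfer fails:

        import java.io.File;
        import java.io.FileOutputStream;
        import java.io.IOException;
        import java.net.URL;
        import java.nio.channels.Channels;
        import java.nio.channels.ReadableByteChannel;

        public static void downloadFileFromURL(String urlString, File destination) {
            // try-with-resources closes the channel and the stream even if the transfer fails
            try (ReadableByteChannel rbc = Channels.newChannel(new URL(urlString).openStream());
                 FileOutputStream fos = new FileOutputStream(destination)) {
                fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }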
    
  • 2020-11-21 05:46

    Plain use of

    org.apache.commons.io.FileUtils.copyURLToFile(URL, File)

    is problematic if you need to download and save very large files or, more generally, if you need automatic retries when the connection is dropped.

    In such cases I suggest Apache HttpClient (the example uses the older Commons HttpClient 3.x API) together with org.apache.commons.io.FileUtils. For example:

    HttpClient client = new HttpClient();  // Commons HttpClient 3.x client
    GetMethod method = new GetMethod(resource_url);
    try {
        int statusCode = client.executeMethod(method);
        if (statusCode != HttpStatus.SC_OK) {
            logger.error("Get method failed: " + method.getStatusLine());
        } else {
            // Save the body only when the request succeeded
            org.apache.commons.io.FileUtils.copyInputStreamToFile(
                method.getResponseBodyAsStream(), new File(resource_file));
        }
    } catch (HttpException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Always hand the connection back, whether or not the download worked
        method.releaseConnection();
    }
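
    The automatic retries mentioned in this answer are not part of the snippet itself. As a rough illustration of the idea, here is a minimal JDK-only sketch. The class name `RetryingDownload`, the method `downloadWithRetries`, and the parameters `maxAttempts` and `pauseMillis` are all hypothetical names chosen for this example. It restarts the download from the beginning after each failure rather than resuming a partial file, so it is a starting point, not a complete solution for very large files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RetryingDownload {

    /**
     * Copies the content behind url into target, starting over after an IOException.
     * maxAttempts and pauseMillis are illustrative parameters, not taken from the answer.
     */
    static void downloadWithRetries(URL url, Path target, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (InputStream in = url.openStream()) {
                // REPLACE_EXISTING discards whatever a failed earlier attempt left behind
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                return;
            } catch (IOException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // fixed pause between attempts
                }
            }
        }
        throw lastFailure; // every attempt failed; report the last error
    }
}
```

    A call such as `RetryingDownload.downloadWithRetries(url, path, 3, 2000)` would try up to three times with a two-second pause in between.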
    
  • 2020-11-21 05:47

    1st method, using an NIO channel

    ReadableByteChannel aq = Channels.newChannel(new URL("https://asd/abc.txt").openStream());
    FileOutputStream fileOS = new FileOutputStream("C:/Users/local/abc.txt");
    FileChannel writech = fileOS.getChannel();
    writech.transferFrom(aq, 0, Long.MAX_VALUE); // copy the remote bytes into the file
    fileOS.close();
    aq.close();
    

    2nd method, using FileUtils (Apache Commons IO)

    FileUtils.copyURLToFile(new URL("https://asd/abc.txt"), new File("C:/Users/system/abc.txt"));
    

    3rd method, using a plain InputStream

    InputStream xy = new URL("https://asd/abc.txt").openStream();
    Files.copy(xy, Paths.get("C:/Users/system/abc.txt")); // then close xy
    

    This is how a file can be downloaded using basic Java code or third-party libraries. These snippets are meant as a quick reference only. Search for the class names above to find detailed information and other options.

  • 2020-11-21 05:48

    To summarize (and somewhat polish and update) the previous answers: the following three methods are practically equivalent. (I added explicit timeouts because I think they are a must; nobody wants a download to freeze forever when the connection is lost.)

    import java.io.BufferedInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.URL;
    import java.net.URLConnection;
    import java.nio.channels.Channels;
    import java.nio.channels.FileChannel;
    import java.nio.channels.ReadableByteChannel;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;
    import java.nio.file.StandardOpenOption;

    // Variant 1: classic stream copy with an explicit buffer
    public static void saveUrl1(final Path file, final URL url,
            int secsConnectTimeout, int secsReadTimeout) throws IOException {
        // Files.createDirectories(file.getParent()); // optional, make sure parent dir exists
        try (BufferedInputStream in = new BufferedInputStream(
                 streamFromUrl(url, secsConnectTimeout, secsReadTimeout));
             OutputStream fout = Files.newOutputStream(file)) {
            final byte[] data = new byte[8192];
            int count;
            while ((count = in.read(data)) != -1) { // read() returns -1 at end of stream
                fout.write(data, 0, count);
            }
        }
    }

    // Variant 2: NIO channels with transferFrom
    public static void saveUrl2(final Path file, final URL url,
            int secsConnectTimeout, int secsReadTimeout) throws IOException {
        // Files.createDirectories(file.getParent()); // optional, make sure parent dir exists
        try (ReadableByteChannel rbc = Channels.newChannel(
                 streamFromUrl(url, secsConnectTimeout, secsReadTimeout));
             FileChannel channel = FileChannel.open(file,
                 StandardOpenOption.CREATE,
                 StandardOpenOption.TRUNCATE_EXISTING,
                 StandardOpenOption.WRITE)) {
            channel.transferFrom(rbc, 0, Long.MAX_VALUE);
        }
    }

    // Variant 3: Files.copy does the copying loop for us
    public static void saveUrl3(final Path file, final URL url,
            int secsConnectTimeout, int secsReadTimeout) throws IOException {
        // Files.createDirectories(file.getParent()); // optional, make sure parent dir exists
        try (InputStream in = streamFromUrl(url, secsConnectTimeout, secsReadTimeout)) {
            Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Opens the URL with optional timeouts given in seconds (0 or less keeps the default)
    public static InputStream streamFromUrl(URL url, int secsConnectTimeout, int secsReadTimeout)
            throws IOException {
        URLConnection conn = url.openConnection();
        if (secsConnectTimeout > 0) conn.setConnectTimeout(secsConnectTimeout * 1000);
        if (secsReadTimeout > 0) conn.setReadTimeout(secsReadTimeout * 1000);
        return conn.getInputStream();
    }
    

    I don't find significant differences between them; all seem right to me. They are safe and efficient. (Differences in speed seem hardly relevant: I write 180 MB from a local server to an SSD in times that fluctuate between 1.2 and 1.5 seconds.) They don't require external libraries. All work with arbitrary file sizes and (in my experience) HTTP redirections.

    Additionally, all throw FileNotFoundException if the resource is not found (typically error 404), and java.net.UnknownHostException if DNS resolution fails; other IOExceptions correspond to errors during transmission.
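
    To make the exception behaviour above concrete, here is a small sketch of how a caller might tell the cases apart. The class `DownloadOutcome`, the `Result` enum, and the `tryDownload` method are hypothetical names invented for this example; the download itself follows the same pattern as saveUrl3, with timeouts given in milliseconds.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.UnknownHostException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class DownloadOutcome {

    /** Hypothetical result categories, named here only for illustration. */
    enum Result { OK, NOT_FOUND, UNKNOWN_HOST, TRANSFER_ERROR }

    static Result tryDownload(URL url, Path file, int connectTimeoutMs, int readTimeoutMs) {
        try {
            URLConnection conn = url.openConnection();
            conn.setConnectTimeout(connectTimeoutMs);
            conn.setReadTimeout(readTimeoutMs);
            try (InputStream in = conn.getInputStream()) {
                Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
            }
            return Result.OK;
        } catch (FileNotFoundException e) {
            return Result.NOT_FOUND;      // resource missing, e.g. HTTP 404
        } catch (UnknownHostException e) {
            return Result.UNKNOWN_HOST;   // DNS resolution failed
        } catch (IOException e) {
            // Anything else: timeouts, dropped connections, or local write problems
            return Result.TRANSFER_ERROR;
        }
    }
}
```

    The two specific catch clauses must come before the general IOException clause, because both exception types are subclasses of IOException.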

    (Marked as community wiki, feel free to add info or corrections)

  • 2020-11-21 05:50

    It's an old question, but here is a concise, readable, JDK-only solution with properly closed resources:

    import java.io.InputStream;
    import java.net.URI;
    import java.nio.file.Files; 
    import java.nio.file.Paths;
    
    // ...
    
    public static void download(String url, String fileName) throws Exception {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, Paths.get(fileName));
        }
    }
    

    Two lines of code and no dependencies.
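
    One detail worth keeping in mind with this approach: `Files.copy(InputStream, Path)` throws `FileAlreadyExistsException` if the target file already exists. The sketch below shows one way to make overwriting an explicit choice. The class name `OverwritingDownload` and the `overwrite` flag are assumptions made for this example and are not part of the answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.CopyOption;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class OverwritingDownload {

    /** Returns the number of bytes written; overwrite is a hypothetical flag for this sketch. */
    static long download(String url, Path target, boolean overwrite) throws IOException {
        // Without REPLACE_EXISTING, Files.copy refuses to touch an existing target
        CopyOption[] options = overwrite
                ? new CopyOption[] { StandardCopyOption.REPLACE_EXISTING }
                : new CopyOption[0];
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return Files.copy(in, target, options);
        }
    }
}
```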

  • 2020-11-21 05:52

    Give Java NIO a try:

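    // Needs java.io.FileOutputStream, java.net.URL, java.nio.channels.Channels and ReadableByteChannel
    URL website = new URL("http://www.website.com/information.asp");
    // try-with-resources closes the channel and the stream once the transfer is done
    try (ReadableByteChannel rbc = Channels.newChannel(website.openStream());
         FileOutputStream fos = new FileOutputStream("information.html")) {
        fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
    }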
    

    Using transferFrom() is potentially much more efficient than a simple loop that reads from the source channel and writes to this channel. Many operating systems can transfer bytes directly from the source channel into the filesystem cache without actually copying them.

    See the Javadoc for FileChannel.transferFrom for more details.

    Note: The third parameter of transferFrom is the maximum number of bytes to transfer. Integer.MAX_VALUE will transfer at most 2^31 - 1 bytes (about 2 GiB); Long.MAX_VALUE will allow at most 2^63 - 1 bytes (larger than any file in existence).
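
    Since transferFrom returns the number of bytes it actually transferred, which can be less than the requested maximum, a more defensive variant calls it in a loop until nothing more arrives. The following is only a sketch of that idea: the class name `ChannelDownload` and the 16 MiB `CHUNK` size are arbitrary choices made for this example.

```java
import java.io.IOException;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelDownload {

    // Upper bound per transferFrom call; the value is arbitrary for this sketch
    private static final long CHUNK = 16L * 1024 * 1024;

    /** Copies the URL content into target and returns the total number of bytes written. */
    static long download(URL url, Path target) throws IOException {
        try (ReadableByteChannel source = Channels.newChannel(url.openStream());
             FileChannel sink = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING,
                     StandardOpenOption.WRITE)) {
            long position = 0;
            long transferred;
            // With this blocking source, a return value of 0 means the stream is exhausted
            while ((transferred = sink.transferFrom(source, position, CHUNK)) > 0) {
                position += transferred;
            }
            return position;
        }
    }
}
```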
