How to download a pdf file programmatically from a webpage with .html extension?

Submitted by 做~自己de王妃 on 2020-01-19 14:58:48

Question


I have reviewed all the similar questions on this forum (not only this one) and tried every method they suggest, but I still cannot programmatically download a test file: http://pdfobject.com/markup/examples/full-browser-window.html

The link above is the direct link to the test file that I am trying to download. It is an open-access test PDF, so anybody can use it to test a download method.

How can I download this particular file so that it has a pdf extension?


Answer 1:


For downloading a file, perhaps you could try something like this:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

public final class FileDownloader {

    private FileDownloader(){}

    public static void main(String[] args) throws IOException {
        // Request the PDF's own URL rather than the .html page that embeds it.
        download("http://pdfobject.com/pdf/sample.pdf", new File("sample.pdf"));
    }

    public static void download(final String url, final File destination) throws IOException {
        final URLConnection connection = new URL(url).openConnection();
        connection.setConnectTimeout(60000);
        connection.setReadTimeout(60000);
        // Some servers reject requests that lack a browser-like User-Agent.
        connection.addRequestProperty("User-Agent", "Mozilla/5.0");
        // try-with-resources closes both streams even if a read or write fails.
        try (InputStream input = connection.getInputStream();
             FileOutputStream output = new FileOutputStream(destination, false)) {
            final byte[] buffer = new byte[2048];
            int read;
            while ((read = input.read(buffer)) > -1) {
                output.write(buffer, 0, read);
            }
        }
    }
}
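
The download above works because it requests the PDF's own URL instead of the .html page from the question. If that URL is not known in advance, one option is to fetch the page source and look for references ending in .pdf. The following is only a minimal sketch, assuming the page mentions the PDF as a quoted path somewhere in its markup or script. The class name PdfLinkFinder, the method findPdfUrls, and the sample HTML string are hypothetical and not part of the original answer.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class PdfLinkFinder {

    // Matches a quoted string ending in ".pdf", e.g. "/pdf/sample.pdf" or 'docs/a.pdf'.
    private static final Pattern QUOTED_PDF =
            Pattern.compile("[\"']([^\"'\\s]+\\.pdf)[\"']", Pattern.CASE_INSENSITIVE);

    private PdfLinkFinder() {}

    /** Returns every quoted *.pdf reference found in the HTML, resolved against the page URL. */
    public static List<String> findPdfUrls(final String pageUrl, final String html) {
        final URI base = URI.create(pageUrl);
        final List<String> urls = new ArrayList<>();
        final Matcher matcher = QUOTED_PDF.matcher(html);
        while (matcher.find()) {
            // resolve() turns a relative path such as "/pdf/sample.pdf" into an absolute URL.
            final String absolute = base.resolve(matcher.group(1)).toString();
            if (!urls.contains(absolute)) {
                urls.add(absolute);
            }
        }
        return urls;
    }

    public static void main(final String[] args) {
        // Hypothetical page source, made up for illustration only.
        final String html = "<embed src=\"/pdf/sample.pdf\" type=\"application/pdf\">";
        System.out.println(findPdfUrls(
                "http://pdfobject.com/markup/examples/full-browser-window.html", html));
    }
}
```

Each URL returned this way could then be passed to a download method like the one above. A regular expression is a rough tool for this job: it will miss PDFs whose URL is built at runtime by script, and paths containing characters that are illegal in a URI would make resolve() throw.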



Answer 2:


Here is a shorter solution that uses a library called Jsoup, which BalusC often uses in his answers.

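import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;

// Get the response; ignoreContentType(true) lets Jsoup accept non-HTML content such as a PDF
Response response = Jsoup.connect(location)
        .ignoreContentType(true)
        .maxBodySize(0) // 0 lifts Jsoup's default body size limit, which would truncate large files
        .execute();

// Save the file; try-with-resources closes the stream even if the write fails
try (FileOutputStream out = new FileOutputStream(new File(outputFolder + name))) {
    out.write(response.bodyAsBytes());
}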

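As you may have guessed, response.bodyAsBytes() holds the raw bytes of the PDF. (response.body() returns the content decoded as a String, which is not suitable for binary data.) You can download any binary file with this piece of code.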
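
Because the URL in the question ends in .html, it is easy to end up saving the HTML wrapper page under a .pdf name. A quick sanity check on the downloaded bytes can catch this. The sketch below relies on the fact that a well-formed PDF normally begins with the ASCII marker %PDF-. The class name PdfSignatureCheck and the method looksLikePdf are hypothetical helpers, not part of Jsoup or of the original answer.

```java
import java.nio.charset.StandardCharsets;

public final class PdfSignatureCheck {

    // A well-formed PDF normally starts with the ASCII marker "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfSignatureCheck() {}

    /** Returns true if the data begins with the PDF marker. */
    public static boolean looksLikePdf(final byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(final String[] args) {
        // Hypothetical sample inputs, made up for illustration only.
        final byte[] pdf = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        final byte[] html = "<!DOCTYPE html><html>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikePdf(pdf));
        System.out.println(looksLikePdf(html));
    }
}
```

With the Jsoup snippet, the check could be applied as looksLikePdf(response.bodyAsBytes()) before writing the file. If it returns false, the response is most likely the HTML page rather than the PDF itself.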



Source: https://stackoverflow.com/questions/19309300/how-to-download-a-pdf-file-programmatically-from-a-webpage-with-html-extension
