I have a 27 MB PDF file hosted on the web. When I try to open it, it takes a long time to load. Is there any way to view this large PDF file faster?
Two possible workarounds would be:
If the PDF doesn't usually change, you can set a cache expiry for the resource (set in the response headers), so that once a client has opened the PDF, the resource is cached in their browser. This reduces load time on their second visit (depending on how long the cache entry takes to expire, of course).
Another option, if possible, is to load the PDF file asynchronously. That is, load the other contents of your web page first, and load the PDF afterwards.
Or you can do a combination of both.
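To check whether the caching suggestion is actually in effect, you could inspect the response headers. This is only a rough sketch: `cache_max_age` is a made-up helper name and the URL is a placeholder. It assumes the server expresses the expiry as a `max-age` directive in a `Cache-Control` header (a server may use an `Expires` header instead, which this does not look at).

```shell
# Hypothetical helper: reads HTTP response headers on stdin and prints the
# max-age value (in seconds) from the Cache-Control header, if there is one.
cache_max_age() {
    tr -d '\r' \
        | grep -i '^cache-control:' \
        | sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p' \
        | head -n 1
}

# Usage (placeholder URL):
#   curl -sI https://example.com/big.pdf | cache_max_age
```

An empty result means no `max-age` was found, so the browser may re-download the whole file on every visit.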
To supplement Kurt's comment: I assembled this command line, and it seemed to do the trick for generating an optimized PDF file:
gs -q -P- -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dFastWebView -sOutputFile=optimized.pdf input.pdf
You can check the output with:
pdfinfo optimized.pdf
where you should see Optimized: yes in the output.
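If pdfinfo is not at hand, a rough alternative check is to look for the linearization dictionary directly. The sketch below relies on the PDF spec's requirement that this dictionary (identified by a `/Linearized` key) sits within the first 1024 bytes of a linearized file. `is_linearized` is a hypothetical helper name, and this is only a heuristic: it does not validate the file's hint tables.

```shell
# Hypothetical helper: succeeds (exit status 0) if the first 1024 bytes of
# the given file contain a /Linearized key.
is_linearized() {
    head -c 1024 "$1" | LC_ALL=C grep -a '/Linearized' >/dev/null
}

# Usage:
#   if is_linearized optimized.pdf; then echo "linearized"; else echo "not linearized"; fi
```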
FlexPaper supports splitting large PDF documents into pages so that only the visible pages are downloaded. They have two viewers, one with page turn effects and one as a more classic document reader / viewer with options from free to commercially licensable.
http://flexpaper.devaldi.com/
What you need to do to your PDF is make it "web optimized". The technically more correct term is "linearized". Ghostscript includes a utility script, pdfopt.ps, which can do this. Simply run:
gs -q -dNODISPLAY -P- -dSAFER -dDELAYSAFER -- /path/to/pdfopt.ps input.pdf optimized.pdf
or, if you are on Windows:
gswin32.exe -q -dNODISPLAY -P- -dSAFER -dDELAYSAFER -- c:/path/to/pdfopt.ps input.pdf optimized.pdf
Normally pdfopt.ps should be installed together with your Ghostscript, in the lib/ subdirectory of the installation path. If not, you can download pdfopt.ps from the Ghostscript Git repository.
Linearization re-organizes the PDF internally, so that (a copy of) its internal ToC of PDF objects (in technical terms: its "xref table") is put close to the beginning of the file (instead of its end), plus some more changes.
That way, a spec-conforming PDF reader will be able to start rendering the first page before the rest of the file has been loaded. It will even be possible to jump to the last page and view it before the middle pages are downloaded, if you are accessing the PDF over the web using HTTP-based protocols. But then, the web server is required to support the HTTP "byte range" requests (otherwise this will not work even for linearized PDFs).
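To see whether your web server advertises byte-range support, you could look at its response headers. A minimal sketch, assuming the server announces support with an `Accept-Ranges: bytes` header; `supports_byte_ranges` is a hypothetical helper name and the URL is a placeholder. Some servers honor range requests without advertising them, so a negative result here is not conclusive.

```shell
# Hypothetical helper: reads HTTP response headers on stdin and succeeds
# (exit status 0) if they include "Accept-Ranges: bytes".
supports_byte_ranges() {
    tr -d '\r' \
        | grep -i '^accept-ranges:[[:space:]]*bytes[[:space:]]*$' >/dev/null
}

# Usage (placeholder URL):
#   curl -sI https://example.com/optimized.pdf | supports_byte_ranges && echo "ranges advertised"
#
# A more direct check is to request a small range and look for status 206:
#   curl -s -o /dev/null -w '%{http_code}\n' -r 0-1023 https://example.com/optimized.pdf
```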
You can read more details about PDF linearization in the official PDF-1.7 ISO standard spec, available on the Adobe website.
An example of a linearized PDF can be found here
Since release 9.07 of Ghostscript, linearized ("web optimized") PDF output can be generated directly (without the 2-step approach outlined above) by adding the following switch to the command line:
-dFastWebView=true
Since the pdfopt.ps file is now redundant, it has been removed from the current Ghostscript source repository.
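Putting that switch into a full command might look like the sketch below. It assumes Ghostscript 9.07 or later is installed as `gs`. `linearize_pdf` is a made-up wrapper name, and the remaining flags are the ones used elsewhere in this thread for the pdfwrite device. The guard against identical input and output names is an extra precaution of mine, not something Ghostscript requires you to write.

```shell
# Hypothetical wrapper: writes a linearized copy of a PDF using Ghostscript.
linearize_pdf() {
    if [ "$#" -ne 2 ]; then
        echo "usage: linearize_pdf input.pdf output.pdf" >&2
        return 2
    fi
    if [ "$1" = "$2" ]; then
        # Avoid reading from and writing to the same file.
        echo "input and output must be different files" >&2
        return 2
    fi
    gs -q -P- -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dFastWebView=true \
        -sOutputFile="$2" "$1"
}

# Usage:
#   linearize_pdf input.pdf optimized.pdf
```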
One option is to use a pdf library like JPedal to render the page images from the PDF on the server side, and then (through AJAX) present the images to the client.