Statically compile pdftk for Heroku. Need to split PDF into single page files

半城伤御伤魂 提交于 2019-12-03 12:53:48

Unfortunately Heroku keeps stripping out magic to add flexibility. As a result it feels more and more like the days when I used to manage and maintain my own servers. There is no easy solution. My "monkey patch" is to send the file to a server that I can install PDFTK, process the file, and send it back. Not great, but it works. Having to deal with this defeats the purpose of using heroku.

The easy solution is to add the one dependency for pdftk that is not found on heroku.

$ldd pdftk
    linux-vdso.so.1 =>  (0x00007ffff43ca000)
    libgcj.so.10 => not found
    libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1d26d48000)
    libm.so.6 => /lib/libm.so.6 (0x00007f1d26ac4000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f1d268ad000)
    libc.so.6 => /lib/libc.so.6 (0x00007f1d2652a000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x00007f1d2630c000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1d27064000)

I put pdftk and libgcj.so.10 into the /bin directory of my app. You then just need to tell heroku to look at the /bin dir when loading libs.

You can type

$heroku config
LD_LIBRARY_PATH:             /app/.heroku/vendor/lib
LIBRARY_PATH:                /app/.heroku/vendor/lib

To see what your current LD_LIBRARY_PATH is set to and then add /app/bin (or whatever dir you chose to store libgcj.so.10) to it.

$heroku config:set LD_LIBRARY_PATH=/app/.heroku/vendor/lib:/app/bin

The down side is that my slug size went from 15.9MB to 27.5MB

We've encountered the same problem, the solution we came up with was to use Stapler instead https://github.com/hellerbarde/stapler, it's a python utility and only requires an extra module to be installed (pyPdf) on Heroku.

I've been oriented to this blog entry: http://theprogrammingbutler.com/blog/archives/2011/07/28/running-pdftotext-on-heroku/

Here are the steps I followed to install pyPdf:

Accessing the heroku bash console

heroku run bash

Installing the latest version of pyPdf

cd tmp
curl http://pybrary.net/pyPdf/pyPdf-1.13.tar.gz -o pyPdf-1.13.tar.gz
tar zxvf pyPdf-1.13.tar.gz
python setup.py install --user

This puts all the necessary files under a .local file at the root of the app. I just downloaded it and added it to our git repo, as well as the stapler utility. Finally I updated my code to use stapler instead of pdftk, et voilà! Splitting PDFs from Heroku again.

Another way, probably cleaner, would be to encapsulate it in a gem ( http://news.ycombinator.com/item?id=2816783 )

Arman H

I read a similar question on SO, and found this approach by Ryan Daigle that worked for me as well: instead of building local binaries that are hard to match to Heroku's servers, use the remote environment to compile and build the required dependencies. This is accomplished using the Vulcan gem, which is provided by Heroku.

Ryan's article "Building Dependency Binaries for Heroku Applications"

Another approach by Jon Magic (untested by me), is to download and compile the dependency directly through Heroku's bash, e.g. directly on the server: "Compiling Executables on Heroku".

On a side note, both approaches are going to result in binaries that are going to break if Heroku's underlying environment changes enough.

Try prawn.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!