Question
I've written a small Python function for AWS Lambda that uses pdf2image to convert the pages of a PDF file into separate JPG files. This library is a wrapper around poppler-utils, in particular pdftoppm. It works fine on my Ubuntu system, but of course AWS Lambda is different.
So I went looking for a version of Poppler that I could compile on an EC2 instance for AWS Lambda and found the slightly dated Poppler-build Ansible playbook. With some tweaks I got it to work; however, the Poppler version this playbook builds does not include the essential -jpeg flag, which turns PPM data straight into JPEGs.
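To make the conversion step concrete, here is a minimal sketch of calling pdftoppm directly with -jpeg from Python. It is not pdf2image's internal code. The helper names (build_pdftoppm_cmd, pdf_to_jpegs), the "page" output prefix, and the default of 150 DPI are assumptions for illustration; on Lambda the binary argument would have to point at wherever the compiled pdftoppm ends up in the deployment package.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_cmd(pdf_path, out_prefix, dpi=150, binary="pdftoppm"):
    """Build the pdftoppm argument list for one-JPEG-per-page output."""
    return [
        binary,
        "-jpeg",          # write JPEG files instead of PPM
        "-r", str(dpi),   # rendering resolution in DPI
        str(pdf_path),
        str(out_prefix),  # pdftoppm names files <prefix>-<page>.jpg
    ]


def pdf_to_jpegs(pdf_path, out_dir, dpi=150, binary="pdftoppm"):
    """Render every page of pdf_path to a JPEG in out_dir (hypothetical helper)."""
    # Fail early with a clear message if the binary is missing,
    # which is the situation on a stock Lambda runtime.
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"pdftoppm binary not found: {binary}")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    cmd = build_pdftoppm_cmd(pdf_path, out_dir / "page", dpi, binary)
    # check=True raises CalledProcessError if pdftoppm exits non-zero,
    # e.g. when the build does not recognise -jpeg.
    subprocess.run(cmd, check=True)

    return sorted(out_dir.glob("page-*.jpg"))
```

A call such as pdf_to_jpegs("/tmp/input.pdf", "/tmp/pages") would then return the list of generated files, assuming the Poppler build supports -jpeg. Only /tmp is writable on Lambda, so the output directory would need to live there.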
I tried compiling newer versions of Poppler, but they all seem to require libraries and C++ versions that I'd have to install separately, and I worry that this will break AWS Lambda support.
Poppler 0.58 always balks at
make[3]: Entering directory `/home/ec2-user/poppler3/poppler-0.58.0/poppler'
CXX libpoppler_la-SignatureInfo.lo
In file included from SignatureInfo.cc:22:0:
/usr/include/nss3/hasht.h:48:29: error: ‘PRBool’ has not been declared
void (*destroy)(void *, PRBool);
Poppler 0.69 (latest) on the other hand gets antsy about:
/home/ec2-user/poppler2/poppler-0.69.0/poppler/Annot.cc: In constructor ‘DefaultAppearance::DefaultAppearance(GooString*)’:
/home/ec2-user/poppler2/poppler-0.69.0/poppler/Annot.cc:826:23: error: ‘make_unique’ is not a member of ‘std’
fontColor = std::make_unique(gatof(( (GooString *)daToks->get(i-1) )->getCString()));
^
/home/ec2-user/poppler2/poppler-0.69.0/poppler/Annot.cc:826:50: error: expected primary-expression before ‘>’ token
fontColor = std::make_unique(gatof(( (GooString *)daToks->get(i-1) )->getCString()));
This endless cycle of compiling libraries and configuring them with various flags and compilers has taken me over two days now, and I'm ready to give up. Can someone tell me what I'm doing wrong, or suggest another efficient way to turn PDF pages into JPEG files on AWS Lambda?
Source: https://stackoverflow.com/questions/52491849/converting-pdf-to-jpeg-on-aws-lambda