How can I install pdftotext properly?
I'm getting the error message below when installing pdftotext in Python 3.6. I also tried to install the package manually by downloading the zip file but still got the same error.
pdftotext/pdftotext.cpp(4): fatal error C1083: Cannot open include file: 'poppler/cpp/poppler-document.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\x86_amd64\\cl.exe' failed with exit status 2
I found some help in the Readme.md file in the pdftotext package:
1) Install OS dependencies:
On Debian, Ubuntu, and friends:
sudo apt-get update
sudo apt-get install build-essential libpoppler-cpp-dev pkg-config python-dev
On Fedora, Red Hat, and friends:
sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config
2) Do the normal install:
pip install pdftotext
After that, it worked for me.
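Once pip finishes without errors, a quick smoke test can confirm that the compiled extension is actually importable. This is only a minimal sketch: the helper names are made up for illustration, and sample.pdf is a placeholder path. The pdftotext.PDF call follows the usage shown in the package's README.

```python
import importlib.util


def pdftotext_installed() -> bool:
    """Return True if the pdftotext extension module can be located."""
    return importlib.util.find_spec("pdftotext") is not None


def read_pages(pdf_path: str) -> list:
    """Return the text of each page as a list of strings."""
    import pdftotext  # imported lazily so the check above works without it

    with open(pdf_path, "rb") as f:
        return list(pdftotext.PDF(f))


if __name__ == "__main__":
    if pdftotext_installed():
        print("pdftotext is importable")
        # "sample.pdf" is a placeholder; point this at a real PDF to try it.
        # print(read_pages("sample.pdf")[0])
    else:
        print("pdftotext is not installed; install the OS dependencies first")
```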
The command below solved the problem for me.
sudo apt-get install libpoppler-cpp-dev
https://blog.droidzone.in/2018/05/01/install-pdftotext-python-extension-error/
And for macOS: brew install poppler
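The per-platform commands quoted so far can be collected in one place. The sketch below only picks the matching command string and prints it; it does not run anything. The function name and the family labels are hypothetical, and the detection assumes a Linux system exposes the usual ID and ID_LIKE fields in /etc/os-release.

```python
import platform

# Commands copied from the answers above, keyed by a coarse platform family.
# On Debian/Ubuntu, run "sudo apt-get update" first, as the README suggests.
INSTALL_COMMANDS = {
    "debian": "sudo apt-get install build-essential libpoppler-cpp-dev pkg-config python-dev",
    "fedora": "sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config",
    "macos": "brew install poppler",
}


def platform_family(system: str, os_release_text: str = ""):
    """Map platform.system() plus os-release contents to a key in INSTALL_COMMANDS."""
    if system == "Darwin":
        return "macos"
    if system != "Linux":
        return None
    tokens = set()
    for line in os_release_text.splitlines():
        key, _, value = line.partition("=")
        if key in ("ID", "ID_LIKE"):
            # Values may be quoted and may hold several space-separated names.
            tokens.update(value.strip().strip("\"'").split())
    if tokens & {"debian", "ubuntu"}:
        return "debian"
    if tokens & {"fedora", "rhel"}:
        return "fedora"
    return None


if __name__ == "__main__":
    text = ""
    try:
        with open("/etc/os-release") as f:
            text = f.read()
    except OSError:
        pass  # not Linux, or the file is missing
    family = platform_family(platform.system(), text)
    print(INSTALL_COMMANDS.get(family, "no known command for this platform"))
```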
I've been trying to figure out how to install pdftotext on Win10 for a few days. Internet searches have given me nothing. So for those who need to know, here's how to install pdftotext on Win10 with Anaconda. YMMV.
Install Anaconda Python. There are many articles on installing Anaconda, so I won't explore that here.
Try to run pip install pdftotext. You will get an error saying that Microsoft Visual C++ is required.
Navigate in a browser to http://visualstudio.microsoft.com/downloads. Under the Tools for Visual Studio 2019 tab, download the Build Tools for Visual Studio 2019. Then install the tools by checking the C++ build tools option box and clicking Install.
The pip install should now move past the VC++ error. Unfortunately, you’ll now get the error “Cannot open include file: ‘poppler/cpp/poppler-document.h’”. This is because you’re missing the poppler libraries.
Head back to the internets! You’ll need poppler for Windows. At the time of this writing, your best option is http://blog.alivate.com.au/poppler-windows. Grab the latest binary and uncompress it. If you look at the error, pip is looking for the header file at {Anaconda3 directory}\include\poppler\cpp\poppler-document.h. So look in the archive you just unzipped. In the include folder, you’ll see a poppler directory, and inside its cpp subdirectory you’ll find the poppler-document.h file.
Copy the entire poppler directory into the Anaconda3\include folder, as I did.
If you try to run pip install again, you'll still get a ton of errors! These are not the errors you saw previously, though. This time the build is looking for a missing library to link against, poppler-cpp.lib. A search through the Conda installs on another machine found this file in the poppler package, so run:
conda install -c conda-forge poppler
This installs the poppler-cpp.lib file. Then copy the file from its home at {Anaconda3 directory}\Library\lib\poppler-cpp.lib and paste it where pdftotext expects it, at {Anaconda3 directory}\libs.
If we do a pip install pdftotext again, there it is! I’m sure someone will find a way to refine this a bit, but for now we have a working pdftotext Python library on Win10.
These directions can be found, with screenshots, at my blog https://coder.haus/2019/09/27/installing-pdftotext-through-pip-on-windows-10/
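The two manual copy steps in the Windows answer above (the poppler headers, then poppler-cpp.lib) can be scripted. This is a rough sketch rather than the author's own method: the function name is hypothetical, both directory arguments are placeholders you must supply, and dirs_exist_ok needs Python 3.8 or newer.

```python
import shutil
from pathlib import Path


def copy_poppler_files(poppler_dir, anaconda_dir):
    """Copy the poppler headers and poppler-cpp.lib to where the build looks for them.

    poppler_dir:  folder where the poppler for Windows archive was unzipped
    anaconda_dir: the Anaconda3 installation directory
    """
    poppler_dir = Path(poppler_dir)
    anaconda_dir = Path(anaconda_dir)

    # Step 1: the whole include\poppler directory goes into Anaconda3\include.
    src_headers = poppler_dir / "include" / "poppler"
    dst_headers = anaconda_dir / "include" / "poppler"
    shutil.copytree(src_headers, dst_headers, dirs_exist_ok=True)

    # Step 2: after "conda install -c conda-forge poppler", copy the import
    # library from Library\lib into Anaconda3\libs.
    src_lib = anaconda_dir / "Library" / "lib" / "poppler-cpp.lib"
    dst_lib = anaconda_dir / "libs" / "poppler-cpp.lib"
    dst_lib.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src_lib, dst_lib)

    return dst_headers, dst_lib


# Example call with placeholder paths; adjust both to your machine:
# copy_poppler_files(r"C:\path\to\unzipped\poppler", r"C:\path\to\Anaconda3")
```

Run it after the conda install step, then retry pip install pdftotext.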
For Ubuntu users:
sudo apt-get install libpoppler58=0.41.0-0ubuntu1 libpoppler-dev libpoppler-cpp-dev
This worked for me.
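The original compiler error means the header poppler/cpp/poppler-document.h was not on the include path. After installing the -dev package, a check along these lines can confirm the header is there before rerunning pip. The function name is hypothetical, and the candidate directories are assumptions about common install locations, not something the answers above state.

```python
from pathlib import Path

# Relative path taken from the compiler error in the question.
HEADER = Path("poppler") / "cpp" / "poppler-document.h"

# Assumed common include directories; adjust for your system.
CANDIDATE_INCLUDE_DIRS = ["/usr/include", "/usr/local/include", "/opt/homebrew/include"]


def find_poppler_header(include_dirs=CANDIDATE_INCLUDE_DIRS):
    """Return the first existing path to poppler-document.h, or None."""
    for directory in include_dirs:
        candidate = Path(directory) / HEADER
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_poppler_header()
    if found:
        print(f"Found header at {found}")
    else:
        print("poppler-document.h not found; the poppler C++ dev package may be missing")
```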
Source: https://stackoverflow.com/questions/45912641/unable-to-install-pdftotext-on-python-3-6-missing-poppler