Question
I installed textract with pip install and ran the import in a Jupyter notebook, which throws the error below.
I am on a Windows machine and have two versions of Python (2.7 and 3.6) installed using conda. I have also added the paths to my environment variables, as suggested in other posts, but I still get the error.
import textract
ImportError                               Traceback (most recent call last)
<ipython-input-2-99b3b0e1733d> in <module>()
1 #Code to extract pdf files
----> 2 import textract
3 text = textract.process("C:/Users/username/Documents/Projects/Attachments/PDF/fileA.pdf")
ImportError: No module named textract
EDIT:
I was only able to install textract on Python 2.7. I have added the following paths to the environment variables:
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\Scripts (this is where the textract file is located)
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\Lib\lib-tk
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\Lib
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\DLLs
UPDATE: I installed PyPDF2 with pip install and tried importing it in a Jupyter notebook. It returned the same error, so I am wondering whether I am installing things incorrectly.
Answer 1:
This worked for me on Ubuntu.
Open a terminal and run:
python -m venv env
source ./env/bin/activate
sudo apt update
sudo apt install python-pip && pip install --upgrade pip
sudo apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig
pip install textract
If you run into any more errors, try:
pip install https://pypi.python.org/packages/ce/c7/ab6cd0d00ddf8dc3b537cfb922f3f049f8018f38c88d71fd164f3acb8416/SpeechRecognition-3.6.3-py2.py3-none-any.whl
sudo apt install libpulse-dev
pip install textract
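If textract installs but later fails on particular file types, it may be worth checking that the command-line tools from the apt packages above are actually on PATH. The following is only a rough sketch: the `TOOLS` mapping and the `missing_tools` helper are hypothetical names introduced here, and the executable names are my assumption about what each package provides.

```python
import shutil

# Assumed mapping from apt package name to one executable it provides.
# Trim or extend this to match the file types you actually need.
TOOLS = {
    "poppler-utils": "pdftotext",
    "antiword": "antiword",
    "unrtf": "unrtf",
    "tesseract-ocr": "tesseract",
    "sox": "sox",
}


def missing_tools(tools=TOOLS, which=shutil.which):
    """Return the package names whose executable is not found on PATH."""
    return [pkg for pkg, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All checked tools are on PATH.")
```

Any package reported as missing can be installed again with sudo apt install before retrying pip install textract.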
Now you should be able to import textract:
import textract
text = textract.process("/home/user/textract_test.pdf")
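Note that textract.process returns bytes rather than a str under Python 3, so the result usually needs decoding before use. A minimal sketch of a wrapper, where `extract_text` is a hypothetical helper name and UTF-8 is assumed as the output encoding:

```python
import os


def extract_text(path, encoding="utf-8"):
    """Return the text content of `path` as a str, using textract."""
    # Fail early with a clear error if the file path is wrong.
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    # Imported lazily so that a missing install shows up as an
    # ImportError only when extraction is actually attempted.
    import textract

    raw = textract.process(path)  # bytes
    return raw.decode(encoding)
```

It could then be called as extract_text("/home/user/textract_test.pdf"), reusing the path from the example above.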
Answer 2:
This might be a workaround.
1. I uninstalled Anaconda and reinstalled it.
2. I did not create a Python 2.7 environment in Anaconda. Instead, I reinstalled textract with pip, along with all its other dependencies, from the base Anaconda command prompt.
3. I tried importing textract and it worked like a charm!
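A likely reason this works is that pip and the notebook kernel then point at the same interpreter. Before reinstalling Anaconda, it may be worth checking this from inside the notebook. The snippet below is a sketch for Python 3 (importlib.util is not available in 2.7), and `is_installed` and `pip_install_command` are hypothetical helper names:

```python
import importlib.util
import sys


def is_installed(module_name):
    """True if the interpreter running this code can find `module_name`."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package, python=sys.executable):
    """Build a pip command bound to one specific interpreter."""
    return [python, "-m", "pip", "install", package]


# Shows which Python the notebook kernel is really using.
print("Notebook interpreter:", sys.executable)

if not is_installed("textract"):
    # Printed for display only; quote the path if it contains spaces.
    print("textract is missing for this interpreter; try running:")
    print(" ".join(pip_install_command("textract")))
```

If the printed interpreter path is not the environment where pip installed textract, running the printed command from a terminal installs the package into the environment the kernel actually uses.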
Source: https://stackoverflow.com/questions/50953880/importerror-no-module-named-textract