ImportError: No module named textract

╄→гoц情女王★ submitted on 2020-01-15 09:20:39

Question


I installed textract with pip install and ran the import command in a Jupyter notebook, which throws the following error.

I am on a Windows machine and have two versions of Python installed (2.7 and 3.6) using conda. I have also added the paths to the environment variables, as suggested in other posts, but I still get the error.

import textract

ImportError                               Traceback (most recent call last)
<ipython-input-2-99b3b0e1733d> in <module>()
      1 #Code to extract pdf files
----> 2 import textract
      3 text = textract.process("C:/Users/username/Documents/Projects/Attachments/PDF/fileA.pdf")

ImportError: No module named textract 

EDIT:

I was only successful in installing textract on Python 2.7. I have added the paths below to the environment variables:

C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\Scripts (this is where the textract file is located)
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\Lib\lib-tk
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\Lib
C:\Users\Username\AppData\Local\Continuum\anaconda3\envs\mypy27\DLLs

UPDATE: I installed PyPDF2 using pip install and tried importing it in a Jupyter notebook. It returned the same error, so I am wondering whether I am installing things incorrectly.
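
One way to narrow this down is to ask the notebook kernel itself which interpreter it is running and whether that interpreter can find the package. The following is a minimal sketch using only the standard library; `describe_module` is a hypothetical helper name, not part of textract or Jupyter.

```python
import importlib.util
import sys


def describe_module(name):
    """Return where `name` would be imported from, or None if this
    interpreter cannot find it. (Hypothetical helper for illustration.)"""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Built-in modules report "built-in"; some packages have no origin at all.
    return spec.origin or "unknown location"


if __name__ == "__main__":
    # The interpreter that is actually executing this cell or script.
    print("Interpreter:", sys.executable)
    for name in ("textract", "PyPDF2"):
        print(name, "->", describe_module(name))
```

If the interpreter path printed here is not the environment where pip installed the package, the import fails no matter what has been added to the Windows environment variables, because Python looks up modules through `sys.path` of the running interpreter.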


Answer 1:


This worked for me on Ubuntu.

1. Open a terminal and run:

python -m venv env 
source ./env/bin/activate
sudo apt update
sudo apt install python-pip && pip install --upgrade pip
sudo apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig
pip install textract

If you face any more errors, try:

pip install https://pypi.python.org/packages/ce/c7/ab6cd0d00ddf8dc3b537cfb922f3f049f8018f38c88d71fd164f3acb8416/SpeechRecognition-3.6.3-py2.py3-none-any.whl
sudo apt install libpulse-dev
pip install textract

Now you will be able to import textract:

import textract
text = textract.process("/home/user/textract_test.pdf")



Answer 2:


This might be a workaround.

1. Uninstalled Anaconda and re-installed it.

2. Did not create a Python 2.7 environment in Anaconda. Instead, re-installed textract with pip, along with all the other dependencies, from the base Anaconda command prompt.

3. Tried importing textract, and it worked like a charm!
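
A plausible reading of why this worked is that pip and the notebook kernel ended up using the same interpreter. A similar effect may be achievable without reinstalling Anaconda by invoking pip through the interpreter that runs the notebook. This is a sketch under that assumption; `pip_install_command` and `install_into_current_interpreter` are hypothetical helper names.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def install_into_current_interpreter(package):
    """Run pip for this interpreter; raises CalledProcessError on failure."""
    subprocess.check_call(pip_install_command(package))


if __name__ == "__main__":
    # Only prints the command; call install_into_current_interpreter("textract")
    # to actually run it.
    print(" ".join(pip_install_command("textract")))
```

Running `install_into_current_interpreter("textract")` from a notebook cell targets the kernel's own environment, so a later `import textract` in that kernel looks in the place where the package was installed. textract's non-Python dependencies still have to be installed separately.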



Source: https://stackoverflow.com/questions/50953880/importerror-no-module-named-textract
