pdfminer - ImportError: No module named pdfminer.pdfdocument

Posted by 淺唱寂寞╮ on 2019-11-27 07:04:45

Question


I am trying to install pdfMiner to work with CollectiveAccess. My host (pair.com) has given me the following information to help in this quest:

When compiling, it will likely be necessary to instruct the
installation to use your account space above, and not try to install
into the operating system directories. Typically, using
"--home=/usr/home/username/pdfminer" at the end of the install command should allow for that.

I followed this instruction when trying to install. The result was:

running install
running build
running build_py
running build_scripts
running install_lib
running install_scripts
changing mode of /usr/home/username/pdfminer/bin/latin2ascii.py to 755
changing mode of /usr/home/username/pdfminer/bin/pdf2txt.py to 755
changing mode of /usr/home/username/pdfminer/bin/dumppdf.py to 755
running install_egg_info
Removing /usr/home/username/pdfminer/lib/python/pdfminer-20140328.egg-info
Writing /usr/home/username/pdfminer/lib/python/pdfminer-20140328.egg-info

I don't see anything wrong with that (I'm very new to Python), but when I try to run the sample command $ pdf2txt.py samples/simple1.pdf, I get this error:

Traceback (most recent call last):
  File "pdf2txt.py", line 3, in <module>
    from pdfminer.pdfdocument import PDFDocument
ImportError: No module named pdfminer.pdfdocument

I'm running Python 2.7.3, and I can't install as root (shared hosting). I'm using the most recent version of pdfminer, which is 2014/03/28. I've seen some posts on similar issues ("no module named ..."), but nothing exactly the same. The proposed solutions don't help: installing with sudo is not an option, and specifying the path for python doesn't seem to be the issue.

Or is this a question for my host? (i.e., something amiss or different about their setup)


Answer 1:


Since the package pdfminer is installed to a non-standard/non-default location, Python won't be able to find it. To use it, you need to add that location to Python's module search path (sys.path). Three ways:

  1. At run time, put this in your script pdf2txt.py:

    import sys
    # if there are no conflicting packages in the default Python libs =>
    # (--home puts the modules under <home>/lib/python, as the install log shows)
    sys.path.append("/usr/home/username/pdfminer/lib/python")
    

    or

    import sys
    # to always use your package lib before the system's =>
    # (index 0 is the script's own directory, so index 1 comes right after it)
    sys.path.insert(1, "/usr/home/username/pdfminer/lib/python")
    

    Note: The install path given with --home is used as the library root for every package you install with it, not just this one. Consider deleting that folder and re-installing with --home=/usr/home/username/myPyLibs (or any generic name). Every package installed with that option then lands in the same lib/python directory, so you only need to add that one path to be able to import them:

    import sys
    sys.path.insert(1, "/usr/home/username/myPyLibs/lib/python")
    
  2. Add it to PYTHONPATH before executing your script:

    export PYTHONPATH="${PYTHONPATH}:/usr/home/username/myPyLibs/lib/python"
    

    To make this permanent, add that line to your ~/.bashrc file (/usr/home/username/.bashrc) or .profile, as applicable. This may not work for programs that are not started from the console.

  3. Create a virtualenv and install the packages you need into it.
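
As a rough sketch of the first approach, the path handling can be wrapped in a small helper. The function names add_home_lib and find_package are made up for this illustration and are not part of pdfminer. The sketch assumes the package was installed with --home, so the modules sit under lib/python inside that prefix, as the install log in the question suggests. The lookup is deliberately simple and ignores zip/egg archives.

```python
import os
import sys


def add_home_lib(home_prefix, position=1):
    """Put <home_prefix>/lib/python on sys.path and return that directory.

    Assumption: the package was installed with
    `setup.py install --home=<home_prefix>`.
    """
    lib_dir = os.path.join(home_prefix, "lib", "python")
    if lib_dir not in sys.path:
        # Position 1 keeps the script's own directory first,
        # but still comes before the system-wide libraries.
        sys.path.insert(position, lib_dir)
    return lib_dir


def find_package(name):
    """Return the first sys.path entry holding a package or module `name`.

    Returns None if nothing matches. Only plain directories are checked.
    """
    for entry in sys.path:
        base = os.path.join(entry or os.curdir, name)
        is_package = os.path.isfile(os.path.join(base, "__init__.py"))
        if is_package or os.path.isfile(base + ".py"):
            return entry
    return None


# Hypothetical prefix taken from the answer above; adjust to your account.
add_home_lib("/usr/home/username/myPyLibs")
print(find_package("pdfminer"))
```

If the last line prints None, the directory that was added does not contain pdfminer, and the --home prefix is probably not the one used at install time.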




Answer 2:


I had an error like this:

No module named 'pdfminer.pdfinterp'; 'pdfminer' is not a package

My problem was that I had named my script pdfminer.py. The script's own directory comes first on Python's module search path, so Python took my script for the pdfminer package and tried to import from it instead of the real one.

I renamed my script to something else, deleted all the *.pyc files and the __pycache__ directory, and my problem was solved.
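
A quick way to check for this kind of name clash is to look for a file with the package's name next to the script. The helper below is only a sketch: shadowing_file is a hypothetical name, and it only checks the script's directory, not the rest of sys.path.

```python
import os
import sys


def shadowing_file(module_name, script_path):
    """Return a file next to the script that would shadow `module_name`.

    Returns None if there is no such file. Leftover .pyc files are
    checked too, because they can still be imported after a rename.
    """
    script_dir = os.path.dirname(os.path.abspath(script_path))
    for suffix in (".py", ".pyc"):
        candidate = os.path.join(script_dir, module_name + suffix)
        if os.path.isfile(candidate):
            return candidate
    return None


clash = shadowing_file("pdfminer", sys.argv[0])
if clash:
    print("Rename or remove %s: it hides the real pdfminer package" % clash)
```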



Source: https://stackoverflow.com/questions/35904738/pdfminer-importerror-no-module-named-pdfminer-pdfdocument
