Installing nltk data dependencies in setup.py script

荒凉一梦 提交于 2019-11-30 07:25:46

问题


I use NLTK with wordnet in my project. I did the installation manually on my PC, with pip: pip3 install nltk --user in a terminal, then nltk.download() in a python shell to download wordnet.

I want to automatize these with a setup.py file, but I don't know a good way to install wordnet.

For the moment, I have this piece of code after the call to setup ("nltk" is in the install_requires list of the call to setup):

import sys
if 'install' in sys.argv:
    import nltk
    nltk.download("wordnet")

Is there a better way to do this?


回答1:


I managed to install the NLTK data in setup.py by overriding cmdclass with my own Install class :

from setuptools import setup, find_packages
from setuptools.command.install import install as _install


class Install(_install):
    def run(self):
        _install.do_egg_install(self)
        import nltk
        nltk.download("popular")

setup(...
    cmdclass={'install': Install},
    ...
    install_requires=[
      'nltk',
      ],
    setup_requires=['nltk']
    ...
   )

It is important to use the method do_egg_install() in your run() method to make sure nltk gets installed, before import nltk is called (See also here python setuptools install_requires is ignored when overriding cmdclass). Also don't forget to add nltk to setup_requires.




回答2:


You can also automate installation with a shell script, for example, running (after pip installing nltk):

python -m nltk.downloader -d /usr/share/nltk_data wordnet


来源:https://stackoverflow.com/questions/26799894/installing-nltk-data-dependencies-in-setup-py-script

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!