spaCy and spaCy models in setup.py

六眼飞鱼酱① submitted on 2020-07-17 07:50:10

Question


In my project I have spaCy as a dependency in my setup.py, but I also want to add a default model.

My attempt so far has been:

install_requires=['spacy', 'en_core_web_sm'],
dependency_links=['https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm'],

inside my setup.py, but both a regular pip install of my package and a pip install --process-dependency-links return:

pip._internal.exceptions.DistributionNotFound: No matching distribution found for en_core_web_sm (from mypackage==0.1)

I found this GitHub issue from AllenAI with the same problem and no solution.

Note that if I pip install the URL of the model directly, it works fine, but I want it installed as a dependency when my package is installed with pip install.


Answer 1:


You can use pip's recent support for PEP 508 URL requirements:

install_requires=[
    'spacy',
    'en_core_web_sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz',
],

Note that this requires you to build your project with up-to-date versions of setuptools and wheel (at least v0.32.0 for wheel; not sure about setuptools), and your users will only be able to install your project if they're using at least version 18.1 of pip.

More importantly, though, this is not a viable solution if you intend to distribute your package on PyPI; quoting pip's release notes:

As a security measure, pip will raise an exception when installing packages from PyPI if those packages depend on packages not also hosted on PyPI. In the future, PyPI will block uploading packages with such external URL dependencies directly.
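
For illustration, the requirement string above can be assembled from a model name and version. This is only a sketch: it assumes the GitHub release URL layout shown in this answer also holds for other models and versions, and `model_requirement` is a hypothetical helper, not part of pip, setuptools, or spaCy.

```python
# Hypothetical helper: builds a PEP 508 URL requirement for a spaCy model,
# assuming the GitHub release URL layout used in the answer above.
RELEASES = "https://github.com/explosion/spacy-models/releases/download"


def model_requirement(name, version):
    """Return a 'name @ url' requirement string for the given model."""
    archive = f"{name}-{version}"
    return f"{name} @ {RELEASES}/{archive}/{archive}.tar.gz"


# This list can be passed to setup(install_requires=...).
install_requires = [
    "spacy",
    model_requirement("en_core_web_sm", "2.0.0"),
]
```

Keeping the name and version in one place makes it harder for the requirement name and the archive URL to drift apart when the model is upgraded.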




Answer 2:


Here is my workaround for a PyPI-installable package (edited slightly for clarity):

from sys import stderr

import spacy

try:
    nlp = spacy.load('en')
except OSError:
    # The model is not installed yet: download it once, then load it.
    print('Downloading language model for the spaCy POS tagger\n'
          "(don't worry, this will only happen once)", file=stderr)
    from spacy.cli import download
    download('en')
    nlp = spacy.load('en')

It's cumbersome, but at least it works without having to involve the user. I'm trying to convince the spaCy team to package the most important model files for PyPI.
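
The same fallback can be wrapped in a reusable function. This is a minimal sketch, not spaCy's own API: `load_model` and its `load`/`download` parameters are hypothetical names, added only so the logic can be exercised without spaCy installed. Note also that newer spaCy releases dropped shortcut names such as 'en', so a full package name like 'en_core_web_sm' is the safer argument.

```python
import sys


def load_model(name, load=None, download=None):
    """Load a spaCy model, downloading it first if it is missing.

    `load` and `download` default to spacy.load and spacy.cli.download;
    they are parameters only so the fallback can be tried with stand-ins.
    """
    if load is None or download is None:
        # Import lazily so the module can be imported without spaCy.
        import spacy
        from spacy.cli import download as spacy_download
        load = load or spacy.load
        download = download or spacy_download
    try:
        return load(name)
    except OSError:
        # spacy.load raises OSError when the model cannot be found.
        print(f"Downloading spaCy model {name!r} (first run only)",
              file=sys.stderr)
        download(name)
        return load(name)
```

With spaCy installed, a call such as `nlp = load_model('en_core_web_sm')` would replace the try/except block above.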




Answer 3:


Not sure if this works for you, but in setup.py you might try:

os.system('python -m spacy download en')

after calling setuptools.setup(...)
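
A slightly more defensive variant of the same idea, as a sketch only: a bare `python` may resolve to a different interpreter than the one running setup.py, so this version uses `sys.executable`, and it raises an error if the download fails instead of silently returning an exit status. `download_command` and `download_model` are hypothetical names.

```python
import subprocess
import sys


def download_command(model):
    """Build the command that downloads `model` with the current interpreter."""
    return [sys.executable, "-m", "spacy", "download", model]


def download_model(model):
    # check_call raises CalledProcessError if the command exits non-zero,
    # whereas os.system only returns the exit status.
    subprocess.check_call(download_command(model))
```

Calling `download_model('en')` after `setuptools.setup(...)` would mirror the one-liner above. Keep in mind that code in setup.py only runs when setup.py itself is executed, so it is skipped when the package is installed from a prebuilt wheel.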

edit:

According to the spaCy docs, it looks like you can now add spaCy models to your requirements.txt via URL as well. You should then be able to import the model as a module wherever it is required:

import en_core_web_sm
nlp = en_core_web_sm.load()

Ref: https://spacy.io/usage/models



Source: https://stackoverflow.com/questions/53383352/spacy-and-spacy-models-in-setup-py
