How does searching with pip work?

Submitted by 旧街凉风 on 2021-02-08 15:11:12

Question


Yes, I'm dead serious with this question. How does searching with pip work?

The documentation of the search keyword refers to a "pip search reference" at https://pip.pypa.io/en/stable/user_guide/#searching-for-packages which is anything but a reference.

I can't conclude from search attempts how searching works. E.g. if I search for "exec" I get a variety of results such as exec-pypeline (0.4.2) - an incredible python package. I even get results with package names that have nothing to do with "exec" as long as the term "exec" is in the description.

But strangely I don't see one of my own packages in the list, even though its name contains "exec". That alone would lead us to the conclusion that pip (at least) searches for complete search terms in the package description (which my package doesn't have).

But building on that assumption, if I search for other terms that appear in the package description, I don't get my package listed either. The same applies to other packages: e.g. if I search for "projects" I don't get flask-macros in the result set, though the term "projects" clearly exists in the description of flask-macros. Since this contradicts the assumption above, this is clearly not how searching works.

And interestingly, I can search for "macro" and get "flask-macros" as a result, but if I search for "macr", "flask-macros" is not found.

So how exactly is searching performed by pip? Where can a suitable reference be found for this?


Answer 1:


pip search looks for a substring contained in the distribution name or the distribution summary. I cannot see this documented anywhere, and found it by following the command through the source code directly.

The code for the search feature, which dates from Feb 2010, is still using an old xmlrpc_client. There is issue395 to change this, open since 2011, since the XML-RPC API is now considered legacy and should not be used. Somewhat surprisingly, the endpoint was not deprecated in the pypi-legacy to warehouse move, as the legacy routes are still there.

flask-macros did not show up in a search for "project" because this is too common a search term. Only 100 results are returned; this is a hardcoded limit in the Elasticsearch view that handles the requests to those PyPI search routes. Note that this was reduced from 1000 fairly recently in PR3827.
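To illustrate why a result cap can hide a package that does match, here is a minimal local sketch. It is not pip's or PyPI's actual code: the `search` function, the sample index, and the flask-macros summary text are all made up for illustration, and the real service ranks hits with Elasticsearch rather than taking them in index order.

```python
# Hypothetical local model of the behaviour described above.
# Nothing here is taken from pip or the PyPI server code.
RESULT_LIMIT = 100  # the hardcoded cap mentioned in the answer


def search(packages, query, limit=RESULT_LIMIT):
    """Return at most `limit` packages whose name or summary contains `query`."""
    needle = query.lower()
    matches = [
        pkg for pkg in packages
        if needle in pkg["name"].lower() or needle in pkg["summary"].lower()
    ]
    # Anything beyond the cap is silently dropped.
    return matches[:limit]


# Made-up index: 150 packages mention "project"; the one we care about comes last.
index = [{"name": f"pkg-{i}", "summary": "a project helper"} for i in range(150)]
index.append({"name": "flask-macros", "summary": "Macros for Flask projects"})

common = search(index, "project")  # common term: more matches than the cap
rare = search(index, "macro")      # rare term: everything fits under the cap

print(len(common), any(p["name"] == "flask-macros" for p in common))
print([p["name"] for p in rare])
```

With a common term the list is truncated before flask-macros is reached, while a rarer term returns it, which is consistent with the behaviour reported in the question.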

Code to do a search with an API client directly:

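import xmlrpc.client

# Legacy XML-RPC endpoint on PyPI
client = xmlrpc.client.ServerProxy('https://pypi.org/pypi')
query = 'project'
# Match the query against the name or the summary ('or' operator)
results = client.search({'name': query, 'summary': query}, 'or')
print(len(results), 'results returned')
# Print the hits sorted case-insensitively by package name
for result in sorted(results, key=lambda data: data['name'].lower()):
    print(result)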

edit: The 100 result limit is now documented here.



Source: https://stackoverflow.com/questions/51269053/how-does-searching-with-pip-work
