Efficient way to retrieve all _ids in ElasticSearch

What is the fastest way to get all _ids of a certain index from ElasticSearch? Is it possible with a simple query? One of my indices has around 20,000 documents.

11 Answers

爱一瞬间的悲伤

2021-01-31 02:02

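For Python users: the Python Elasticsearch client provides a convenient abstraction for the scroll API, `helpers.scan`:

```
from elasticsearch import Elasticsearch, helpers

client = Elasticsearch("http://localhost:9200")
index = "your-index-name"  # placeholder: the index to read from

query = {
    "query": {
        "match_all": {}
    }
}

# scan() pages through the results with the scroll API, 100 hits per request
scan = helpers.scan(client, index=index, query=query, scroll='1m', size=100)

for doc in scan:
    print(doc["_id"])  # do something with each document
```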
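
To make clearer what such a helper wraps, here is a minimal hand-written sketch of a scroll loop. It assumes a client object exposing `search`, `scroll` and `clear_scroll` the way the Python Elasticsearch client does; the function name `iter_all_ids` and its parameters are made up for illustration, and the exact keyword arguments may differ between client versions.

```python
def iter_all_ids(client, index, page_size=1000, keep_alive="1m"):
    """Yield the _id of every document in `index` via the scroll API.

    `client` is assumed to behave like an elasticsearch.Elasticsearch
    instance (search / scroll / clear_scroll methods).
    """
    # "_source": False asks for hits without the document body.
    body = {"query": {"match_all": {}}, "_source": False}

    # The first search opens the scroll context and returns page one.
    resp = client.search(index=index, body=body,
                         scroll=keep_alive, size=page_size)
    scroll_id = resp.get("_scroll_id")
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit["_id"]
            # Each scroll call returns the next page; the scroll id
            # may change between calls, so keep the latest one.
            resp = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side scroll context when done.
        if scroll_id:
            client.clear_scroll(scroll_id=scroll_id)
```

Against a real cluster this could be used as `ids = list(iter_all_ids(client, "your-index-name"))`. In practice `helpers.scan` is the more convenient choice.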

误落风尘

2021-01-31 02:04

URL: http://localhost:9200/<index>/<type>/_search
HTTP method: GET
Query: {"query": {"match_all": {}}, "size": 30000, "fields": ["_id"]}

甜味超标

2021-01-31 02:09
For Elasticsearch 5.x, you can set "_source" to false so that each hit is returned without its document body:
```
GET /_search
{
    "_source": false,
    "query" : {
        "term" : { "user" : "kimchy" }
    }
}
```
"fields" is no longer supported in 5.x. Using it fails with the error: "The field [fields] is no longer supported, please use [stored_fields] to retrieve stored fields or _source filtering if the field is not stored".
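
With "_source" disabled, each hit still carries its metadata, so the ids can be read straight from the response. A small sketch of that step in Python follows. The helper name `ids_from_response` and the sample response are hypothetical, and the response is trimmed to the keys that matter here.

```python
def ids_from_response(response):
    """Return the _id of every hit in a search response dict."""
    # Use .get() so a response without a "hits" section yields [].
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_id"] for hit in hits]


# Hypothetical response to a search sent with "_source": false.
sample = {
    "hits": {
        "hits": [
            {"_index": "your-index-name", "_id": "1", "_score": 1.0},
            {"_index": "your-index-name", "_id": "2", "_score": 1.0},
        ]
    }
}

print(ids_from_response(sample))  # ['1', '2']
```

Keep in mind that a single search returns at most `size` hits, so for a whole index this still needs to be combined with paging such as the scroll API.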
不思量自难忘°

2021-01-31 02:10

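This works for me:

```
def select_ids(self, **kwargs):
    """
    Return the _id of every document matching the query.

    :param kwargs: params from modules; must include 'index'
    :return: list of matching document _ids, or None if no index is given
    """
    index = kwargs.get('index')
    if not index:
        return None

    query = self._build_query(**kwargs)

    # stored_fields=[] asks for no stored fields; filter_path trims the
    # response down to the _id of each hit
    results = self._db_client.search(body=query, index=index, stored_fields=[], filter_path="hits.hits._id")
    # with filter_path, an empty result may come back without the 'hits' key
    hits = results.get('hits', {}).get('hits', [])
    return [hit['_id'] for hit in hits]
```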

说谎

2021-01-31 02:11

Inspired by @Aleck-Landgraf's answer, what worked for me was calling the scan function from the standard Elasticsearch Python API directly:

```
from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan

es = Elasticsearch("http://localhost:9200")
# "_source": False replaces the older "fields": [], which 5.x rejects;
# doc_type is omitted because newer versions no longer use mapping types
for dobj in scan(es,
                 query={"query": {"match_all": {}}, "_source": False},
                 index="your-index-name"):
    print(dobj["_id"], end=" ")
```
