elasticsearch-py scan and scroll to return all documents

星月不相逢 2021-02-07 01:38

I am using elasticsearch-py to connect to my ES database, which contains over 3 million documents. I want to return all the documents so I can extract data and write it to a CSV file.

2 Answers
  •  借酒劲吻你
    2021-02-07 01:57

    The Python scan helper issues a GET call to the REST API and tries to send your scroll_id along in the request URL. The most likely cause is that your scroll_id is too long to be sent that way, so you see this error because no response comes back.

    Because the scroll_id grows with the number of shards you have, it is better to use a POST and send the scroll_id as JSON in the request body. This gets around the length limit on the request URL.
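A minimal sketch of that approach, assuming a cluster at `http://localhost:9200` and an index named `my-index` (both placeholders). It bypasses elasticsearch-py and uses only the standard library, so the POST to `/_search/scroll` with the scroll_id in the JSON body is visible. The helper names `build_scroll_request`, `post_json`, and `iter_hits` are hypothetical, not part of any library.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: adjust to your cluster


def build_scroll_request(scroll_id, keep_alive="5m", base_url=ES_URL):
    """Build a POST to /_search/scroll with the scroll_id in the JSON body."""
    payload = json.dumps({"scroll": keep_alive, "scroll_id": scroll_id})
    return urllib.request.Request(
        url=f"{base_url}/_search/scroll",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(request):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_hits(index, page_size=1000, keep_alive="5m",
              base_url=ES_URL, send=post_json):
    """Yield every hit in `index`, one scroll page at a time."""
    query = json.dumps({"size": page_size, "query": {"match_all": {}}})
    first = urllib.request.Request(
        url=f"{base_url}/{index}/_search?scroll={keep_alive}",
        data=query.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    page = send(first)
    # An empty page means the scroll is exhausted.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # The scroll_id can change between pages, so always use the latest.
        page = send(build_scroll_request(page["_scroll_id"], keep_alive, base_url))
```

Each hit yielded by `iter_hits("my-index")` is a dict whose `_source` key holds the document, so rows can be written with `csv.writer` as they arrive instead of holding 3 million documents in memory.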
