How to fix Elasticsearch conflicts on the same key when two processes write at the same time

滥情空心 2021-02-04 02:24

I have multiple processes writing data to ES at the same time, and two of them may write the same key with different values at the same moment. This caused the exception as foll

3 Answers
  • 2021-02-04 02:52

    A VersionConflictEngineException is thrown to prevent data loss. Every document in Elasticsearch has a _version number that is incremented whenever the document is changed.

    When you query a doc from ES, the response also includes the version of that doc. When you update the same doc and provide a version, ES expects the doc currently in the index to have that same version.

    If the current version is greater than the one in the update request, you get a conflict: HTTP status code 409 and a VersionConflictEngineException.

    In your current scenario,

    version conflict, current 2, provided 1

    The current version in ES is 2 whereas the version in your request is 1, which means some other thread has already modified the doc and your change is trying to overwrite it.

    In case of a VersionConflictEngineException, you should re-fetch the doc and retry the update with the latest version.

    Whether or not to use versioning / optimistic concurrency control depends on the application. If you can live with data loss, you can skip passing a version in the update request.
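
    The re-fetch-and-retry advice above can be sketched with a toy in-memory model. This is not the Elasticsearch client: `ToyIndex`, `VersionConflict`, and `update_with_retry` are hypothetical names made up for illustration, and the store only imitates the "provided version must match the current one" check. Newer Elasticsearch releases express the same check with `if_seq_no` and `if_primary_term` rather than `version`, as far as I know, but the loop has the same shape.

```python
class VersionConflict(Exception):
    """Stand-in for VersionConflictEngineException (HTTP 409)."""


class ToyIndex:
    """In-memory imitation of internal versioning; NOT the real client."""

    def __init__(self):
        self._docs = {}  # key -> (version, source)

    def get(self, key):
        # Like fetching a doc: returns its current version and body.
        # A missing doc is reported as version 0 with an empty body.
        return self._docs.get(key, (0, {}))

    def index(self, key, source, expected_version):
        current, _ = self._docs.get(key, (0, {}))
        if current != expected_version:
            raise VersionConflict(
                f"version conflict, current {current}, provided {expected_version}"
            )
        self._docs[key] = (current + 1, source)
        return current + 1


def update_with_retry(store, key, change, max_attempts=5):
    """Re-fetch the doc and re-apply `change` until the write succeeds."""
    for _ in range(max_attempts):
        version, source = store.get(key)
        try:
            # Apply the change to a copy, then write with the version we read.
            return store.index(key, change(dict(source)), expected_version=version)
        except VersionConflict:
            continue  # another writer got there first; read again and retry
    raise VersionConflict(f"gave up on {key!r} after {max_attempts} attempts")
```

    The important detail is that `change` is re-applied to the freshly fetched doc on every attempt, so the other writer's update is not silently overwritten.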

  • 2021-02-04 02:57

    You could also plan for this by using Elasticsearch's external versioning (version_type=external) and maintaining the document versions yourself.
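
    A rough sketch of what that looks like, under some assumptions: the `/{index}/_doc/{id}` path is the Elasticsearch 7+ form, and both function names are hypothetical helpers, not part of any client library. With `version_type=external`, a write is accepted only when the version you supply is strictly greater than the stored one.

```python
from urllib.parse import urlencode


def external_index_path(index, doc_id, version):
    """Build the path and query for an index request with an external version.

    Assumes the Elasticsearch 7+ URL layout: /{index}/_doc/{id}.
    """
    query = urlencode({"version": version, "version_type": "external"})
    return f"/{index}/_doc/{doc_id}?{query}"


def external_write_accepted(stored_version, provided_version):
    """Imitate the external-versioning rule locally (no cluster involved).

    stored_version is None when the document does not exist yet.
    """
    return stored_version is None or provided_version > stored_version
```

    The version could come from your own source of truth, for example a row version or a timestamp in the primary database, so that a late-arriving older write is rejected instead of overwriting a newer one.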

  • 2021-02-04 03:01

    ES provides the retry_on_conflict query parameter on update requests.

    It specifies how many times the operation should be retried when a conflict occurs. Default: 0.

    If you have several parallel scripts that can simultaneously work with the same document, you can use this parameter.

    For example: you have an index for tweets and 5 processes that work with this index. It is possible that all 5 processes will work with the same document (some tweet) at once. In this case, you can use the ...&retry_on_conflict=6 parameter. Why 6? 5 processes + 1 for some headroom. ES will then retry the update up to 6 times if conflicts occur.
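
    As a minimal sketch of where the parameter goes, here is a partial update request built with the standard library only. The host `http://localhost:9200`, the `tweets` index, the doc id, and the helper name `build_update_request` are all assumptions for illustration, and the `/{index}/_update/{id}` path is the Elasticsearch 7+ form. The request is only constructed here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(base_url, index, doc_id, partial_doc, retry_on_conflict=6):
    """Build (but do not send) a partial update with retry_on_conflict set."""
    query = urlencode({"retry_on_conflict": retry_on_conflict})
    url = f"{base_url}/{index}/_update/{doc_id}?{query}"
    # {"doc": ...} is the partial-document form of an update body.
    body = json.dumps({"doc": partial_doc}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, index, and id; adjust to your cluster.
req = build_update_request("http://localhost:9200", "tweets", "1", {"likes": 3})
```

    Passing `req` to `urllib.request.urlopen` would send it, which needs a reachable cluster. If the retries are exhausted, the conflict still comes back as a 409, so the caller should be ready to handle it.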
