ElasticSearch: Unassigned Shards, how to fix?

悲&欢浪女 · 2020-12-04 05:03

I have an ES cluster with 4 nodes:

number_of_replicas: 1
search01 - master: false, data: false
search02 - master: true, data: true
search03 - master: false,          


        
24 Answers
  • 2020-12-04 05:22

    The only thing that worked for me was changing the number_of_replicas (I had 2 replicas, so I changed it to 1 and then back to 2).

    First:

    PUT /myindex/_settings
    {
        "index" : {
            "number_of_replicas" : 1
         }
    }
    

    Then:

    PUT /myindex/_settings
    {
        "index" : {
            "number_of_replicas" : 2
         }
    }
    

    (I already answered this in this question.)

  • 2020-12-04 05:22

    In my case, the hard disk space limit (the disk watermark) had been reached.

    Look at this article: https://www.elastic.co/guide/en/elasticsearch/reference/current/disk-allocator.html

    Basically, I ran:

    PUT /_cluster/settings
    {
      "transient": {
        "cluster.routing.allocation.disk.watermark.low": "90%",
        "cluster.routing.allocation.disk.watermark.high": "95%",
        "cluster.info.update.interval": "1m"
      }
    }
    

    With these settings, Elasticsearch stops allocating shards to a node once more than 90% of its disk is in use, starts moving shards off to other machines in the cluster once usage exceeds 95%, and re-checks disk usage every minute.
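
    To see whether a node is actually bumping into these limits, the cat allocation API shows per-node disk usage next to the shard counts (a quick check, assuming the default localhost:9200):

    curl -XGET 'localhost:9200/_cat/allocation?v'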

  • 2020-12-04 05:23

    I ran into exactly the same issue. This can be prevented by temporarily disabling shard allocation before restarting Elasticsearch, but doing so does not fix unassigned shards that are already there.
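
    For reference, a minimal sketch of that disable/re-enable sequence using transient cluster settings (assuming ES 1.0+ and the default localhost:9200):

    # before restarting the node: pause shard allocation
    curl -XPUT 'localhost:9200/_cluster/settings' -d '{
        "transient": { "cluster.routing.allocation.enable": "none" }
    }'

    # after the node has rejoined the cluster: re-enable allocation
    curl -XPUT 'localhost:9200/_cluster/settings' -d '{
        "transient": { "cluster.routing.allocation.enable": "all" }
    }'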

    In my case it was caused by a lack of free disk space on the data node. The unassigned shards were still on the data node after the restart, but they were not recognized by the master.

    Just freeing up disk space on one of the nodes got the replication process started for me. This is a rather slow process because all the data has to be copied from one data node to the other.

  • 2020-12-04 05:25

    This might help someone: I had this issue when trying to run ES in embedded mode. The fix was to make sure the node had local(true) set.

  • 2020-12-04 05:28

    By default, Elasticsearch re-assigns shards to nodes dynamically. However, if you've disabled shard allocation (perhaps you did a rolling restart and forgot to re-enable it), you'll need to re-enable it:

    # v0.90.x and earlier
    curl -XPUT 'localhost:9200/_settings' -d '{
        "index.routing.allocation.disable_allocation": false
    }'
    
    # v1.0+
    curl -XPUT 'localhost:9200/_cluster/settings' -d '{
        "transient" : {
            "cluster.routing.allocation.enable" : "all"
        }
    }'
    

    Elasticsearch will then reassign shards as normal. This can be slow; consider raising indices.recovery.max_bytes_per_sec and cluster.routing.allocation.node_concurrent_recoveries to speed it up.
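
    For example, something along these lines raises both via transient cluster settings (the values are only illustrative; tune them to your hardware):

    curl -XPUT 'localhost:9200/_cluster/settings' -d '{
        "transient": {
            "indices.recovery.max_bytes_per_sec": "100mb",
            "cluster.routing.allocation.node_concurrent_recoveries": 4
        }
    }'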

    If you're still seeing issues, something else is probably wrong, so look in your Elasticsearch logs for errors. If you see EsRejectedExecutionException, your thread pools may be too small.
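
    As a quick check, the cat thread_pool API lists active, queued and rejected counts per node, which makes it easy to spot which pool is doing the rejecting (again assuming localhost:9200):

    curl -XGET 'localhost:9200/_cat/thread_pool?v'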

    Finally, you can explicitly reassign a shard to a node with the reroute API.

    # Suppose shard 4 of index "my-index" is unassigned, so you want to
    # assign it to node search03:
    curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
        "commands": [{
            "allocate": {
                "index": "my-index",
                "shard": 4,
                "node": "search03",
                "allow_primary": 1
            }
        }]
    }'
    
  • 2020-12-04 05:28

    OK, I've solved this with some help from ES support. Issue the following command to the API on all nodes (or the nodes you believe to be the cause of the problem):

    curl -XPUT 'localhost:9200/<index>/_settings' \
        -d '{"index.routing.allocation.disable_allocation": false}'
    

    where <index> is the index you believe to be the culprit. If you have no idea, just apply it to all indices:

    curl -XPUT 'localhost:9200/_settings' \
        -d '{"index.routing.allocation.disable_allocation": false}'
    

    I also added the same setting to my YAML config and, since then, any restarts of the server/service have been problem free. The shards re-allocated immediately.
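
    An easy way to confirm that the shards are coming back is to watch the unassigned_shards count drop in the cluster health output (assuming the default localhost:9200):

    curl -XGET 'localhost:9200/_cluster/health?pretty'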

    FWIW, to answer an oft-asked question: set MAX_HEAP_SIZE to 30G unless your machine has less than 60G of RAM, in which case set it to half the available memory.

    References

    • Shard Allocation Awareness