Elasticsearch / Kibana field data too large

长情又很酷 2021-02-08 04:10

I have a small ELK cluster that is in testing. The Kibana web interface is extremely slow and throws a lot of errors.

Kafka => 8.2
Logstash => 1.5rc3 (latest)

2 answers
  •  礼貌的吻别
    2021-02-08 04:39

    You're indexing a lot of data (if you're adding/creating 5 to 20GB a day) and your nodes are quite low on memory. You won't see any problems on the indexing front, but fetching data from one or more indexes will cause problems. Keep in mind that Kibana runs queries in the background, and the message you're getting is basically saying "I can't get that data for you because running these queries requires loading more data into memory than I have available."

    There are two things that are relatively simple to do and should solve your problems:

    • Upgrade to Elasticsearch 1.5.2 (major performance improvements)
    • When you're short on memory, you really need to use doc_values in all of your mappings, as this keeps field data on disk instead of in the heap and reduces heap usage drastically

    The key lies in doc_values though. You need to modify your mapping(s) to set this property to true. Crude example:

    [...],
    "properties": {
        "age": {
          "type": "integer",
          "doc_values": true
        },
        "zipcode": {
          "type": "integer",
          "doc_values": true
        },
        "nationality": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true
        },
        [...]
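
If a mapping has many fields, adding the property by hand gets tedious. The following is a minimal sketch, not part of the original answer, that walks a "properties" dict and sets doc_values on fields like the ones in the example above. The helper name `enable_doc_values` and the `DOC_VALUES_TYPES` set are hypothetical; the set is an assumption covering only numeric and date types, and analyzed strings are skipped because they cannot use doc_values in Elasticsearch 1.x.

```python
import copy

# Assumption: numeric and date types that accept doc_values in Elasticsearch 1.x.
# Extend this set only after checking the documentation for your version.
DOC_VALUES_TYPES = {"byte", "short", "integer", "long", "float", "double", "date"}


def enable_doc_values(properties):
    """Return a copy of a mapping's "properties" dict with doc_values enabled.

    Hypothetical helper: the input dict is left untouched.
    """
    result = copy.deepcopy(properties)
    for field in result.values():
        field_type = field.get("type")
        if field_type == "string":
            # Only not_analyzed strings can use doc_values in 1.x.
            if field.get("index") == "not_analyzed":
                field["doc_values"] = True
        elif field_type in DOC_VALUES_TYPES:
            field["doc_values"] = True
        # Recurse into nested object fields.
        if isinstance(field.get("properties"), dict):
            field["properties"] = enable_doc_values(field["properties"])
    return result


example = {
    "age": {"type": "integer"},
    "nationality": {"type": "string", "index": "not_analyzed"},
    "comment": {"type": "string"},  # analyzed string, left unchanged
}
updated = enable_doc_values(example)
```

The returned dict can then be used in an index template or mapping request so that newly created indexes pick it up.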
    

    Updating your mapping(s) will make future indexes take this into account, but you'll need to reindex existing ones entirely for doc_values to apply to them. (See scan/scroll and this blog post for more tips.)
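
As a rough illustration of the reindexing step, here is a sketch that assumes the official `elasticsearch` Python client in a version matching your cluster. The function names `to_bulk_actions` and `reindex_with_scan` are hypothetical, and the target index is assumed to already exist with the new doc_values mapping. It reads every document from the old index with scan/scroll and writes it to the new one through the bulk API.

```python
def to_bulk_actions(hits, target_index):
    """Turn scan/scroll hits into bulk 'index' actions for target_index."""
    for hit in hits:
        yield {
            "_op_type": "index",
            "_index": target_index,
            "_type": hit["_type"],  # document types still exist in 1.x
            "_id": hit["_id"],
            "_source": hit["_source"],
        }


def reindex_with_scan(client, source_index, target_index):
    """Copy all documents from source_index into target_index.

    Assumes `client` is an elasticsearch.Elasticsearch instance and that
    target_index was created beforehand with the updated mapping.
    """
    # Imported here so the pure helper above works without the library.
    from elasticsearch import helpers

    hits = helpers.scan(
        client,
        index=source_index,
        query={"query": {"match_all": {}}},
    )
    return helpers.bulk(client, to_bulk_actions(hits, target_index))
```

Once the copy has been verified, the old index can be dropped, or an alias pointed at the new one, so that Kibana queries the doc_values-backed data.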

    Replicas help you scale, but they will run into the same problems if you don't reduce heap usage on each node. As for the number of shards you currently have, it may be neither necessary nor optimal, but I don't think it's the root cause of your problems.

    Keep in mind that the suggestions above are meant to let Kibana run its queries and show you data. Speed will depend greatly on the date ranges you set, on the machines you have (CPU, SSD, etc.), and on the memory available on each node.
