Slow index speed of Elasticsearch


Upgrade ES to the latest version. Recent releases are more production-friendly, and the most stable release right now is the latest one, 2.3.

You can try the following to make indexing faster:

  1. Set up dedicated master nodes, separate from the data nodes; this reduces load across the cluster.
  2. Disable OS swapping (ES manages its own memory) and check the heap size on all your machines (see "Heap Sizing").
  3. Check that your documents are always of similar size. Use bulk indexing and tune its settings, such as the chunk size expressed as a number of records or as a size in memory.
  4. If you are using scripts, try to optimize them, since they slow indexing down. Where possible, compute the scripted value in a preprocessing step and store it, as ES is not designed to handle scripting.
  5. Check the number of shards per node and try to balance it across nodes using routing.
  6. Read what the ES developers recommend for a production-ready system: "Elasticsearch in Production".
  7. Another blog post on improving Elasticsearch indexing performance: "Performance Considerations for Elasticsearch Indexing".
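To make point 3 concrete, here is a minimal sketch of chunking documents into bulk request bodies, capped both by number of records and by size in bytes. The function name `bulk_chunks` and the default caps (500 documents, 5 MB) are assumptions for illustration, not recommended values; the body format is the newline-delimited action/source pairs that the `_bulk` endpoint expects (ES 2.x still requires `_type`).

```python
import json


def bulk_chunks(docs, index, doc_type, max_docs=500, max_bytes=5 * 1024 * 1024):
    """Yield newline-delimited bulk bodies, capped by doc count and byte size.

    max_docs and max_bytes are hypothetical starting points; tune them
    against your own documents and cluster.
    """
    lines, size, count = [], 0, 0
    for doc in docs:
        # Each document becomes two lines: an action line and a source line.
        action = json.dumps({"index": {"_index": index, "_type": doc_type}})
        source = json.dumps(doc)
        pair = action + "\n" + source + "\n"
        pair_bytes = len(pair.encode("utf-8"))
        # Flush the current chunk first if adding this pair would exceed a cap.
        if count and (count >= max_docs or size + pair_bytes > max_bytes):
            yield "".join(lines)
            lines, size, count = [], 0, 0
        lines.append(pair)
        size += pair_bytes
        count += 1
    # Emit whatever is left over as the final, possibly smaller, chunk.
    if lines:
        yield "".join(lines)
```

Each yielded string can be sent as the body of one POST to `/_bulk`. With documents that vary a lot in size, the byte cap is usually the one that keeps requests uniform.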

Also see this answer on the optimal way to set up the ELK stack on three servers: "Optimal way to set up ELK stack on three servers".
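For point 5 in the list above, one way to see how shards are spread is to count started shards per node from the text output of `GET _cat/shards?h=index,shard,prirep,state,node`. The sketch below only parses that text; the helper name `shards_per_node` and the sample rows are made up for illustration.

```python
from collections import Counter


def shards_per_node(cat_shards_text):
    """Count STARTED shards per node.

    Expects rows with the columns index, shard, prirep, state, node
    (whitespace separated), as returned by _cat/shards with that h= list.
    """
    counts = Counter()
    for line in cat_shards_text.splitlines():
        parts = line.split()
        # Unassigned shards have no node column; skip anything not STARTED.
        if len(parts) < 5 or parts[3] != "STARTED":
            continue
        # Node names may contain spaces, so rejoin the remaining columns.
        node = " ".join(parts[4:])
        counts[node] += 1
    return counts


# Hypothetical sample output, not taken from a real cluster.
sample = """\
logs 0 p STARTED node-1
logs 0 r STARTED node-2
logs 1 p STARTED node-1
logs 1 r UNASSIGNED
"""
per_node = shards_per_node(sample)
```

A node whose count is far above the others is a candidate for rebalancing.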


Author: PeiSong Xiong

Updated on September 15, 2022

Comments

  • PeiSong Xiong (about 1 year)

    We deployed ES 2.0 on three EC2 c4.4xlarge nodes (16 cores, 32 GB memory each), allocating 16 GB to ES, with a 500 GB io1 volume (4000 IOPS) attached to each.

    Problem: We expected great performance from this hardware configuration, but we observe very slow indexing.

    Our documents are about 10-50k in size, and we insert them using the Java transport client. The speed was fine for the first 50,000 documents, at roughly 1000/second, and then dropped dramatically to 100-200/second.

    Meanwhile, resource consumption stays low:

    1. CPU is about 1-20% only (16 Core CPU)
    2. IO write is about 4-10Mb/second only
    3. Memory consumption is about 20-30% only
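As a rough sanity check on these numbers, the raw ingest volume can be estimated as documents per second times document size. The sketch below assumes "10-50k" means 10-50 KB per document, and it ignores translog, replica, and merge writes, so it is a lower bound on disk traffic rather than a prediction. The helper name `ingest_mb_per_sec` is hypothetical.

```python
def ingest_mb_per_sec(docs_per_sec, doc_kb):
    """Raw document volume in MB/s for a given rate and document size in KB."""
    return docs_per_sec * doc_kb / 1024.0


# Rates and sizes from the description above (sizes assumed to be in KB).
fast_low = ingest_mb_per_sec(1000, 10)    # initial rate, smallest documents
fast_high = ingest_mb_per_sec(1000, 50)   # initial rate, largest documents
slow_low = ingest_mb_per_sec(100, 10)     # degraded rate, smallest documents
slow_high = ingest_mb_per_sec(200, 50)    # degraded rate, largest documents

print("initial:  %.1f to %.1f MB/s" % (fast_low, fast_high))
print("degraded: %.1f to %.1f MB/s" % (slow_low, slow_high))
```

Under those assumptions the degraded rate corresponds to roughly 1-10 MB/s of raw documents, which is in the same range as the observed write throughput.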

    Requirements: I cannot understand why it is so slow while all the resources are so idle. What can I do to improve the efficiency? Thanks.

    Here is the config file we are using:

    cluster.name: {{ env }}-{{ app }}
    path.data: /data/es
    path.logs: /data/es-logs
    network.host: 0.0.0.0
    discovery.zen.ping.unicast.hosts: ["xxxx"]
    bootstrap.mlockall: true
    threadpool.search.queue_size: 300
    threadpool.index.type: fixed
    threadpool.index.size: 16
    threadpool.index.queue_size: 250000
    index.refresh_interval: 1s
    index.translog.flush_threshold_ops: 50000
    indices.memory.index_buffer_size: 30%
    indices.memory.min_shard_index_buffer_size: 12mb
    indices.memory.min_index_buffer_size: 96mb
    script.inline: on
    script.indexed: on
    http.cors.enabled: true
    http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/
    

    Here are htop and iostat while running the job:

    [htop screenshot]

    [iostat screenshot]