Slow index speed of Elasticsearch
Upgrade ES to the latest version. Recent releases are more production-friendly, and the most stable release at the time of writing is the latest one, 2.3.
You can try the following to make indexing faster:
- Set up dedicated master nodes, separate from the data nodes, as this reduces load across the whole cluster.
- Disable OS swapping (ES can lock its own memory via bootstrap.mlockall) and check the heap size on all your machines: Heap Sizing
- Check whether your documents are always of similar size. Use bulk indexing and tune its settings, such as chunk_size, which can be expressed as a number of records or as a size in memory.
- If you are using scripts, try to optimize them, because they slow indexing down. Where possible, compute the scripted value in a preprocessing step and store it, as ES is not designed for heavy scripting.
- Check the number of shards per node and try to balance it across nodes using Routing.
- Read more on how the ES team recommends running a production-ready system: Elasticsearch in Production
- One more blog post on increasing Elasticsearch indexing performance: Performance Considerations for Elasticsearch Indexing
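The bulk indexing suggestion above can be sketched as follows. This is a minimal illustration, not the client's own batching logic: `chunk_documents`, `max_docs`, and `max_bytes` are hypothetical names, and the limits are arbitrary starting points to tune. Each yielded batch would then be sent as one bulk request with whichever client you use.

```python
import json
from typing import Dict, Iterable, Iterator, List


def chunk_documents(
    docs: Iterable[Dict],
    max_docs: int = 500,
    max_bytes: int = 10 * 1024 * 1024,
) -> Iterator[List[Dict]]:
    """Group documents into batches bounded by count and serialized size.

    Hypothetical helper: the default limits are assumptions, not
    values recommended by Elasticsearch.
    """
    batch: List[Dict] = []
    batch_bytes = 0
    for doc in docs:
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch if adding this document would exceed a limit.
        if batch and (len(batch) >= max_docs or batch_bytes + doc_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        # A single document larger than max_bytes still gets its own batch.
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


if __name__ == "__main__":
    sample = [{"id": i, "body": "x" * 100} for i in range(10)]
    for batch in chunk_documents(sample, max_docs=4):
        print(len(batch), "documents in this bulk request")
```

Capping by bytes as well as by count matters when document sizes vary widely, since a fixed record count can otherwise produce very uneven request sizes.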
See also this answer on setting up the ELK stack: Optimal way to set up ELK stack on three servers
PeiSong Xiong
Updated on September 15, 2022
We deployed ES 2.0 on 3 EC2 c4.4xlarge nodes (16 cores, 32 GB memory), allocating 16 GB to ES, with a 500 GB io1 volume (4000 IOPS) attached to each.
Problem: We expected great performance from this hardware configuration, but we observe very slow indexing.
Our documents are about 10-50k in size, and we use the Java transport client to insert them. The speed was fine for the first 50,000 documents, at roughly 1000/second, and then dropped dramatically to 100-200/second.
Meanwhile, resource consumption stays low:
- CPU is about 1-20% only (16 Core CPU)
- IO write is about 4-10Mb/second only
- Memory consumption is about 20-30% only
Requirements: I cannot understand why indexing is so slow while all the resources are so free. What can I do to improve efficiency? Thanks.
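As a rough sanity check on the figures above, the reported indexing rates can be converted into raw document throughput. This sketch assumes "10-50k" means 10-50 KB per document and counts only the raw document bytes, ignoring replication, translog, and merge writes; `throughput_mb_per_s` is a hypothetical helper.

```python
def throughput_mb_per_s(docs_per_s: float, doc_kb: float) -> float:
    """Raw document bytes per second, in MB/s (assumes 1 MB = 1024 KB)."""
    return docs_per_s * doc_kb / 1024.0


if __name__ == "__main__":
    # Rates reported in the question: 100-200/s when slow, about 1000/s at first.
    for rate in (100, 200, 1000):
        low = throughput_mb_per_s(rate, 10)   # smallest assumed document size
        high = throughput_mb_per_s(rate, 50)  # largest assumed document size
        print(f"{rate} docs/s -> {low:.1f} to {high:.1f} MB/s of raw documents")
```

Under these assumptions, the slow rate of 100-200 documents per second corresponds to roughly 1-10 MB/s of raw data, which is in the same range as the observed write IO, so the disks do not appear to be the limiting factor.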
Here is the config file we are using:
cluster.name: {{ env }}-{{ app }}
path.data: /data/es
path.logs: /data/es-logs
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["xxxx"]
bootstrap.mlockall: true
threadpool.search.queue_size: 300
threadpool.index.type: fixed
threadpool.index.size: 16
threadpool.index.queue_size: 250000
index.refresh_interval: 1s
index.translog.flush_threshold_ops: 50000
indices.memory.index_buffer_size: 30%
indices.memory.min_shard_index_buffer_size: 12mb
indices.memory.min_index_buffer_size: 96mb
script.inline: on
script.indexed: on
http.cors.enabled: true
http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/