In this post, we will talk about how to make Elasticsearch (1.7.x) more stable and performant.
Before we start, here is the difference between the test results before and after tuning:
| | Before Tuning | After Tuning |
|---|---|---|
| Total time | 10.94 s | 4.73 s |
| Average | 1.92 s | 0.76 s |
| Fastest | 0.17 s | 0.09 s |
| Slowest | 4.95 s | 2.74 s |
Now we can start by tuning the OS-level settings that are mentioned in the ES documentation.
First things first, let’s get the OS (Ubuntu 14.04) ready. Elasticsearch requires only Java (1.7 or later). Newer ES versions may require a higher version of Java.
ES recommends setting vm.swappiness to 1. According to Red Hat, a low swappiness value is also recommended for database workloads; for Oracle databases, for example, the Red Hat recommended swappiness value is 10. For further reading, see Tuning Virtual Memory.
Why do we set this value to 1 instead of 0?
– Setting swappiness to 0 more aggressively avoids swapping out, which increases the risk of OOM killing under strong memory and I/O pressure.
net.core.somaxconn: the maximum backlog of pending connections that an application can request for a listening socket.
vm.max_map_count: this property restricts the number of VMAs (Virtual Memory Areas) that a single process can own. When a process reaches the limit, an out-of-memory error is thrown.
fs.file-max: sets the maximum number of file handles that the Linux kernel will allocate.
session required pam_limits.so
The pam_limits PAM module sets limits on the system resources that can be obtained in a user-session.
bootstrap.mlockall: tries to lock the process address space into RAM, preventing any Elasticsearch memory from being swapped out. It lets the JVM lock its memory so that the OS cannot swap it out, which is a performance optimization.
indices.fielddata.cache.size: the field data cache is unbounded by default. This, of course, could make your JVM heap explode. To avoid nasty surprises, we limit it to 30% (this affects search performance).
indices.cache.filter.size: even though filters are relatively small, they can take up large portions of the JVM heap if you have a lot of data and numerous different filters. So we limit this to 30% as well.
ES_HEAP_SIZE: the default installation of Elasticsearch is configured with a 1 GB heap. From everything we have read, this value should be half of the total RAM, and it should not exceed 30.5 GB!
http.compression: enables support for compression when possible (with Accept-Encoding). Defaults to false. We use it with gzip, and it made a huge impact on performance.
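discovery.zen.ping.unicast.hosts: ["host1", "host2:port"]
This is a very important setting: listing the hosts explicitly (unicast) instead of relying on multicast discovery keeps nodes from unexpectedly joining the wrong cluster. Newer versions use unicast by default.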
discovery.zen.minimum_master_nodes: this value is set according to the calculation (number of master-eligible nodes / 2) + 1. With three master-eligible nodes, that gives 2.
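The quorum formula above can be sketched in a few lines of shell. This is only an illustration of the arithmetic (integer division is assumed), and min_master_nodes is a hypothetical helper name, not something that ships with Elasticsearch.

```shell
# Quorum of master-eligible nodes: (n / 2) + 1, using integer division.
min_master_nodes() {
  echo $(( $1 / 2 + 1 ))
}

# Print the value for a few cluster sizes.
for n in 1 2 3 5; do
  echo "$n master-eligible node(s): minimum_master_nodes = $(min_master_nodes "$n")"
done
```

Note that a two-node cluster needs both nodes to elect a master, which is one reason an odd number of master-eligible nodes is the usual choice.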
action.destructive_requires_name: this setting prevents deleting indices with the wildcard *. The full index name is required.
action.auto_create_index: you can prevent the automatic creation of indices by adding this setting to the config/elasticsearch.yml file on each node.
- discovery.zen.ping.timeout: 10s
- discovery.zen.fd.ping_retries: 3
- discovery.zen.fd.ping_interval: 3s
- discovery.zen.fd.ping_timeout: 30s
These values make fault detection more tolerant of transient errors and prevent undesired connection losses between nodes.
Open sysctl.conf and add these properties (comments are kept on their own lines, since sysctl.conf only treats lines that start with # or ; as comments):
# keep swapping to a minimum
vm.swappiness=1
# up the number of connections per port
net.core.somaxconn=65535
# (default) http://www.redhat.com/magazine/001nov04/features/vm
vm.max_map_count=262144
# http://www.tldp.org/LDP/solrhe/Securing-Optimizing-Linux-RH-Edition-v1.3/chap6sec72.html
fs.file-max=518144
After that, open limits.conf.
The important thing is which user the limits below are defined for: they must apply to the user that runs ES. Using a dedicated user is recommended for such big applications (we did the same for Redis). The elasticsearch user below is the default one created when you install ES.
elasticsearch soft nofile 65535
elasticsearch hard nofile 65535
elasticsearch - memlock unlimited
Before making this change, we were getting the following error:
Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (limit).
Thanks to mrzard, we got rid of this problem by setting memlock to unlimited.
As a side note, a soft limit can be temporarily exceeded by the user, but the system will not allow a user to exceed a hard limit. We want to be strict here, so we set both to the same value.
To make these properties persistent, you also have to modify the PAM session files:
vi /etc/pam.d/common-session-noninteractive
vi /etc/pam.d/common-session
Add this property:
session required pam_limits.so
You may need to reboot the machine for these changes to take effect.
Now everything is ready for Elasticsearch to be installed. You can use this bash script. Here you can get the script.
After that, execute:
You can choose whichever you want. We strongly recommend installing Elasticsearch this way. If you download the tar.gz instead, you have to create your own init scripts and also create the elasticsearch user, which is very important for making configuration easier. Anyway, let’s assume you installed it with the script. Now you have the elasticsearch.yml and logging.yml files under:
In this part, let’s open elasticsearch.yml. We only show the settings that we changed; all other settings are left at their defaults.
bootstrap.mlockall: true
action.auto_create_index: false
action.destructive_requires_name: true
indices.fielddata.cache.size: 30%
indices.cache.filter.size: 30%
http.compression: true
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["ip-of-machine-1", "ip-of-machine-2", "ip-of-machine-3"]
discovery.zen.ping.timeout: 10s
discovery.zen.fd.ping_retries: 3
discovery.zen.fd.ping_interval: 3s
discovery.zen.fd.ping_timeout: 30s
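A quick way to confirm that none of these settings got lost, or was left commented out, is to scan the file for them. The sketch below is only an illustration: missing_settings is a hypothetical helper name, and it assumes each setting is written as a flat "key: value" line at the start of a line, as in the listing above.

```shell
# Print every expected key that is not set (uncommented) in the given file.
# Usage: missing_settings FILE KEY [KEY...]
missing_settings() {
  config_file="$1"
  shift
  for key in "$@"; do
    # Escape the dots so they match literally, then look for "key:" at line start.
    pattern="^$(printf '%s' "$key" | sed 's/\./\\./g'):"
    grep -Eq "$pattern" "$config_file" || echo "$key"
  done
}
```

For example, `missing_settings path/to/elasticsearch.yml bootstrap.mlockall http.compression` prints nothing when both keys are set; the path here is a placeholder for wherever your elasticsearch.yml lives.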
After that, let’s go to the Elasticsearch start script.
One of the most important things in ES is the heap size. From everything we have read, the heap size should generally be half of the total RAM and should not be more than 30.5 GB.
# Heap size defaults to 256m min, 1g max
# Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
ES_HEAP_SIZE=4g
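The rule of thumb above (half of the RAM, capped at 30.5 GB) can be worked out with a small helper. This is a sketch under our own assumptions: suggest_heap_mb and total_ram_mb are hypothetical names, the cap is expressed as 31232 MB (30.5 × 1024), and total RAM is read from MemTotal in /proc/meminfo.

```shell
# Suggest a heap size in MB: half of total RAM, capped at 30.5 GB.
# Usage: suggest_heap_mb TOTAL_RAM_MB
suggest_heap_mb() {
  cap_mb=31232                 # 30.5 GB expressed in MB
  half_mb=$(( $1 / 2 ))
  if [ "$half_mb" -gt "$cap_mb" ]; then
    echo "$cap_mb"
  else
    echo "$half_mb"
  fi
}

# Total RAM in MB as reported by the kernel (MemTotal is given in kB).
total_ram_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}
```

Something like `ES_HEAP_SIZE="$(suggest_heap_mb "$(total_ram_mb)")m"` would then yield a value in megabytes; for a machine with 8192 MB of RAM the helper returns 4096.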
Finally, you can check the properties as seen by our ES user:
su elasticsearch --shell /bin/bash --command "cat /proc/sys/vm/swappiness"
su elasticsearch --shell /bin/bash --command "cat /proc/sys/net/core/somaxconn"
su elasticsearch --shell /bin/bash --command "cat /proc/sys/vm/max_map_count"
su elasticsearch --shell /bin/bash --command "cat /proc/sys/fs/file-max"
su elasticsearch --shell /bin/bash --command "ulimit -n"
su elasticsearch --shell /bin/bash --command "ulimit -Sn"
su elasticsearch --shell /bin/bash --command "ulimit -Hn"
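If you would rather not compare the numbers by eye, the kernel values can be checked against the ones we put in sysctl.conf. The following is a minimal sketch, not part of our original setup; sysctl_path, check_value and check_all are hypothetical helper names, and the expected values are simply the ones used earlier in this post.

```shell
# Map a sysctl key such as vm.swappiness to its file under /proc/sys.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print OK or WARN for one setting.
# Usage: check_value NAME EXPECTED ACTUAL
check_value() {
  if [ "$3" = "$2" ]; then
    echo "OK $1=$3"
  else
    echo "WARN $1=$3 (expected $2)"
    return 1
  fi
}

# Compare the live kernel values with the ones from sysctl.conf.
check_all() {
  for entry in vm.swappiness=1 net.core.somaxconn=65535 \
               vm.max_map_count=262144 fs.file-max=518144; do
    key="${entry%%=*}"
    expected="${entry#*=}"
    path="$(sysctl_path "$key")"
    if [ -r "$path" ]; then
      check_value "$key" "$expected" "$(cat "$path")"
    else
      echo "SKIP $key ($path is not readable)"
    fi
  done
}
```

Calling `check_all` after a reboot prints one line per setting, so a value that did not stick shows up as a WARN line.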
You can reboot the machines and check your cluster status from Sense,
or check every node from the console.
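One way to check every node from the console is to ask each of them for the cluster health over HTTP. This is a sketch under assumptions: it uses the _cluster/health endpoint on the default HTTP port 9200, the host names are the placeholders from the config above, and health_url and check_nodes are hypothetical helper names.

```shell
# Build the cluster health URL for one node (the port defaults to 9200).
# Usage: health_url HOST [PORT]
health_url() {
  printf 'http://%s:%s/_cluster/health?pretty\n' "$1" "${2:-9200}"
}

# Query each node in turn and report the ones that do not answer.
# Usage: check_nodes HOST [HOST...]
check_nodes() {
  for host in "$@"; do
    echo "== $host"
    curl -s --max-time 5 "$(health_url "$host")" || echo "no response from $host"
  done
}
```

Running `check_nodes ip-of-machine-1 ip-of-machine-2 ip-of-machine-3` should return the same cluster name and node count from every node if they all joined the same cluster.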
If your nodes don’t start on boot, your init scripts were probably not installed properly. Use this command and reboot.
sudo update-rc.d elasticsearch defaults 95 10
If you get this exception:
org.elasticsearch.transport.RemoteTransportException
check which version of Java is installed on each of your nodes; all nodes should run the same version.