Planning High Throughput Elasticsearch Clusters: Part 1

Gepostet am: 09. März 2020

Building an Elasticsearch cluster for log ingestion can be as easy as assembling an Ikea Billy bookshelf and sizing as complex as engineering a 24 hours of Le Mans race car. In this article series we will put a spotlight on what has to be taken into account when planning those clusters. We will take a look at index management strategies and node resource choices, then tune the cluster, indices and shippers in order to achieve optimal data ingestion performance.

Mapping Out the Cluster

Elasticsearch does not come with a one size fits all recipe in any shape or form—use cases and data shapeu a cluster’s contours. The complexity ranges from a simple yet sufficient single node setup to a distributed multinode cluster with dedicated masters, data and ingest nodes.

The amount of data nodes is determined by data sources, JVM heap size, index lifecycling, data growth and amount of shards. In fact, we cannot properly size and optimize a cluster as long as it does not get hit by production workload. However, by closely examining our use case and data, we can estimate the cluster size, then refine and optimize it further. The process can be applied to any Elasticsearch version greater than 6.7 up to release 8.

Planning

Building an Elasticsearch cluster follows an iterative procedure:

  • Clarify the use case
  • Examine the data and its sources
  • Define an index management strategy
  • Draft the cluster based on that information
  • Benchmark with test data
  • Refine and tune the cluster with test and real data

The draft must respect certain boundaries:

  1. A shard has a capacity of 2.1 billion (2^31) documents.
  2. Elastic recommends having a memory/shard distribution of 1:20 (1GB heap manages 20 shards) in order to guarantee best performance. Although it is not a hard limit, we want to stay close to that number.
  3. It is advised to not grow a shard larger than 60GB to 80GB for two main reasons:
    1. In case of a failure, it will take longer to recover and rebalance big shards than it would with small shards.
    2. Merge operations are I/O intensive jobs. Long CPU and disk consumption cycles can lead to gimped index throughput and worse: dropped messages.

Alongside these boundaries, we define the index management and data retention strategies. Once the initial draft is made, it’s ready for a benchmark with test and real data.

Drafting the Elasticsearch Cluster

Our main concern is how data is managed in indices. The way we organize indices and their lifecycles, will affect node resources and therefore influence the cluster size. Thus, we’re primarily sizing a cluster based on our index management and housekeeping decisions, and secondarily for throughput.

Choosing a Proper Heap Size

Although it is tempting to set the JVM heap to the maximum configurable size, it is advised to follow the setup guide and avoid Elasticsearch to start with compressed oops. Generally speaking, we recommend to start low and increase the heap as needed. A Heap of Trouble discusses the implications of too big and too small JVM heap sizes.

TL;DR:

[…] If the heap is too small, applications will be prone to the danger of out of memory errors. […] If the heap is too large, the application will be prone to infrequent long latency spikes from full-heap garbage collections. […] a long pause is indistinguishable from a node that is unreachable because it is hung […].

[…] it’s better to set the heap as low as possible while satisfying your requirements for indexing and query throughput, end-user query response times, yet large enough to have adequate heap space for indexing buffers, and large consumers of heap space like aggregations, and suggesters.

Heap size needs to be monitored and adjusted during production workload. 8GB to 12GB heap size might be a reasonable starting point, unless there is a tremendous amount of data, users and complex queries expected right from the start.

Planning Index Organization

In order to estimate the resources needed, we have to examine our data and sources. Five things will help with making the first draft:

  • data retention periods
  • data structures and affiliation
  • amount of data received per given time period
  • number of clients shipping data
  • number of clients querying data

The number of clients becomes valuable, when tuning the cluster for data ingestion and calculating the client shard concurrency. We will take a closer look at concurrency, when benchmarking the cluster. For now, we do not have to take it into account. The expected data per day will give us a hint whether or not basic index management will be sufficient and if it’s necessary to further enhance index management or the whole cluster.

Shards Influence Cluster Size

The total amount of shards is the result of how indices and their lifecycles are managed. If we commit to the 1:20 ratio, a node with 8GB heap is capable of managing 160 shards. A single node can store 80 daily rotated indices with 1 shard and 1 replica configured. If we set index.number_of_shards to 2, the node has a capacity of 40 indices.

Consolidate Data by Affiliation

Logs come from many different sources/applications. They can be either structured or unstructured. In order to reduce the amount of indices, and therefore the amount of shards, we should consider storing logs with similar structure in the same index. Grouping logs by structure and affiliation instead of origin, will help with keeping the cluster small.

Having three applications with similar structure logging into separate indices with 1 shard and 1 replica configured leaves us with 180 shards after 30 days for a single environment. Three environments would generate 540 shards. That equates to at least three data nodes with 8GB heap each. The same applications shipping data into a single index would result in just 180 shards for three environments.

A common example for data affiliation is a highly available web application with a middleware, haproxy and a database. None of these components share a log structure, but the logs are affiliated, due to the fact that they are application components. Whenever we want to track logs across several different components and correlate them, it makes sense to ship the data into the same index, in order to make that data  accessible more easily in Kibana.

Index Patterns

The most basic and common index is daily rotated and environment specific. A pattern like logs-%{environment}-%{YYYY.mm.dd} is most likely to encounter. Depending on how much data a single index receives, this might be sufficient, or a waste of shards. Some applications might generate 300GB or more per day, others might only generate 5GB or less. Whenever small indices are encountered, it is worth reconsidering the daily rotation and applying a weekly rotation instead to use shards more efficiently.

When data is consolidated by affiliation and grouped by environment, how do we find a specific application log? Filebeat needs an input for each log file and add extra fields, so that a user can query on that field and find the applications log events.

In order to route the events into the correct index, we need to configure conditions on output.elasticsearch.

The host on which the application is running will be added to the event by filebeat automatically.

 

Up Next

Part 2 will focus on ILM pros and cons, shard allocation in a hot-warm-cold architecture, explain client shard concurrency and how it affects cluster size.

2020-03-09T17:31:10+00:00

Hat dir der Beitrag gefallen?