Kubernetes Logging with Fluentd and the Elastic Stack

Kubernetes and Docker are great tools to manage your microservices, but operators and developers need tools to debug those microservices if things go south. Log messages and application metrics are the usual tools in this cases. To centralize the access to log events, the Elastic Stack with Elasticsearch and Kibana is a well-known toolset. In this blog post I want to show you how to integrate the logging of Kubernetes with the Elastic Stack. To start off, I will give an introduction to the log mechanism of Kubernetes, then I’ll show you how to collect the resulting log events and ship them into the Elastic Stack. I also provide a GitHub repository with a working demo. Finally, I highlight some considerations for the production deployment. Weiterlesen

Getting started with Kibana [Links]

You have huge data sets to analyze? You want to gain insights into your gigabytes of logs? The Elastic Stack (Elasticsearch, Logstash, Beats, Kibana) offers you a great set of tools for that. After you got your logs or other data into Elasticsearch, Kibana will offer you a great UI to deep dive into your data. But how to get started with Kibana? Weiterlesen

Drastic Elastic [Part 4]: Aggregations & Plugins

In an earlier post in this mini-series I mentioned that the aggregated data we persist in ELasticSearch has discrete retention times:

  • 5 minute aggregation => (retention time of) one day
  • hourly aggregations => 7 days
  • daily aggregations => 5 years

This means that we reach well over 50% of our total data retention after one week (the only additions after that point are daily aggregations while data at other aggregation levels gets refreshed/updated) – after 4 or 5 weeks we had something like 8 billion documents in ElasticSearch amounting to 13 TB of data.

In this last article of our four part series we describe how ElasticSearch plugins help us to address appropriate aggregation levels without having to build in extra round trips (adding to latency), or to fetch more data than we need (which would require filtering in the client). Weiterlesen

Drastic Elastic [Part 3]: Cluster Setup

ElasticSearch does not offer support for clusters spanning data centres. However, on our project we had access to a network latency of 400 *micro*seconds (0.4 ms) between three separate locations in the same city, and decided to test a cluster spanning all three data centres. Network latency did not prove to be a problem, but a more tricky issue was deciding how to set up the cluster to best guard against network partitioning. Weiterlesen

Drastic Elastic [Part 1]: ElasticSearch as a Database

In an article for Java Magazin way back in 2012 (only a small section of it seems to have survived online(!), although it is still available from the inovex website as a download) I toyed with the idea of using a search engine as a database (not such an unconventional idea, it turned out, since Elastic from time to time decribes its search engine as being a database too), mainly due to cost and usability considerations. The idea gained traction with the release of the aggregation framework in early 2015 and a few months later I was involved with a project where we decided to leverage elasticsearch aggregations for the analysis of internet statistics. In this article I want to share my experience. Weiterlesen