Testing Your Kubernetes Infrastructure

Automated tests in software development are widely regarded as an integral quality measure. In our experience, the same cannot be said for testing infrastructure. We believe that there is too little information on how to utilize and extend existing tools to test Kubernetes clusters, so we’ll share our experiences in this article.

Introduction: Why Test?

There are many good reasons to utilize automated cluster tests. Kubernetes is a very complex system composed of multiple (more or less) independent components. Their successful interplay in one configuration does not necessarily imply that they will work flawlessly in every Kubernetes setup.

There are various ways of bootstrapping a Kubernetes cluster, be it with an installer such as kubeadm or kops or via a manual setup, and each of them comes with many configuration options, so no two setups look alike. This leaves plenty of room for our individual cluster to diverge from the setups tested by Kubernetes itself. Testing our clusters before users work on them ensures that our setup works as intended and in turn gives the users a reliable platform. Cluster setup is only the first step, though, as Kubernetes updates are released frequently. Tests can aid here too, by verifying that basic cluster functionality is still available after an update.

Kubernetes is more than a platform, though: it is also a framework meant to be extended with plugins and add-ons. As such, no two Kubernetes setups necessarily work the same way. Kubernetes’ own tests don’t cover plugins, so test coverage relies solely on the plugin developers’ tests. To ensure that our plugins do not interfere with each other or with Kubernetes itself, we should also test them in our setups. This is especially true for plugins you’ve written yourself.

Conformance Tests

Kubernetes conformance tests are the subset of Kubernetes end-to-end (e2e) test cases that verify core Kubernetes features. More specifically, they “currently test only [Generally Available], non-optional features of APIs”, as stated by the responsible developer group. A cluster that passes these tests is called conformant and can be certified as such by the CNCF k8s Conformance Working Group.

Tested features currently include the ability to create API objects, launch containers on nodes and mount basic volumes, as well as the behaviour of kubectl. Not included are optional features such as Role-Based Access Control, NetworkPolicy and PodSecurityPolicy. Plugins and add-ons are also mostly exempt from conformance tests: DNS is tested, for example, but networking via plugins such as Weave or Calico is only implicitly required by some tests. In the future plugins might also be covered by conformance test profiles, but for now they need to be tested separately.

Because they verify basic cluster functionality, conformance tests are nonetheless an ideal starting point for testing our clusters. To run them, we can use a tool such as kubetest or Sonobuoy.

kubetest

Kubetest is the CLI tool used in Kubernetes pipelines for running the e2e tests. As conformance tests are a subset of the e2e tests, you can use kubetest to run them on your cluster by filtering the tests to be run:
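
A minimal sketch, assuming an existing cluster reachable via your kubeconfig (the version passed to --extract is an example and should match your cluster):

```bash
# --provider=skeleton targets a pre-existing cluster via your kubeconfig;
# --test_args passes the [Conformance] filter through to the e2e suite
kubetest --provider=skeleton --extract=v1.15.3 \
  --test --test_args="--ginkgo.focus=\[Conformance\]"
```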

This runs all tests tagged with [Conformance] after downloading and extracting the required Kubernetes binaries for our cluster version. The test runtimes can also be shortened by running tests that support it in parallel:
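
```bash
# --ginkgo-parallel runs the tests concurrently; tests tagged [Serial]
# must not run in parallel with others and are therefore skipped here
kubetest --provider=skeleton --test --ginkgo-parallel \
  --test_args="--ginkgo.focus=\[Conformance\] --ginkgo.skip=\[Serial\]"
```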

You can also run the kubetest extraction only once and then execute the tests from the extracted Kubernetes directory to speed up subsequent runs. For debugging purposes, you can tell kubetest not to delete the namespaces of failed tests:
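
```bash
# extract the test binaries only once (creates a ./kubernetes directory) ...
kubetest --extract=v1.15.3
cd kubernetes
# ... and run from inside that directory on subsequent runs;
# --delete-namespace-on-failure=false keeps the namespaces of
# failed tests around for inspection
kubetest --provider=skeleton --test \
  --test_args="--ginkgo.focus=\[Conformance\] --delete-namespace-on-failure=false"
```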

While highly customizable, kubetest is not really intended for end users, as its scarcely documented and often confusing flags demonstrate. For simply running conformance tests there is Sonobuoy by Heptio (now VMware), which simplifies the process.

Sonobuoy

Sonobuoy is a diagnostic tool which can, amongst other things, run the Kubernetes conformance tests. It consists of a CLI which starts a pod managing the test run inside your cluster and lets you retrieve the results afterwards. It is a simple out-of-the-box solution and the standard tool for running the conformance tests. Sonobuoy comes with some customization options, such as providing custom repositories for test images, e.g. when testing air-gapped clusters.
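
A basic conformance run boils down to a handful of commands; a sketch, assuming the sonobuoy CLI is installed and your kubeconfig points at the cluster:

```bash
# start the aggregator pod in the cluster and wait for the run to finish
sonobuoy run --wait
# download the result tarball from the cluster ...
results=$(sonobuoy retrieve)
# ... and print a summary of passed and failed tests
sonobuoy results "$results"
# remove sonobuoy's namespace and cluster-wide resources again
sonobuoy delete --wait
```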

We can also choose to run other tests of the e2e test suite with both kubetest and Sonobuoy to test some of our plugins. For example, if we want to use network policies in our cluster, we should probably test whether they are enforced. A basic test for this can be run with Sonobuoy like this:
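
```bash
# focus the e2e run on the NetworkPolicy tests; the empty --e2e-skip
# overrides Sonobuoy's default skip pattern, which would otherwise
# filter out these non-conformance tests
sonobuoy run --e2e-focus "NetworkPolicy" --e2e-skip ""
```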

These tests create basic network policies along with pods restricted by them and verify that the policies are enforced in the cluster; similar tests exist for other features. Note that they do not verify that the policies already present in your cluster work as intended. For that, a tool such as netassert or illuminatio can be used, as sketched below.
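
illuminatio, for instance, generates and runs test cases from the NetworkPolicies it finds in the cluster. A rough sketch of its usage; the exact subcommands may differ between versions, so consult its README:

```bash
pip install illuminatio
# clean up leftovers from previous runs, then scan the cluster's
# existing NetworkPolicies and run generated test cases against them
illuminatio clean run
```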

Write Your Own e2e Tests

You can also write your own e2e tests for your cluster setup. This is especially useful when running homegrown add-ons, as unit tests can hardly mimic the behaviour of a running Kubernetes cluster. To develop your tests in Golang you can re-use the e2e framework of Kubernetes itself, as documented in this blog post.

If you use another programming language, you can still use the Kubernetes client libraries, but you will have to write some boilerplate code yourself, e.g. for setting up and tearing down test namespaces. A unit-test framework such as pytest will still prove useful for separating the setup from your test cases and for running tests and collecting results.
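
A minimal sketch of such boilerplate, using pytest and the official Python client (pip install kubernetes pytest) against a reachable cluster; all names here are made up for illustration:

```python
import time
import uuid

import pytest
from kubernetes import client, config


@pytest.fixture
def namespace():
    """Create a throwaway test namespace and tear it down afterwards."""
    config.load_kube_config()  # use load_incluster_config() inside a pod
    core = client.CoreV1Api()
    name = f"e2e-{uuid.uuid4().hex[:8]}"
    core.create_namespace(
        client.V1Namespace(metadata=client.V1ObjectMeta(name=name))
    )
    yield name
    core.delete_namespace(name)


def test_pod_becomes_ready(namespace):
    """Smoke test: a simple pod should reach the Running phase."""
    core = client.CoreV1Api()
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="smoke"),
        spec=client.V1PodSpec(containers=[
            client.V1Container(name="smoke", image="busybox",
                               command=["sleep", "3600"]),
        ]),
    )
    core.create_namespaced_pod(namespace, pod)
    for _ in range(60):  # poll for up to ~60 seconds
        if core.read_namespaced_pod("smoke", namespace).status.phase == "Running":
            return
        time.sleep(1)
    pytest.fail("pod did not reach Running phase in time")
```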

Whether you just started your Kubernetes journey or you’ve been running clusters in production for 5 years, we believe you should start testing those clusters now. Run Sonobuoy for conformance tests in your pipelines, start throwing in some e2e tests for the features you use and develop your own tests for that component that caused failures in your cluster one too many times. It will make operations easier and give you some peace of mind.
