In this blog post, I want to focus on the requirements for the software stack to develop and maintain an embedded product.
I have been an Embedded Software Developer for multiple years now. From time to time, I find myself in a discussion with other developers or product managers about a cool prototype they are building, with a Raspberry Pi or some Arduino board. These are hardware platforms that became very popular in the maker and hobbyist scene in the last decade and have made it really easy to tinker with hardware, e.g. attaching sensors, sending the data to the cloud or controlling your lights. Nearly all my private hobby projects are based on them. They have reduced the entry barrier into the embedded field which is great. More people are doing it now and building cool stuff.
However, these systems are not suitable for building and maintaining products over a long period of time. For me, it often boils down to the distinction between tinkering and engineering. Both are valid use cases but impose different requirements on the hardware and software.
What is an embedded system?
But first, let’s talk a bit about the context. Embedded systems are not general computing devices like servers, notebooks or desktop PCs, but are nevertheless all around us or, to be more precise, in a lot of devices around us. They are built into all different kinds of products and fulfill a specific purpose. You often would not even expect a full-blown computer in one of them.
They are in alarm clocks, mobile phones, cars and digital cameras, and dishwashers, coffee machines and industrial robots can contain them as well. Furthermore, all the new IoT (Internet of Things) sensors and devices fall into this category. Technically, they range from small 8-bit microcontrollers, also called MCUs (microcontroller units), with 2 KiB of RAM, as found on Arduino boards, to 32-bit SoCs (systems on a chip) with 256 MiB of RAM, used in internet routers or eBook readers, all the way to powerful 64-bit multicore SoCs with 12 GiB of RAM used in current smartphones and similar high-end devices.
What is an embedded build system?
Let’s focus on the software that runs on an embedded system. It is usually called firmware and is developed alongside the hardware of the product. The firmware image is flashed on the factory’s assembly line, and over the lifetime of the product the manufacturer often updates the firmware multiple times with new releases that add features or fix bugs.
To develop such firmware, you can use an embedded build system. It combines the software that runs on the device with the tools needed to build and debug it, like compilers, build tools, flashing scripts and debuggers. Some build systems even include their own code editors or IDEs (integrated development environments).
Build systems come in a variety of sizes and capabilities. Examples of build systems for software that runs directly on the hardware (bare metal) are the Arduino IDE and PlatformIO. They often do not include an operating system, only libraries to bring up the microcontroller and to access its peripherals. They can be used for simple devices that read just a few sensors and control some outputs. For more complex control tasks or more connectivity, you would rather use a real-time operating system such as Zephyr or FreeRTOS. These contain a kernel that implements task switching to separate different responsibilities in the firmware and provide additional abstractions like drivers to make the software more portable.
If the processor features an MMU (memory management unit), you can also choose a full-fledged operating system like Linux. The kernel provides real process isolation and a well-known programming interface. Common build systems for embedded Linux are Buildroot and Yocto. For devices that are high-end in embedded terms, there is also the AOSP (Android Open Source Project), which powers smartphones, car entertainment systems, other multimedia devices and similar targets in more complex environments.
What I want!
Having described the context of embedded development and explained the common terms, we can now dive into the actual requirements. For me, the following points are essential to develop and maintain a product professionally.
I want version control for all the things
When I make modifications to the firmware, e.g. installing an additional software package, tweaking a config parameter or patching software, I want to track these changes in a source control repository. Nowadays this is usually a git repository, which provides full traceability. It tracks your development history and all the changes between different versions of your firmware. Years from now, you can look back and see exactly which developer made which change, and hopefully the commit message contains a good explanation of why the change was necessary.
Apart from standard git techniques like tagging to create releases and branching to maintain previous firmware versions, this also allows code reviews of these firmware changes. Nearly all git hosting and review platforms, like GitHub, GitLab, Bitbucket or Gerrit, support some sort of review feature for code changes, such as pull/merge requests. This allows your co-workers to catch bugs early and increases transparency in the development process.
Actually, this requirement is not specific to embedded development. It is a very, very basic requirement for every good software project.
I want to build the firmware in the continuous integration system
Continuous integration (CI) is a software development practice. A server builds the software on every source code change and runs additional verifications and tests. This helps keep the project in good shape and catches simple and common developer errors, like non-compiling code or uncommitted files. It is especially important when multiple developers are working on the same project.
Running a CI system is standard practice for software projects nowadays. In 2019, I gave a whole talk about this topic and showed how we built CI pipelines for three different types of embedded projects: CI/CD in Embedded-Projekten. Sadly, despite the benefits mentioned above, it is not adopted in all projects yet.
If you have, for example, a Windows-only embedded IDE, it may not support continuous integration. In that case, it is unsuitable for professional development in my opinion.
The best way to support continuous integration is to provide a command-line interface that can be used in the CI setup but also by the developers themselves, for example to script the build and flash process.
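As a minimal sketch of such a command-line interface, the following Python script wraps the build and flash steps so that the CI server and the developers run exactly the same commands. The program name `fw`, the `make` targets and the `./flash.sh` script are hypothetical placeholders for whatever your build system provides.

```python
import argparse
import subprocess
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Define the command-line interface shared by CI and developers."""
    parser = argparse.ArgumentParser(prog="fw", description="Build and flash the firmware")
    sub = parser.add_subparsers(dest="command", required=True)

    build = sub.add_parser("build", help="build the firmware image")
    build.add_argument("--clean", action="store_true",
                       help="remove previous build results first")

    flash = sub.add_parser("flash", help="flash an image onto a connected device")
    flash.add_argument("image", help="path to the firmware image")
    return parser


def plan_commands(args: argparse.Namespace) -> List[List[str]]:
    """Translate the parsed arguments into the external commands to run.

    The commands below are hypothetical; replace them with the real
    entry points of your build system and flashing tool.
    """
    if args.command == "build":
        commands = [["make", "clean"]] if args.clean else []
        commands.append(["make", "firmware"])
        return commands
    return [["./flash.sh", args.image]]


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    for command in plan_commands(args):
        # check=True raises on a non-zero exit code, so a failing step
        # aborts the run and the CI job is marked as failed.
        subprocess.run(command, check=True)
    return 0
```

A CI job would then call something like `main(["build", "--clean"])`, while a developer at the desk might run `main(["flash", "firmware.img"])` through the same script.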
I want a reproducible firmware build
The term reproducibility is well-known from the Reproducible Builds effort. The goal is to have a software build process that allows two different parties to build the same source code and get bit-for-bit identical binaries as a result. This property should also hold for two different points in time, which means that the build process should ensure that the results today and in two years are identical. It is also an important tool for software security because it avoids the need to trust the party that compiled the binaries from the source code.
Having truly reproducible builds is worthwhile, but in this case, I do not aim for such a strong guarantee. For this requirement list, it is sufficient if the build produces functionally equivalent binaries. This means that the binaries may differ in details such as build timestamps, the file order in directories or other traces of the build environment like usernames.
This requirement has some implications. For example, all external resources, like source code tarballs downloaded from the internet, must be verified by cryptographic checksum before use. These checksums must be built into the build system, so they are not changeable without being noticed. Linux distributions and embedded build systems have done this for decades but newer software ecosystems sometimes do not (by default).
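The checksum verification described above can be sketched in a few lines of Python. This is only an illustration under the assumption that the expected SHA-256 value is stored next to the build recipe in version control; the function names are made up for this example and do not correspond to any particular build system.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as source:
        for chunk in iter(lambda: source.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_source(path: Path, expected_sha256: str) -> None:
    """Refuse to use a downloaded file whose checksum does not match.

    expected_sha256 is assumed to come from the version-controlled
    build description, not from the download server.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"checksum mismatch for {path.name}: "
            f"expected {expected_sha256}, got {actual}"
        )
```

Because the expected value lives in the repository, changing it shows up as a normal diff in code review.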
Building software is a complex task in itself. For example, incremental builds which may speed up our development cannot guarantee reproducibility, because of stray files from previous builds, byproducts or different build orders. When building the final firmware image, you should always do a clean build, which means starting with a fresh copy of the source code and an empty build directory.
I want to fix bugs in all software components
I call this requirement full-stack patchability. This means, whenever I encounter a problem, like a bug or missing feature, I am able to look at the source code, understand it and then fix the issue by writing a software patch. I want to have the source code for flashing tools, the bootloader, the kernel, libraries, programs and also the compiler toolchain. Yes, sometimes you even have to backport a fix because you encounter a compiler bug. The patchability requirement does not only include the software running on the device but also build tools running on your developer machine. Debugging or adding features to the build process must be possible.
Why is this needed for embedded software? The hardware is much more diverse than in the desktop or server market and mostly custom-made for the product. Sometimes you have to write your own drivers for external peripherals because nobody has used them with Linux before, or you have to debug a complicated problem caused by the rare combination of your hardware and software. With the source code available, it is much easier and quicker to find and understand the root cause than to email back and forth with the support service of a vendor.
I often say that there are fewer well-trodden paths in embedded software development. You end up on muddy ground very quickly, and without a clear view (of the code) it is hard to get out. So in my opinion, open source, or at least available source, is a requirement for an embedded build system.
I want to fix bugs in old firmware images
It is not uncommon for the product lifetime of an embedded system to be ten years. As a consequence, you also have to maintain the software of the device for many years. One requirement mentioned earlier is crucial for that: reproducibility. Building a functionally equivalent firmware image from the identical source code must be doable now and in the future. The obvious case is security fixes. You must be able to fix security bugs over the whole lifetime of your product, even if the last firmware update was done many years ago.
Another example is the discontinuation of hardware components. If the flash chip in your product is no longer available, you have to replace it with one from another batch or even with a similar but slightly different part to continue production of your device. Such hardware changes sometimes require a software update as well, for example to recognize the new flash chip.
Being able to rebuild and to fix bugs in old firmware images is an important requirement for an embedded build system. Apart from reproducibility, build systems have some additional features that help in this regard, for example archive support and offline builds. These allow you to download all the required source code tarballs from the internet, store and archive them locally and build a firmware image without internet access. See Buildroot’s documentation. Yocto also supports additional source code mirrors in case third-party servers are unavailable or gone.
Some build systems, like the AOSP, go even a step further. They include the whole source code in the source tree itself and do not need to download additional software from the internet.
I want to be license compliant
License compliance is often associated with open source licenses, but in embedded projects, there sometimes are proprietary software licenses, too. The build system must be able to generate a so-called bill of materials: a list of all the software components, including version numbers and licenses, that are built into a firmware image. This can be used to generate a combined license document. Providing such a document, e.g. in the settings menu of the product, makes it straightforward to be license compliant. For example, the Apache 2.0 License requires such a notice if the product contains code licensed under this license.
Reproducibility together with an open-source build system is also key to complying with the GPL. It requires you to disclose the “corresponding source” and the “scripts” for the GPL-licensed code in your firmware image. Since the build system is often open source anyway, you only have to strip your closed-source components and are left with the corresponding source plus the scripts, i.e. the build system, which you can distribute.
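To illustrate the bill of materials mentioned above, here is a minimal Python sketch that groups components by license and renders a plain-text overview. The `Component` type, the function name and the SPDX-style license identifiers are assumptions made for this example; real build systems generate this information from their package metadata.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Component:
    """One entry of the bill of materials (hypothetical structure)."""
    name: str
    version: str
    license: str  # SPDX-style identifier, e.g. "GPL-2.0-only"


def render_license_overview(components: Iterable[Component]) -> str:
    """Group the components by license and list them alphabetically."""
    by_license = defaultdict(list)
    for component in components:
        by_license[component.license].append(component)

    lines = []
    for license_id in sorted(by_license):
        lines.append(f"{license_id}:")
        for component in sorted(by_license[license_id], key=lambda c: c.name):
            lines.append(f"  {component.name} {component.version}")
    return "\n".join(lines)
```

A full license document would additionally append the license texts themselves. The grouping step shown here is what makes it easy to spot which obligations apply to a given firmware image.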
I want a separation of code and state
For small microcontrollers, like the ATmega328 used in Arduino boards, the separation of code and state is a technical property. These chips have flash storage for the code and static data and an EEPROM for the state and configuration data.
For bigger embedded systems, running an operating system like Linux, the common approach is to have a separate partition for the state. For example, in Android, this partition is called the data partition. This separation is uncommon in traditional Linux distributions. There, the applications (mostly in the folder /usr/), the configuration (in the folder /etc/) and the state and log files (in the folder /var/) are put into the same file system.
But the separation has a couple of advantages for embedded systems. First of all, it provides you with a simple factory reset mechanism. Since all the state is bundled into a single partition, you only have to wipe that partition. All the configuration and state are gone and on the next boot, the device can show a setup screen again.
Meanwhile, the code and static data, which live in the root file system, can be protected by mounting that file system read-only. This improves reliability because a bug in a user space application cannot corrupt the firmware by accident. Additionally, it is the basis for a reliable update system, which we will discuss below.
Power cut safety
Furthermore, when designing embedded systems, you sometimes cannot rely on users to power off the device properly. They just unplug the power from the device. Protection against this scenario is called power cut safety. Achieving this goal can be quite tricky because file system corruption can happen when a write operation is interrupted. In the worst case, the device does not boot anymore because the power cut corrupted some important files. A read-only root file system improves the reliability here because it disallows all write operations on the file system altogether.
Another aspect is security. A read-only root file system is one building block for a verified boot chain. For example, you can use dm-verity to check the integrity of a read-only file system.
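The read-only root file system covers the code, but the state partition still has to be written while the device is running. A common technique on POSIX systems to keep such writes power cut safe is to write a temporary file, flush it to storage and then rename it over the old file. The following Python sketch shows the idea. The function name is made up for this example, and how strong the guarantee is in practice depends on the file system and the storage hardware.

```python
import os
from pathlib import Path


def atomic_write(path: Path, data: bytes) -> None:
    """Replace the file at path so that readers see either the old
    or the new content, never a half-written file."""
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "wb") as target:
        target.write(data)
        target.flush()
        # Force the new content onto the storage medium before renaming.
        os.fsync(target.fileno())

    # On POSIX systems the rename replaces the old file atomically.
    os.replace(tmp, path)

    # Sync the directory as well so that the rename itself is persisted.
    dir_fd = os.open(path.parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If the power is cut before the rename, the old file is still intact and at most a stray `.tmp` file is left behind, which can be removed on the next boot.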
I want to provide an SDK to application developers
Apart from simple devices, there often is a whole team of developers working on a product. It usually consists of embedded engineers working on the kernel, bootloader and other OS internals and the application developers building the business logic and UI (user interface) of the product. The latter group does not need to build the whole firmware image. They only require a toolchain and other tools to build and deploy their application onto the device. This toolchain and the related tools are called an SDK (software development kit). It allows the developers to cross-compile their applications on their powerful x86 notebooks or desktop machines. This is faster than compiling on the device itself, and often it is not even possible to have a full development environment on the product because of storage and memory constraints.
Cross-compiling has its own hurdles and pitfalls, but in my view, it is a basic requirement for a build system to increase the productivity of the developers. All major embedded Linux build systems support it: see Yocto’s developer documentation on Building an SDK Installer or Buildroot’s section about Building an external toolchain with Buildroot.
I want a robust update mechanism
Apart from maybe very small, narrow and well-defined use cases, you need to have an update mechanism in place. When your devices are in the field, you want to ship firmware updates to fix security issues and bugs or to add features. There are multiple ways you can provide updates. For example, the device automatically downloads the update from the internet, which is called an OTA (over-the-air) update, or the user plugs a thumb drive with the update file into the device, or the user uploads an update through the device’s configuration web interface. A lot of scenarios are possible.
Apart from the means of distribution, the update mechanism must be secure and safe. An attacker should not be able to install malicious code, and an update should not be able to brick the device, e.g. if a power outage happens in the middle of the update process. This property is often called atomicity: an update is either applied completely or not at all. In practice, this means that the update system must support a rollback and a recovery mechanism in case of failures. Sending out technicians to fix non-booting industrial machines that you just bricked with an update is expensive.
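One common way to get rollback is an A/B scheme with two firmware slots. The following Python sketch models only the decision logic under simplifying assumptions: the names, the attempt limit and the state layout are made up for this example, and a real implementation would keep this state in the bootloader environment and verify image signatures before installing.

```python
from dataclasses import dataclass
from typing import Optional

MAX_BOOT_ATTEMPTS = 3  # assumed limit before falling back


@dataclass
class BootState:
    active: str = "A"               # slot that is known to work
    pending: Optional[str] = None   # freshly written slot, not yet confirmed
    attempts_left: int = 0


def other_slot(slot: str) -> str:
    return "B" if slot == "A" else "A"


def install_update(state: BootState) -> None:
    """Called after the new image was written to the inactive slot.
    The running slot is never touched, so a power cut during the
    write leaves the device bootable."""
    state.pending = other_slot(state.active)
    state.attempts_left = MAX_BOOT_ATTEMPTS


def select_boot_slot(state: BootState) -> str:
    """Bootloader logic: try the pending slot a few times, then roll back."""
    if state.pending is not None:
        if state.attempts_left > 0:
            state.attempts_left -= 1
            return state.pending
        state.pending = None  # rollback: give up on the new image
    return state.active


def confirm_boot(state: BootState) -> None:
    """Called by the application once the new firmware runs correctly."""
    if state.pending is not None:
        state.active = state.pending
        state.pending = None
        state.attempts_left = 0
```

If the new firmware never calls `confirm_boot`, for example because it crashes during startup, the bootloader runs out of attempts and returns to the previous slot on its own.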
What about security?
You may wonder why I left out security in my requirement list for an embedded build system. Is it not a top priority in a product? Are there not enough insecure IoT devices?
First of all, security is not the only goal. You also want safety, which means that the device should work reliably and not cause harm. And most often security and safety are two diametrically opposed goals that are hard to combine in embedded products. If you want to learn more about this natural tension between the two, I recommend the presentation Safety vs Security: A Tale of Two Updates by Jeremy Rosen. It gets to the heart of the topic.
Back to security. Why did I not include it? To paraphrase the well-known expert Bruce Schneier: security is not a state, it is a process. To have a secure product, you have to embed security into the development and maintenance process of your device. You have to design the device with suitable protections for your threat model upfront, develop your software stack with security issues in mind and provide updates for the lifetime of your product. Security cannot be finished. It is not done at some point.
So in my view, the above requirements for an embedded build system are the baseline for a development process that takes security seriously. You cannot claim to have a secure product without those.
Summary
To recap, here are all the requirements that I want an embedded build system to fulfill:
- a source control system, code review
- continuous integration, command-line interface
- reproducible builds
- full-stack patchability, open-source
- source archives, offline builds
- license compliance, bill of materials
- separation of the code and the state
- SDK, cross-compilation
- update mechanism with rollback and recovery
In a nutshell: Traditional embedded Linux build systems, like Buildroot and Yocto, support these requirements. Server and desktop Linux distributions generally do not.
Take Raspbian for example, a Debian variant that is used on the popular Raspberry Pi. It allows you to quickly and easily work with the device, e.g. installing new software or accessing the pin header, but it falls short when it comes to maintaining a product in the long run. The package-based update mechanism with apt-get is not atomic and not reproducible across multiple devices. It does not allow you to record the firmware changes in a git repository. There is also no separation of code and state (no read-only root file system), and the list goes on.
Another example is the Arduino IDE, which is used to develop software for the Arduino boards that are very popular in the maker scene. In its first years, it did not support continuous integration. Command-line support was only added later; see the blog post Announcing the Arduino Command Line Interface (CLI) from 2018.
This is why we always recommend that our customers use a build system like Yocto or Buildroot for their products. It may seem like too much work and too complicated once they have completed their prototype, but it avoids a lot of headaches and problems in the long run.