Ah yes, we’re back. I’ve written about cloud-native CI/CD before on this blog, comparing Spinnaker, Argo CD, Tekton Pipelines and Jenkins X. The reactions were very positive and the post still gets a lot of visits. Reason enough for me to take another look and also to extend my portfolio, so to speak.
“While you’re here, can we interest you in some CI/CD?”
This time around, we will not focus on open-source projects. Every cloud provider worth its salt has some sort of CI/CD solution adjacent to its compute instances, container offerings and database solutions. Some providers shove that solution into your face, others hide it underneath three layers of other products. The rationale is simple: “You already have a bunch of VMs, managed databases, some serverless functions and that compulsory Kubernetes cluster lying around here, why not also your CI/CD pipeline? We will gladly bill you for it.” In this particular post we’ll look at the “Big 3”: Azure, AWS and Google Cloud.
My Assumptions About You
Weird headline, I know. But to have some sort of reference point, we should agree on a set of requirements that you probably have. Here’s what I assume:
- There is a software project that you work on, and it’s managed using a Git repository.
- You’re not the only one working on it, and other people need to review and approve of your code before it ships to production.
- “Shipping to production” involves some sort of build and test step, followed by a deploy step. This could be anything, from an npm run build that produces a static web page that is then uploaded via FTP, to a multi-architecture microservice container build that gets canary-deployed into a service-meshed 800-node Kubernetes cluster. You do you.
No matter where you fall on that complexity scale, the CI/CD of most software projects can be broken down into these parts: version control, code review, testing, building and deployment. Hence, they will be the dimensions of our comparison.
The Product Matrix
The myriad service names cloud providers come up with have been the subject of many jokes. Therefore it’s handy to get an overview of the products we’re going to talk about in this post:
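| |Azure|Google Cloud|AWS|
|---|---|---|---|
|Version Control|DevOps/GitHub Enterprise|Cloud Source Repositories|CodeCommit (CodeStar)|
|Build|DevOps/GitHub Enterprise|Cloud Build|CodeBuild / CodePipeline (CodeStar)|
|Test|DevOps/GitHub Enterprise|Cloud Build|CodeBuild / CodePipeline (CodeStar)|
|Deploy|DevOps/GitHub Enterprise|Cloud Build|CodeDeploy / CodePipeline (CodeStar)|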
Whenever I look at this chart, a certain meme comes to mind:
Let’s look at Azure, GCP and AWS in detail. And in that order.
CI/CD with Azure DevOps
When you start with Azure and ask “How do I deploy my applications?“, they will offer you Azure DevOps. It’s what I like to call a suite, a set of services that integrates seamlessly. There are sub-parts, of course, but everything is coherent and there is one tool for one job. (Sounds logical? Wait until we examine AWS.)
The Repository Part
You probably know that Microsoft acquired GitHub in 2018. This was a strategic development decision and there are signs that the functionality of Azure DevOps will be provided by GitHub Enterprise going forward. Don’t freak out if you just finished building a DevOps project from the ground up. The migration will likely take several years and Microsoft will do everything they can to make the transition as smooth as possible for their paying customers. However, this puts me in a weird spot: Do I tell you about the pros and cons of Azure DevOps, knowing well that this information will be obsolete in a few years?
I decided to leave this to someone else. Lars Klint wrote a concise, on-point comparison between Azure DevOps and GitHub. The gist: Azure DevOps Repositories are enterprise territory, while GitHub is the place for the cool open-source kids to hang out. Nonetheless, going enterprise will cost you no matter what you choose. If you’re already partially on the Azure platform and are looking to migrate your code, you should closely observe how Microsoft will handle the transition in the months and years to come. I’m sure it’ll get interesting.
Test, Build and Deploy with Azure DevOps Pipelines
Let’s assume you created an Azure DevOps project. Out of the box, you have access to:
- Azure Boards, to manage to-dos and tickets in an agile way,
- the aforementioned Azure Repositories to store and review code in,
- Azure Pipelines, to create step-based CI/CD pipelines,
- and Azure Artifacts to save the results of your pipeline runs.
It’s important to note that Azure lets you mix and match, meaning you don’t have to use all of these services. We already talked about repositories, so let’s focus on pipelines.
There is a bunch of terminology for Azure Pipelines, but there’s nothing surprising about it: There are pipelines that consist of stages, which consist of jobs, which in turn consist of steps. A step is either a plain script or one of the built-in tasks. Like with all other providers, you manage the details in a YAML file that you can commit to version control. There is also an extensive list of built-in tasks that Azure provides for you so you don’t have to reinvent the wheel. The deploy targets can be whatever you like: VMs, container registries, storage buckets and more.
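To make that hierarchy a bit more tangible, here is a minimal sketch of what an azure-pipelines.yml could look like for a small Node.js project. The stage, job and artifact names as well as the dist folder are assumptions made up for illustration, and the deploy step is only a placeholder for whatever target you actually ship to.

```yaml
# azure-pipelines.yml (illustrative sketch, all names are placeholders)
trigger:
  - main                      # run the pipeline on pushes to main

pool:
  vmImage: ubuntu-latest      # Microsoft-hosted build agent

stages:
  - stage: Build
    jobs:
      - job: BuildAndTest
        steps:
          - script: npm ci
            displayName: Install dependencies
          - script: npm test
            displayName: Run tests
          - script: npm run build
            displayName: Build static site
          # Store the build output (assumed to land in ./dist) as a pipeline artifact
          - publish: $(System.DefaultWorkingDirectory)/dist
            artifact: site

  - stage: Deploy
    dependsOn: Build          # only runs after the Build stage succeeded
    jobs:
      - job: Upload
        steps:
          # Fetch the artifact produced by the Build stage of this run
          - download: current
            artifact: site
          # Placeholder: replace with a built-in task or script for your target
          - script: echo "Deploying files from $(Pipeline.Workspace)/site"
            displayName: Deploy (placeholder)
```

Splitting build and deploy into two stages is a design choice rather than a requirement. For a project this small, a single stage with one job would work just as well.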
GitLab is the tool we at inovex have the most experience and fun with, so it’s our choice for doing benchmarks. Azure Pipelines configuration files are generally more verbose and nested than their GitLab counterparts, while providing the same feature set. Notably, GitLab has had a head start of a couple of years to smooth out rough edges. This is especially evident in the UI.
Using Google Cloud for CI/CD
In our overview chart, we listed Google Cloud Source Repositories and Google Cloud Build. Google admittedly has other services for CI/CD, e.g. Anthos. Anthos however kills a lot more than two birds with one giant stone, and its range of services (hybrid cloud environments, infrastructure management, policy enforcement, facilitating application development, to name a few) is out of scope for this post. It’s on our radar, though, and we’ll probably address it in a future blog post.
Won’t Get More Basic: Google Cloud Source Repositories
GCSP, as I will call the service from now on because its name is just too dang long, is a remote Git backend in the most basic sense. You can push to and pull from it. No more, no less. In the GCSP docs, Google heavily promotes the ability to mirror repositories from GitHub and Bitbucket. The key takeaway is that you really don’t want to use GCSP as your primary version control system: Code reviews on feature branches are rather important and not possible with this service. It’s just not intended to be used that way.
Google Cloud Build
Cloud Build is Google’s serverless CI/CD platform, at the center of which is (as usual) a declarative description of the steps you want to execute on said platform. The service comes with a limited set of around 20 official cloud builder Docker images. These official images support most common tool chains and include bazel, docker, dotnet, go and npm, among others.
On top of those official images, there are community images for a lot more projects – the full list is available on the cloud builder community repository. However, GCP doesn’t provide pre-built containers for those images. You have to clone the spec you need, build the image yourself and push it to the container registry of your project before usage. If the use case you need is not supported out of the box or if you don’t want to rely on third-party code in your CI/CD pipeline, you can create custom build steps in the form of a hand-written Dockerfile.
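As a rough illustration of how these builder images are used, here is a sketch of a cloudbuild.yaml. Every step names the container image it runs in. The image name my-app, the builder my-custom-builder and its lint argument are hypothetical, and the last step assumes that you have already built that custom image and pushed it to your project’s registry as described above.

```yaml
# cloudbuild.yaml (illustrative sketch, image and builder names are placeholders)
steps:
  # Official builder image: install dependencies and run the tests
  - name: gcr.io/cloud-builders/npm
    args: ['ci']
  - name: gcr.io/cloud-builders/npm
    args: ['test']

  # Official builder image: build the application container
  - name: gcr.io/cloud-builders/docker
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-app:$BUILD_ID', '.']

  # Hypothetical custom or community builder that you built and pushed yourself
  - name: gcr.io/$PROJECT_ID/my-custom-builder
    args: ['lint']

# Images listed here are pushed to the registry after all steps succeeded
images:
  - gcr.io/$PROJECT_ID/my-app:$BUILD_ID
```

$PROJECT_ID and $BUILD_ID are default substitutions that Cloud Build fills in for every build, so the file contains no project-specific values.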
The comparison with GitLab as our benchmark yields a mixed result: The config files are relatively concise, but at the same time limited compared to the features of GitLab CI. Nonetheless, most of the standard use cases are covered and can be implemented without too much hassle. If you’re interested in the possibilities in detail, see the build configuration overview.
The Zoo That is CI/CD on AWS
As I mentioned earlier, Azure has a service suite. Google is half-way there. AWS has more of a toolbox. Everything you need is in there, but you have to find it first. AWS knows about this complexity and is eager to fix it. We’ll look at CodeStar at the end of this section. It’s your go-to service if you just want to get started with CI/CD on AWS.
CodeCommit—Version Control by AWS
AWS’s answer to “where should I store my code base?” is CodeCommit. Unlike GCSP, CodeCommit supports pull request workflows like you’re used to from GitHub or GitLab. It’s just not as pretty as them. This might seem like a minor inconvenience, but if you spend multiple hours a week in discussion threads, it is an important consideration.
Overall, CodeCommit feels very “enterprisey“, e.g. it promotes its adherence to a lot of compliance programs. My favourite statement on CodeCommit comes from Corey Quinn:
If someone suggests you go all-in on AWS and implies that this means using Amazon Chime, WorkDocs, and CodeCommit, that person is actively attempting to sabotage you and you should stop reading this and call corporate security immediately. [Source]
That’s just an opinion of course, but you can see where he’s coming from. What to do when on AWS anyway? Head to the conclusion of this article! To end on a positive note here: You still have the option of hosting your code on GitHub or Bitbucket and simply mirroring it to AWS. Additionally, CodeCommit still has the benefit of integrating well with other AWS services. Like for example …
CodeBuild and CodeDeploy
While build and deploy stages are just two steps in the same pipeline for Azure and GCP, AWS takes a different approach and presents two distinct solutions: CodeBuild and CodeDeploy.
CodeBuild is a step-based, fully managed build service. From you, it needs two things: a source code repository (this might be a CodeCommit or a GitHub repo), and some instructions on how and what to build. These instructions come in the form of a buildspec.yml file in your repository. The resulting artifact can be stored in an S3 bucket or pushed to a container registry. A nice touch is the local build option which allows you to test your pipeline locally before committing it.
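For reference, a build spec for a small Node.js project could look roughly like the sketch below. The npm commands and the dist output folder are assumptions for illustration. Which runtimes are available depends on the build image you select for the CodeBuild project.

```yaml
# buildspec.yml (illustrative sketch, commands and paths are placeholders)
version: 0.2

phases:
  install:
    commands:
      - npm ci              # install dependencies from the lock file
  build:
    commands:
      - npm test            # a failing command fails the whole build
      - npm run build       # assumed to write its output to ./dist

artifacts:
  base-directory: dist      # paths below are relative to this folder
  files:
    - '**/*'                # ship everything the build produced
```

Where the artifact ends up, for example in which S3 bucket, is configured on the CodeBuild project or the surrounding pipeline and not in this file.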
Afterwards, you want to deploy your freshly built app. AWS CodeDeploy enables you to do that by offering a list of compute platforms that you can deploy to, like EC2, Lambda or ECS. It provides different deployment strategies out of the box, e.g. canary, linear or blue-green deployments.
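CodeDeploy reads its instructions from an AppSpec file that travels with your application. The following sketch shows what such a file could look like for a deployment to EC2 instances. The target directory and all script names are hypothetical, and the scripts themselves would have to exist in your repository. Deployments to Lambda or ECS use a differently structured AppSpec file.

```yaml
# appspec.yml for an EC2 deployment (illustrative sketch, paths and scripts are placeholders)
version: 0.0
os: linux

# Copy the whole revision to the target directory on the instance
files:
  - source: /
    destination: /var/www/my-app

# Lifecycle hooks run your own scripts at fixed points of the deployment
hooks:
  ApplicationStop:
    - location: scripts/stop_server.sh
      timeout: 60
  AfterInstall:
    - location: scripts/install_dependencies.sh
      timeout: 300
      runas: root
  ApplicationStart:
    - location: scripts/start_server.sh
      timeout: 60
  ValidateService:
    - location: scripts/health_check.sh
      timeout: 120
```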
“Makes Sense to Combine Those“ aka CodePipeline and CodeStar
Usually you want to do both: build and deploy. AWS knows this and created CodePipeline as a solution to integrate all the common steps into one product. As you can see in the list of possible CodePipeline integrations, CodeBuild and CodeDeploy are first-class citizens, but you can also sprinkle in GitHub and Jenkins. Thus, CodePipeline is more than just a combination of CodeBuild and CodeDeploy.
And as if all that wasn’t enough, AWS went up one more abstraction layer and made CodeStar. This tool is essentially a catalogue of well-thought-out CI/CD recipes that come with sane defaults. Want to deploy a static web site? Here you go. A Java app? Have this blueprint. And so on. Under the hood, CodeStar creates CodePipeline projects with CodeBuild and CodeDeploy modules as well as a CodeCommit repository if you want that, but it doesn’t bother you about it. That’s actually a nice approach if you’re new to AWS and need to get started fast. After creating the CodeStar project and gaining some experience with it, you can tweak the settings on lower service levels to your liking.
If you want to storm off and create a CodeStar project right now, that’s okay. Just keep in mind that you actually get four products for the price of, well, four.
Conclusion: Which one is right for me?
None of the cloud providers will actively force you to use their accompanying CI/CD products. However, of course they don’t hesitate to advertise how well they integrate with their core services. An important thing for me to learn was that oftentimes, cloud providers don’t offer a service because they’re convinced that it is the best possible solution for a customer problem, but because “one just has to have it”. That explains why AWS clocks in at around 130 service offerings and why products like Google Cloud Source Repositories exist. They’re there, sure, but even Google is like “Just use a GitHub repository, alright?”
I assume that it’s rather rare that you as a developer can simply pick your favourite cloud provider. Most likely your company is already using one of them, or other departments like finance will also weigh in on the decision. So the question is less which cloud provider to pick for your CI/CD, but rather if you should use the services of the one you’re forced to, er, allowed to work with over third-party products.
Third-Party Tools to the Rescue?
We haven’t touched on third-party tools in this post all that much. Of course you can always treat the public cloud as a low-level infrastructure provider, get some VMs and host the CI/CD tools of your choice on them. Depending on your situation, e.g. if you need to execute tests or builds that are not feasible to run inside of containers, this might make sense: Host a private GitLab instance and attach custom runners, e.g. to support builds on niche processor architectures.
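In GitLab CI, routing jobs to such a custom runner is a matter of tags. The sketch below assumes that you registered a self-hosted runner with the hypothetical tag arm64 and that your project builds and tests via make. Both are placeholders for illustration.

```yaml
# .gitlab-ci.yml (illustrative sketch, tag name and make targets are placeholders)
stages:
  - build
  - test

build-arm:
  stage: build
  tags:
    - arm64            # only runners registered with this tag pick up the job
  script:
    - make build
  artifacts:
    paths:
      - build/         # hand the build output over to later stages

test-arm:
  stage: test
  tags:
    - arm64
  script:
    - make test
```

Jobs without a matching runner stay pending, so it is worth checking that at least one runner with the tag is online before relying on this setup.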
If it has to be a custom CI/CD tool over which you have full control, a personal favourite of mine and a lot of my colleagues is the aforementioned GitLab. It subsumes the entire CI/CD stack into one coherent product. The creation of pipelines is simple, yet powerful (complex pipelines can get overwhelming to maintain, though). Most certainly, you have experience with or have heard of some of the other tools in the area like Jenkins, CircleCI, Travis, or more Kubernetes-native ones like Argo CD, Tekton or Prow.
What about Vendor Lock-In?
It’s tempting to think that having your entire CI/CD stack running on a vendor-agnostic setup is somehow freeing. Cloud provider A increases their Kubernetes pricing?* Just lift and shift all your Tekton pipelines over to a cluster on cloud provider B. The bad news is that these scenarios are rare and almost never worth the extra effort. I already quoted Corey Quinn earlier, and his take on “Multi-Cloud as a Bad Practice“ is a worthwhile read. I’ll paraphrase the key takeaway: While you configure a set of VMs to run Jenkins or a Kubernetes cluster to execute Tekton pipelines, a direct competitor of yours has chosen one of the preconfigured build tasks available on all of the three platforms we looked at and moves on to work on a product feature. The overhead of a completely vendor-agnostic setup might not be worth it!
*Highly unlikely, by the way. Cloud service pricing tends to stay stable or even go down.
To put this entire post in a nutshell: What you specifically should do in terms of CI/CD is highly dependent on a few aspects:
- Are you building the CI/CD stack for a single product/project or do you have to establish a platform service with extensive onboarding and role management capabilities?
- Do you have special requirements for your build pipeline, like rare processor architectures or GPU dependencies?
- Do you—for whatever reason—need to be fully provider-agnostic?
It is often possible to integrate and deploy applications to the cloud provider of your choice with a myriad of vendor-specific tools, external tools, or combinations of them. Some integrations are more common and well-supported, others behave weirdly and have a high potential for frustration. Check the appropriate documentation of your cloud provider or search for experience reports for the setup you have in mind. If you’re still not sure, reach out to us. We’re happy to help!