When managing one or more clusters, the question arises as to how cluster configurations and applications can be installed securely, regularly, and in the same way.
This is where the so-called GitOps approach helps, according to the mantra: "If it is not in Git, it does not exist".
The idea is to have Git as the single source of truth for what happens inside the environment. While there are many articles about bringing GitOps into the deployment process of applications, this series focuses on cluster configuration and the tasks system administrators usually have to do, for example, setting up an Operator.
The GitOps approach is a very common practice and the de facto "standard" as of today.
By "standard" I mean that the approach itself should be followed; HOW it is implemented can be the subject of many tough discussions.
But what is it and why should a company invest time to follow this approach?
GitOps is a declarative way to implement continuous deployment for cloud-native applications. It should be a repeatable process to manage multiple clusters.
GitOps adds the following features to company processes:
Everything as code: The entire state of the application, infrastructure and configuration is declaratively defined as code.
Git and the single source of truth: Every setting and every manifest is stored and versioned in Git. Any change must first be saved to Git.
Operations via Git workflows: Standard Git procedures, such as pull or merge requests, should be used to track any changes to the applications or cluster configurations.
It is important that not only the manifests of the applications but also the cluster configuration are stored in Git. The goal should be to ensure that no manual changes are made directly to the cluster.
Benefits/Challenges
Deploying new versions of applications or cluster configurations with a high degree of confidence is a desirable goal as getting features reliably to production is one of the most important characteristics of fast-moving organisations.
GitOps is a set of common practices where the entire code delivery process is controlled via Git, including infrastructure and application definition as code and automation to complete updates and rollbacks.
GitOps tooling constantly watches for changes in Git repositories and compares them with the current state of the cluster. If there is drift, it will either automatically synchronise the cluster to the desired state or raise a warning (a manual sync must then be performed).
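The comparison between the desired state in Git and the live state of the cluster can be illustrated with a small Python sketch. This is not how Argo CD or any other tool implements it; the function names (`find_drift`, `apply_desired`, `reconcile`) and the status strings are assumptions chosen for illustration, and manifests are represented as plain dictionaries.

```python
from __future__ import annotations

import copy
from typing import Any


def find_drift(desired: Any, live: Any, path: str = "") -> list[str]:
    """Return the dotted paths where the live state differs from the desired state.

    Only fields present in the desired state are compared, so extra live
    fields (for example a status block) do not count as drift.
    """
    if isinstance(desired, dict) and isinstance(live, dict):
        drift: list[str] = []
        for key, value in desired.items():
            child = f"{path}.{key}" if path else str(key)
            if key not in live:
                drift.append(child)
            else:
                drift.extend(find_drift(value, live[key], child))
        return drift
    return [] if desired == live else [path]


def apply_desired(desired: dict, live: dict) -> dict:
    """Return a copy of the live state with all desired fields enforced."""
    merged = dict(live)
    for key, value in desired.items():
        if isinstance(value, dict) and isinstance(live.get(key), dict):
            merged[key] = apply_desired(value, live[key])
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def reconcile(desired: dict, live: dict, auto_sync: bool) -> tuple[str, dict]:
    """Sync automatically if enabled, otherwise only report the drift."""
    if not find_drift(desired, live):
        return "Synced", live
    if auto_sync:
        return "Synced", apply_desired(desired, live)
    return "OutOfSync", live  # a manual sync must be performed


# Hypothetical example: someone scaled the deployment by hand.
desired = {"spec": {"replicas": 3, "image": "app:1.2"}}
live = {"spec": {"replicas": 5, "image": "app:1.2"}, "status": {"ready": 5}}
print(find_drift(desired, live))
print(reconcile(desired, live, auto_sync=False)[0])
```

With `auto_sync=False` the sketch only reports the drift on `spec.replicas`; with `auto_sync=True` it returns a state in which the replica count from Git is restored and the untracked `status` block is left alone.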
The key GitOps advantages are:
Cluster and application configuration versioned in Git
Visualisation of the desired system state
Automatic synchronisation of configuration from Git to clusters (if enabled)
Drift detection, visualisation, and correction
Rollback and roll-forward to any Git commit
Manifest templating support (Helm, Kustomize, etc.)
Visual insight into sync status and history
Role-based access control (RBAC) support
Pipeline integration
Adopting GitOps has enormous benefits but does pose some challenges. Many teams will have to adjust their culture and way of working to support using Git as the single source of truth. Strictly adhering to GitOps processes means that every change must be committed. This can be a challenge when debugging a live environment: at times, direct changes to the cluster are necessary, which requires suspending GitOps synchronisation in some way.
Some other prerequisites for adopting GitOps include:
Good testing and CI processes
A strategy for dealing with promotions between environments
A strategy for Secrets management
Used Tools
The following tools (or specifications) are used for our GitOps approach.
Classic Kubernetes/OpenShift offer a feature called NetworkPolicy that allows users to control the traffic to and from their assigned Namespace. NetworkPolicies are designed to give project owners or tenants the ability to protect their own namespace. Sometimes, however, I have worked with customers where the cluster administrators or a dedicated (network) team needs to enforce these policies.
Since the NetworkPolicy API is namespace-scoped, it is not possible to enforce policies across namespaces. The only solution was to create custom (project) admin and edit roles and remove the ability to create, modify or delete NetworkPolicy objects. Technically, this is possible and easily done, but it shifts the whole responsibility for network security to the cluster administrators.
Luckily, this is where AdminNetworkPolicy (ANP) and BaselineAdminNetworkPolicy (BANP) come into play.
Lately I came across several issues where a given Helm Chart must be modified after it has been rendered by Argo CD. Argo CD runs helm template to render a Chart. Sometimes, especially when you work with Subcharts or when a specific setting is not yet supported by the Chart, you need to modify it later … you need to post-render the Chart.
In this very short article, I would like to demonstrate this with a real-life example I had to implement. I would like to inject annotations into Route objects, so that the certificate can be injected. This is done by the cert-utils operator. For the post-rendering, the Argo CD repo pod will be extended with a sidecar container that watches the repositories and patches them if required.
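The patch step itself, adding an annotation to every rendered Route, can be sketched in a few lines of Python. This only illustrates the idea and is independent of how the sidecar is wired into Argo CD. The annotation key is an assumption about what the cert-utils operator expects, the function name `annotate_routes` and the secret name are hypothetical, and the rendered manifests are represented as plain dictionaries.

```python
import copy

# Assumption: annotation key evaluated by the cert-utils operator.
CERT_ANNOTATION = "cert-utils-operator.redhat-cop.io/certs-from-secret"


def annotate_routes(manifests, secret_name):
    """Return copies of the rendered manifests with Routes annotated.

    Objects of any other kind are passed through unchanged.
    """
    patched = []
    for manifest in manifests:
        manifest = copy.deepcopy(manifest)
        if manifest.get("kind") == "Route":
            metadata = manifest.setdefault("metadata", {})
            annotations = metadata.get("annotations") or {}
            annotations[CERT_ANNOTATION] = secret_name
            metadata["annotations"] = annotations
        patched.append(manifest)
    return patched


# Hypothetical output of "helm template": one Service and one Route.
rendered = [
    {"kind": "Service", "metadata": {"name": "my-app"}},
    {"kind": "Route", "metadata": {"name": "my-app"}},
]

for obj in annotate_routes(rendered, "my-app-tls"):
    print(obj["kind"], obj["metadata"].get("annotations"))
```

Working on copies keeps the original rendering untouched, which makes it easy to compare the output before and after the post-render step.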
The article SSL Certificate Management for OpenShift on AWS explains how to use the Cert-Manager Operator to request and install a new SSL Certificate. This time, I would like to leverage the GitOps approach using the Helm Chart cert-manager I have prepared to deploy the Operator and order new Certificates.
I will use an ACME Let's Encrypt issuer with a DNS challenge. My domain is hosted at AWS Route 53.
However, any other integration can be easily used.
At some point during a GitOps journey, the question arises of how to update a cluster. Nowadays it is very easy to update a cluster using the CLI or the web UI, so why bother with GitOps in that case? The reason is simple: using GitOps you can be sure that all clusters are updated to the correct, required version, and the version of each cluster is also managed in Git.
All you need is the channel you want to use and the desired cluster version. Optionally, you can define the exact image SHA. This might be required when you are operating in a restricted environment.
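As a rough sketch, these few inputs map onto the ClusterVersion object that OpenShift uses to track updates. The helper below builds such a manifest as a Python dictionary; the function name `cluster_version` and the channel and version values are hypothetical, and the field layout should be checked against the ClusterVersion API of the OpenShift release in use.

```python
import json


def cluster_version(channel, version, image=None):
    """Build a ClusterVersion manifest for the desired update.

    The image (a release image pinned by SHA digest) is optional and only
    added when given, for example in a restricted environment.
    """
    desired_update = {"version": version}
    if image:
        desired_update["image"] = image
    return {
        "apiVersion": "config.openshift.io/v1",
        "kind": "ClusterVersion",
        # The cluster-wide ClusterVersion object is named "version".
        "metadata": {"name": "version"},
        "spec": {"channel": channel, "desiredUpdate": desired_update},
    }


# Hypothetical values; in a GitOps setup they would live in Git per cluster.
print(json.dumps(cluster_version("stable-4.14", "4.14.8"), indent=2))
```

Keeping the channel and version as the only inputs means a cluster update becomes a small, reviewable change in a pull or merge request.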
Argo CD or OpenShift GitOps uses Applications or ApplicationSets to define the relationship between a source (Git) and a cluster. Typically, this is a 1:1 link, which means one Application uses one source to compare against the cluster status. This can be a limitation. For example, if you are working with Helm Charts and a Helm repository, you do not want to re-build (or re-release) the whole chart just because you made a small change in the values file that is packaged into the repository. You want to separate the configuration of the chart from the Helm package.
The most common scenarios for multiple sources are (see: Argo CD documentation):
Your organization wants to use an external/public Helm chart
You want to override the Helm values with your own local values
You don’t want to clone the Helm chart locally as well because that would lead to duplication and you would need to monitor it manually for upstream changes.
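To illustrate the scenario of an external chart combined with local values, the following Python sketch builds an Argo CD Application with two sources: the Helm repository that provides the chart, and a Git repository that is only referenced for its values file. All names, URLs and the `openshift-gitops` namespace are assumptions for illustration, and the `sources`/`ref` layout follows the multiple-sources feature as I understand it from the Argo CD documentation.

```python
import json


def multi_source_application(name, namespace, chart, chart_repo,
                             chart_version, values_repo, values_file):
    """Build an Application that takes chart and values from different sources."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: Argo CD runs in the "openshift-gitops" namespace.
        "metadata": {"name": name, "namespace": "openshift-gitops"},
        "spec": {
            "project": "default",
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            "sources": [
                {   # Source 1: the packaged chart from the Helm repository.
                    "repoURL": chart_repo,
                    "chart": chart,
                    "targetRevision": chart_version,
                    # "$values" points to the source with ref "values".
                    "helm": {"valueFiles": [f"$values/{values_file}"]},
                },
                {   # Source 2: the Git repository holding only the values.
                    "repoURL": values_repo,
                    "targetRevision": "main",
                    "ref": "values",
                },
            ],
        },
    }


# Hypothetical chart and repositories.
app = multi_source_application(
    name="my-app",
    namespace="my-app",
    chart="my-chart",
    chart_repo="https://charts.example.com",
    chart_version="1.0.0",
    values_repo="https://git.example.com/team/config.git",
    values_file="clusters/prod/values.yaml",
)
print(json.dumps(app, indent=2))
```

A change to the values file then only touches the Git repository; the chart itself does not have to be re-packaged or re-released.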
This small article describes three different ways with a working example and tries to cover the advantages and disadvantages of each of them. They might be opinionated but some of them proved to be easier to use and manage.