What you should do, can do, and should not do in your CI/CD pipelines

Common CI/CD pipelines can consist of an arbitrary number of steps, offering space for a lot of different actions. In this blog post, I will lay out what I think should be, can be, and should not be done in a CI/CD pipeline.

What you should do in your CI/CD pipeline

  • Run Tests

    Any serious code repository should come with a suite of tests that can be run to verify that the code behaves as expected after changes. This is a fundamental step and one of the original reasons CI/CD pipelines were invented. If any test fails, your pipeline must fail entirely.
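
    As a minimal sketch, such a step can simply propagate the test runner's exit code (assuming pytest here; the test path is illustrative):

    import subprocess
    import sys

    # pytest returns a non-zero exit code as soon as any test fails.
    result = subprocess.run(["pytest", "tests/"])

    # Propagate the exit code so the CI runner marks this step - and with it
    # the whole pipeline - as failed.
    sys.exit(result.returncode)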

  • Build your App

    Another fundamental step in the CI/CD world is to build your app from scratch on every change. This ensures that you are always ready to build (with all build dependencies available). Most people use Docker or some other form of containers these days to get a reproducible, easy-to-ship build.
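
    A minimal sketch of such a build step, assuming Docker and using the git commit SHA as an immutable image tag (the image name is a placeholder):

    import subprocess

    # Use the current commit SHA as the image tag, so every build is traceable.
    sha = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

    # Build from scratch on every change; check=True fails the step if the build breaks.
    subprocess.run(["docker", "build", "-t", f"myapp:{sha}", "."], check=True)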

  • Ship your App

    After the app is built, you should ship it to some live environment. Typically, you would ship to your dev or staging environment first, and later promote the same build to production. Ideally, you should also run your database migrations when shipping your app.
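
    A rough sketch of such a deploy step, assuming Alembic for migrations and a Kubernetes deployment (all names here are hypothetical):

    import subprocess

    IMAGE = "registry.example.com/myapp:abc1234"  # hypothetical image reference

    # Apply pending database migrations before rolling out the new version.
    subprocess.run(["alembic", "upgrade", "head"], check=True)

    # Roll out the new image and wait until the rollout has completed.
    subprocess.run(
        ["kubectl", "set", "image", "deployment/myapp", f"myapp={IMAGE}"],
        check=True,
    )
    subprocess.run(["kubectl", "rollout", "status", "deployment/myapp"], check=True)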

  • Notify the team

    After your pipeline succeeds, you can notify your team. I usually prefer to have notifications on success as well, simply because it is useful for the team to know that something new has shipped. However, I have seen occasions where people omitted success notifications because they were deemed not useful - if everything goes right, it goes as expected, so a notification would not carry any new information.

    What you certainly should do, however, is notify the team of pipeline failures. This is of vital importance, since a broken pipeline stops the team from shipping updates. Fixing the pipeline and restoring the ability to ship new versions should have very high priority at any point in time. You never know when the next urgent request for a change or fix is incoming.

    Notifications could go to your Slack channel, or even via email or SMS.
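
    For instance, a message can be posted to a Slack incoming webhook with nothing but the standard library (the webhook URL is a placeholder):

    import json
    import urllib.request

    WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"  # placeholder

    def notify(text: str) -> None:
        # Slack incoming webhooks accept a simple JSON payload with a "text" field.
        req = urllib.request.Request(
            WEBHOOK_URL,
            data=json.dumps({"text": text}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)

    notify("Pipeline failed - shipping is blocked until it is fixed!")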

These are the very basics of every CI/CD pipeline.

What you can do in your CI/CD pipeline

This is where it gets more exciting. As mentioned above, CI/CD pipelines can consist of an arbitrary number of steps, which leaves a lot of space for creativity.

Here are some examples of what I have implemented in pipelines I have worked on:

  • Security Checks

    What I like to have in my pipeline are some extra security checks.

    You could scan your container images for vulnerabilities with tools like Trivy or Clair. These tools will let you know if your image contains any known vulnerabilities. Clair is also built into the AWS container registry (ECR), and as part of your pipeline, you can query the ECR scan result to find out whether your image is vulnerable, as sketched below. Keep in mind that these scanners only look at installed system packages, not your source code. Typical findings include vulnerable system packages (e.g. curl). Always try to use minimal Docker images like Alpine to reduce the attack surface.
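
    Here is a minimal sketch of querying the ECR scan result with boto3 (repository name and image tag are placeholders):

    import boto3

    ecr = boto3.client("ecr")

    # Fetch the scan findings for the image we just pushed (names are placeholders).
    response = ecr.describe_image_scan_findings(
        repositoryName="myapp",
        imageId={"imageTag": "abc1234"},
    )

    # Each finding carries a severity we can use to decide how loudly to notify.
    for finding in response["imageScanFindings"]["findings"]:
        print(f'{finding["severity"]}: {finding["name"]}')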

    You could also scan your app for vulnerable dependencies. If you just want to know whether your app uses vulnerable dependencies, and you host on GitHub, you can simply use Dependabot, which is built into GitHub. Typical findings include vulnerable dependencies (e.g. helper library XYZ v1.0.5 is known to be vulnerable).

    If you actually want to scan your source code for security issues (a much more complex task), you could use tools like SonarQube. However, these tools are usually quite expensive, so they are more suitable for large organizations. Another problem with actual source-code scanning is that you have to deal with false positives. In a large codebase, tools like SonarQube can output a long list of potential issues, and it will be up to you to decide whether those are real issues or not. Typical findings include real security issues (e.g. concatenating SQL queries without escaping input), bugs (e.g. comparing two strings via == instead of === in JavaScript) and code smells (e.g. empty exception catch blocks).

    Lastly, you could also scan for vulnerabilities in your live environment with tools like OWASP ZAP. After you have updated your live environment (e.g. staging), these tools run tests against it and find potential issues. Typical findings include missing HTTP headers, potential cross-site scripting (XSS) opportunities, and maybe even SQL injection.
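
    One way to run such a scan from the pipeline is the ZAP baseline scan via Docker - a sketch with a placeholder target URL; note that I deliberately do not fail the step here, for reasons explained below:

    import subprocess

    # Run a passive baseline scan against the freshly updated staging environment.
    # zap-baseline.py exits non-zero when it raises alerts.
    result = subprocess.run([
        "docker", "run", "--rm", "-t",
        "owasp/zap2docker-stable", "zap-baseline.py",
        "-t", "https://staging.example.com",  # placeholder target
    ])

    if result.returncode != 0:
        print("ZAP raised alerts - review the report and notify the team.")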

    None of these tools is perfect, and none will find all issues. But except for SonarQube, all the tools I listed are free. So why not use them?

    With that said, the big issue with security scanning in the pipeline is: what do you do when you find an issue? When I first started implementing security checks in CI/CD pipelines, my obvious first thought was: if there is an issue, I must fail the pipeline. However, I realized that for almost all findings, the issue is not new, but pre-existing. Here is an example: we rebuild our app, update all system packages in the build, and Clair finds an issue with curl <=1.5.0 being vulnerable in our container image. Yes, we use curl 1.5.0 in our image, but that is the latest version. There is no fix for it yet. And the previous build that we deployed on our last pipeline run also used curl 1.5.0 or even lower. In all cases I have experienced, the currently deployed software is already vulnerable right now. In those cases, it would not make sense to fail the pipeline. How could our team fix a vulnerability in curl? We can only wait for the curl maintainers to fix it, or replace our deployment with a new version that does not include curl. The same goes for vulnerable dependencies found via Dependabot: our app is usually already vulnerable. In any case, we must notify the team that vulnerabilities have been found, and evaluate their severity.
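
    One pragmatic approach, sketched here with assumed file names and a simplified finding format, is to diff the current findings against the previous run and alert the team only about genuinely new ones:

    import json
    from pathlib import Path

    PREVIOUS = Path("previous_findings.json")  # hypothetical artifact of the last run

    def new_findings(current: list) -> list:
        # A finding that was already present last run is no reason to block shipping.
        seen = set()
        if PREVIOUS.exists():
            seen = {f["name"] for f in json.loads(PREVIOUS.read_text())}
        return [f for f in current if f["name"] not in seen]

    current = json.loads(Path("current_findings.json").read_text())
    for finding in new_findings(current):
        print(f'NEW {finding["severity"]}: {finding["name"]}')  # e.g. forward to Slack

    PREVIOUS.write_text(json.dumps(current))  # baseline for the next run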

  • Changelog Generation

    When we make changes to our app, it can be interesting to know what exactly has changed. While producing a changelog that can be read by non-technical folks is, in my experience, very hard to automate, there are things we can automate:

    What I implemented in a previous setup was producing a diff between the previous and the new OpenAPI specs using openapi-diff and sending it to a Slack channel for the team's visibility. You can try generating such a changelog yourself using my GitHub example; sample output and a rough invocation sketch follow below.

    ==========================================================================
    ==                            API CHANGE LOG                            ==
    ==========================================================================
                                OpenAPI definition
    --------------------------------------------------------------------------
    --                              What's New                              --
    --------------------------------------------------------------------------
    - GET    /users/{user_id}
    - GET    /users/{user_id}/details
    
    --------------------------------------------------------------------------
    --                            What's Changed                            --
    --------------------------------------------------------------------------
    - GET    /users/{user_id}/friends
    Return Type:
      - Changed 200 OK
    Media types:
      - Changed */*
    Schema: Backward compatible
    --------------------------------------------------------------------------
    --                                Result                                --
    --------------------------------------------------------------------------
                       API changes are backward compatible
    --------------------------------------------------------------------------
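
    A rough sketch of how such a diff step might be wired up, using the openapi-diff Docker image (spec paths are placeholders):

    import os
    import subprocess

    # Compare the previously deployed spec against the new one; the output can
    # then be forwarded to Slack (see the notification step above).
    diff = subprocess.run([
        "docker", "run", "--rm",
        "-v", f"{os.getcwd()}:/specs",
        "openapitools/openapi-diff",
        "/specs/openapi-old.yaml", "/specs/openapi-new.yaml",
    ], capture_output=True, text=True)

    print(diff.stdout)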
    
  • Integration Tests

    By the time we have successfully rolled out our new app build to our live environment, we have already established that our app works in isolation, by running our test suite.

    But what if our app depends on other live systems? Are they properly integrated into our app? What if the infrastructure configuration for our app is wrong and leads to errors? Do things really work? To gain more certainty here, we may want to run some integration or smoke tests against, e.g., staging. Tools like Cypress allow us to emulate real user behavior from the browser: we can log in, browse around, and test things as if we were a real user.

    The tricky thing with this kind of integration test is that our live environments are stateful. Our live environment's (e.g. staging's) data is persistent, which makes it harder to write tests that can be run repeatedly in each pipeline run. For example: our integration test logs in a user, adds an item to a list, and checks that the item really exists on the list.

    Our test passes - so far, so good - but what if we run the test a second time? Again we log in, add the item - and our test fails with an error: the item already exists. The item we previously created is still there, since our staging environment persists the data.

    There are ways around this: we could name the item Name <Unix Timestamp> to avoid conflicts, but writing such tests requires great care (a small sketch follows below). Another solution would be to spin up an entirely new environment just for the integration tests, but in many complex systems that is not practical (too many dependencies, dependencies that we cannot run ourselves, etc.).
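
    Here is a small sketch of the timestamp-naming workaround as a plain HTTP smoke test (endpoints, credentials and payloads are entirely hypothetical):

    import time
    import requests

    BASE = "https://staging.example.com/api"  # hypothetical staging API

    # Suffix the item name with a Unix timestamp so repeated pipeline runs never
    # collide with data that earlier runs left behind in the environment.
    name = f"smoke-test-item-{int(time.time())}"

    session = requests.Session()
    session.post(f"{BASE}/login", json={"user": "ci-bot", "password": "..."})  # hypothetical
    session.post(f"{BASE}/items", json={"name": name}).raise_for_status()

    items = session.get(f"{BASE}/items").json()
    assert any(item["name"] == name for item in items), "created item not found"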

What you should not do in your CI/CD pipelines

  • Don't rebuild your app before production

    I once worked on a project where we had a pipeline like this:

    1. Checkout dev GitHub branch
    2. Build app
    3. Deploy to staging environment
    4. Some manual testing ...
    5. Approval by Admin
    6. Merge to main
    7. Build app (again)
    8. Deploy to production environment

    Eventually, a senior consultant in this company told the team that step 7 was essentially a bad thing to do.

    He argued that between steps 4 and 7, dependencies might change, causing the resulting app to differ from the one that was tested, essentially rendering the testing in step 4 unreliable.

    What he suggested was:

    1. Checkout main GitHub branch
    2. Build app
    3. Deploy to staging
    4. Some manual testing (or better automated testing, if available)
    5. Approval by Admin
    6. Deploy the same build from step 2 to the production environment

    This way, we can make sure that what we deploy is what we tested.
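
    With container images, "deploying the same build" can mean promoting the already-tested image by digest instead of rebuilding - a sketch with placeholder registry paths:

    import subprocess

    # The exact image that was deployed to staging and tested, pinned by digest.
    TESTED = "registry.example.com/myapp@sha256:0f3a..."  # placeholder digest

    # Re-tag and push the identical image for production; nothing is rebuilt here.
    subprocess.run(["docker", "pull", TESTED], check=True)
    subprocess.run(["docker", "tag", TESTED, "registry.example.com/myapp:prod"], check=True)
    subprocess.run(["docker", "push", "registry.example.com/myapp:prod"], check=True)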

    Ever since I learnt this, I have built my CI/CD pipelines this way 😎

  • Don't do things manually

    We have a CI/CD pipeline so that we do not need to perform repetitive, error-prone manual steps. We should embrace this idea and automate EVERYTHING in our pipeline. If the build/deploy process involves manual steps (other than maybe pressing an Approve button), it's not good.

    Some things may look hard to automate at first, but in my experience it is generally worth the effort: it saves time and is far less error-prone.

  • Never bypass your pipeline

    This goes together with the advice given above: if your pipeline is broken, you should fix it immediately. You might be tempted to just roll out the next version manually - after all, what could go wrong? A lot can go wrong! Did you run the unit tests? Did you build it the right way? Did the security checks run? What if you accidentally run rm -rf / on your production machine? Just never bypass the pipeline!

Summary

CI/CD pipelines are a core concept of modern application development. The goal is to achieve reproducible, automated testing, building, and deployment, with all necessary steps included. We do not want our developers or infrastructure team to do anything manually after new code is checked into version control.

When applied correctly, CI/CD concepts enable rapid development and a fast feedback loop. Our team can deliver often and reliably, making life better for everyone.