Automated pre-commit code checks with and without Docker

David Rieger
May 13, 2017

Continuous Integration/Deployment pipelines are great. All you have to do is click a button and your code will be built, tested and perhaps deployed.

But pipelines take time to run. It can take minutes before your CI system even picks up a queued job. Watching a pipeline fail after 10 minutes of waiting because of a missing colon in one of your YAML files is no fun.

If you have a proper IDE, or even just a text editor with some nice plugins, you will catch such syntax errors straight away. But the temptation to quickly open a file in plain vi and add a line can be hard to resist, and that’s when mistakes happen.

My solution is static code checks that run automatically on git commit by means of git hooks.

Git hooks are a great way of integrating repetitive, automatable tasks into your development workflow. They are nothing more than scripts that git runs when a certain git command is entered, but they can be very powerful.

There are various types of git hooks, triggered by actions such as clone, pull, merge, commit, or push[1]. For now we will focus on the pre-commit hook, which git commit triggers before it opens the editor for the commit message.

Got an example?

Sure. I use Ansible a fair bit, and Ansible playbooks are YAML files. Ansible’s CLI offers a check for its playbooks that validates more than just the YAML structure (the details are outside the scope of this article).

What I wanted was for this check to run on every playbook that has changed since the last commit and is about to be committed. I wrote a little script[2] that finds all added and modified playbooks and triggers a check for each of them. All I needed to do then was copy this script to .git/hooks/pre-commit and make it executable. Voilà, the check now runs before every commit. If it fails, it shows me the error and aborts the commit.
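To make the idea concrete, here is a minimal sketch of such a hook in Python. It is not the script from the repository[2]. It assumes that every staged .yml or .yaml file is a playbook and that the check in question is ansible-playbook --syntax-check; the function names are made up for illustration.

```python
#!/usr/bin/env python3
# Sketch of a pre-commit hook. Assumptions: all staged .yml/.yaml files are
# playbooks, and "the check" is `ansible-playbook --syntax-check`.
import subprocess
import sys


def select_playbooks(paths):
    """Keep only the paths that look like YAML files."""
    return [p for p in paths if p.endswith((".yml", ".yaml"))]


def staged_playbooks():
    """List staged files that were added (A) or modified (M)."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=AM"],
        check=True, capture_output=True, text=True,
    ).stdout
    return select_playbooks(out.splitlines())


def check(playbook):
    """Run the syntax check; Ansible prints any error itself."""
    result = subprocess.run(["ansible-playbook", "--syntax-check", playbook])
    return result.returncode == 0


def main(playbooks=None, checker=check):
    """Return 0 if every playbook passes, 1 otherwise."""
    if playbooks is None:
        playbooks = staged_playbooks()
    failed = [p for p in playbooks if not checker(p)]
    for p in failed:
        print(f"syntax check failed: {p}", file=sys.stderr)
    return 1 if failed else 0
```

Saved as .git/hooks/pre-commit, made executable, and ended with a line such as raise SystemExit(main()), the script aborts the commit whenever a check fails, because git cancels the commit if the pre-commit hook exits with a non-zero status.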

Ok, I get it, Ansible checks triggered by git hooks, but what has Docker to do with that?

Performance and portability are the keywords.

Instead of running the playbook checks sequentially, I start a bunch of Docker containers and perform the checks inside them, one container per playbook.

This shortens the checks, so I don’t have to wait long before I can enter the commit message and won’t be tempted to skip the checks.
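The one-container-per-playbook approach could look roughly like the following Python sketch. It uses a thread pool that shells out to the docker CLI, which is only one way to do it. The image name is hypothetical and stands for any image with Ansible installed, and the check is again assumed to be ansible-playbook --syntax-check.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical image name: any image that has Ansible installed would do.
IMAGE = "example/ansible:latest"


def docker_command(playbook, image=IMAGE, repo=None):
    """Build the `docker run` call that checks one playbook."""
    repo = repo or os.getcwd()
    return [
        "docker", "run", "--rm",
        "-v", f"{repo}:/work:ro",   # mount the repository read-only
        "-w", "/work",              # run the check from the repository root
        image,
        "ansible-playbook", "--syntax-check", playbook,
    ]


def run_check(playbook):
    """Run one check in its own container; return (playbook, exit code, output)."""
    result = subprocess.run(docker_command(playbook), capture_output=True, text=True)
    return playbook, result.returncode, result.stdout + result.stderr


def check_all(playbooks, run=run_check):
    """Start one container per playbook and wait for all of them."""
    if not playbooks:
        return []
    with ThreadPoolExecutor(max_workers=len(playbooks)) as pool:
        return list(pool.map(run, playbooks))
```

Each worker thread blocks on its own container, so the total time is roughly that of the slowest check plus the container start-up overhead, rather than the sum of all checks.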

These are the results from my code-check benchmarks. The orange line shows the time in seconds (y-axis) it took to check n playbooks (x-axis) sequentially without Docker. The blue line shows the same, but with the checks running in n concurrent Docker containers.

Except for the cases where only one or two checks are needed, running them in containers is significantly faster. And even when I only have one playbook, the advantage of running the check without Docker is negligible.

The second reason for using containers is that I don’t need to worry much about the tools installed on my machine. If a colleague clones my repo, they may not have Ansible installed on their desktop, but with containers this doesn’t matter, since Ansible is installed inside the container.

Is that speed really all just Docker?

Well, yes and no. Unfortunately, starting a Docker container, even in detached mode, takes a while to return the container ID, which we need in order to fetch the logs and return codes of the checks.
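For readers unfamiliar with detached mode, the sequence described here can be sketched with the docker CLI from Python: docker run -d prints the container ID, docker wait blocks until the container exits and prints its exit code, and docker logs returns the output. The helper names below are made up for illustration, and this is a sketch of the CLI flow, not the code used for the benchmarks.

```python
import subprocess


def _docker(*args):
    """Run a docker CLI command; return (exit status, stdout, stderr)."""
    r = subprocess.run(["docker", *args], capture_output=True, text=True)
    return r.returncode, r.stdout.strip(), r.stderr.strip()


def start_detached(image, command, docker=_docker):
    """Start a container in the background and return its ID."""
    status, out, err = docker("run", "-d", image, *command)
    if status != 0:
        raise RuntimeError(err)
    return out  # `docker run -d` prints the container ID on stdout


def collect(container_id, docker=_docker):
    """Wait for the container, then fetch its exit code and logs."""
    _, code, _ = docker("wait", container_id)     # prints the exit code
    _, out, err = docker("logs", container_id)    # container stdout and stderr
    docker("rm", container_id)                    # clean up the stopped container
    logs = "\n".join(part for part in (out, err) if part)
    return int(code), logs
```

Every step is a separate call that has to return before the next one can use the container ID, which is where the start-up delay mentioned above becomes noticeable.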

To speed things up a bit, I wrote a tiny Go application that accesses the Docker API directly and combines Docker with Go’s phenomenal concurrency capabilities.

Control Flow Diagram of automated Ansible playbook checks triggered by a git commit and executed with docker containers

The bad news

Yes, running these checks in parallel makes things faster. However, a pre-commit check should be something simple and lightweight. Maintaining a separate tool is probably not worth the effort.

I decided to get rid of the Go application again and use docker-compose instead. Unfortunately, the results are not good (you can find the benchmark results in the repository[2]). Especially when there are only one or two playbooks to commit (which, in my case, is most of the time), the sequential checks are significantly quicker than the ones using docker-compose.

The bottom line: Docker can, if implemented properly, improve the performance of these checks significantly, but there’s a good chance that the complexity of building and maintaining such an implementation outweighs the benefits.
