Getting started with automated quality assurance
The bad news is that every discussion we’ve had about tabs vs. spaces was a waste of time, and neither of us learned anything useful. The good news is that we won’t need to have that discussion ever again - there are tools which apply idiomatic indentation for pretty much any language under the sun. And the good news doesn’t stop there. That discussion about whether entries in a long list should all be on a single line, broken into long lines, or broken into one item per line? There’s a tool for that. Newline at end of file? There’s a tool for that. How imports should be sorted and grouped? Detecting overly general catch statements, unused variables, or bad spelling? You guessed it. In this article I’ll go through why you might want to automate QA, and some practicalities of how to adopt automated QA tools, so that the team can concentrate on the work that actually matters.
Why
Do any of the following apply to you?
- Participated in a discussion about tabs vs. spaces.
- Found a bug in production which would’ve been obvious at review time if the code had been properly formatted.
- Created a function which had more cyclomatic complexity than the London Underground.
- Committed a file with a syntax error.
- Committed a syntactically valid file with schema errors. For example, a JSON file with a `$schema` property, or an XML file with a `schemaLocation` attribute.
- Committed a spelling mistake, risking making a future `git grep` miss important information, or looking bad in front of clients.
- Committed a script with the wrong line endings, causing incomprehensible runtime errors.
- Received a complaint about misleading, overly complex, or non-idiomatic code.
- Silently cursed someone, maybe even yourself, for using such weird formatting.
You see where this is going - all of these can be avoided by using tools which are widely available, robust, and generally have excellent defaults.
Why not?
- More maintenance burden: making sure the tools keep working.
- Each tool has quirks.
- Some tools use your least favourite configuration format.
- Some big commits. (The next section discusses ways to minimise that.)
Strategy
tl;dr: Start early, improve steadily.
When learning a new set of tools it’s not a good idea to try to adopt all of them at once. Automated QA tools are still a relatively new phenomenon, so conventions are still being developed, and learning how to use one tool does not usually translate into an easier time with the next one.
Make sure to get buy-in before introducing a tool. If someone is sceptical, try a quick demo to see if they like it. One important thing to keep in mind at this stage is that today’s tools are generally extremely robust. Gone are the days when auto-formatting a piece of code had a significant chance of breaking it. Of course I can’t speak for all tools, but anything mainstream has almost by definition been tested on thousands of projects already.
After introducing one tool I’d recommend looking for another one which is a small but useful step forward. Any time we find some particular part of development tedious, chances are someone has developed a tool to avoid most of that work. A quick search for “CI [the task]” (without quotes) in a search engine should find something relevant. For example:
- “CI Rust linter” finds Clippy
- “CI Bash formatter” finds shfmt
Make sure everyone working on the project has time to get familiar with new tools before introducing another one. Otherwise you risk becoming the “Keeper of the Tool”, leading a solitary life in a silo.
Don’t be afraid to revert or change tools. Sometimes the tool is too painful to work with. Many years ago I tried a Java formatter. But rather than sensible defaults, the first thing I had to do was to choose between a bunch of formatting standards, none of which I was familiar with because I was just getting started. I won’t name names, but I still run into this with “modern” tools, having to do a bunch of obscure configuration just to get started. Other times the community default changes or crystallises, such as nixfmt-rfc-style recently becoming the default Nix formatting tool.
Sometimes two tools overlap in functionality. For example, I’d recommend using isort with Black, even though we have to manually configure isort to use a style compatible with Black. Other times the tools refuse to work together, and we have to choose between them. 🤷
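As a concrete illustration, here is one way that compatibility can be wired up when both tools run under pre-commit: passing isort’s documented black profile as an argument to its hook. This is only a sketch; the rev is an example and should be pinned to whatever release you actually use.

```yaml
# Fragment of .pre-commit-config.yaml: isort configured to match Black’s style.
# The rev below is an example, not a recommendation; pin it to a current release.
repos:
  - repo: https://github.com/PyCQA/isort
    rev: 5.13.2
    hooks:
      - id: isort
        args: ["--profile", "black"]
```

The black profile makes isort follow Black’s line length and trailing-comma conventions, which is exactly the overlap in functionality described above.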
If a few hours with a tool doesn’t give much benefit, just stash that work. Maybe look into it again in a month or two, when the original experience has faded a bit. At the same time, we shouldn’t feel obliged to introduce all the tools we possibly can - some might just be more trouble than they are worth, or don’t fit how we want to work. For example, I like the idea of prose style checkers, but not of excising the word “is” from my blog!
The earlier in a project automation is introduced, the better. Introducing a formatter usually results in a single commit with a big diff, which can make it harder to explore the version control log. That said, if a project is in active development it’s probably going to be around for a long time yet, so we should weigh the one-time cost of such a big diff fairly against the repeated return on investment from automation. In the golden words of Randall Munroe, Is It Worth the Time? (But please also consider the improvement to developer experience! Time is not the only dimension worth optimising for.)
When working on a project with many developers, we need to make the introduction of a new tool as painless as possible, which means being prepared to learn the tool in some depth before even suggesting it for production use. We might have to showcase it, discuss any quirks (slow speed, bad defaults, workarounds for common issues), and create a plan for how to introduce it. This might involve setting up temporary logic to only apply the tool to new files, so that the team can get used to it before committing to the big diff resulting from applying it to the entire repository. Then we might apply the tool to all changed files. Make sure developers apply these changes in a separate commit or branch before the changes they are working on, so that it’s easy to review the formatting changes separately from any semantic changes. This should organically lead to a better state, and after a while we can introduce a single commit (or small series of commits) to finish the job and tear down any temporary code.
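One possible way to implement that temporary “new files only” logic is pre-commit’s per-hook files pattern. In the sketch below the Black repository and hook id are real, but the directory path is purely hypothetical and the rev is just an example.

```yaml
# Fragment of .pre-commit-config.yaml: temporarily restrict Black to one
# directory (hypothetical path) until the repo-wide reformatting commit lands.
repos:
  - repo: https://github.com/psf/black
    rev: 24.4.2  # example only; pin to a current release
    hooks:
      - id: black
        files: ^src/new_module/  # hypothetical path; widen or remove later
```

Removing the files line later is the “tear down any temporary code” step mentioned above.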
Which tools?
This isn’t really the best place to go into any depth (future articles, perhaps), but I’d recommend these tools to anyone who wants their team to be able to concentrate on the important parts of the work:
- pre-commit can be used in CI to run all linters and formatters on the entire repo with a single command, `pre-commit run --all-files`. Locally, `pre-commit install` sets up hooks that run only on the changed files at commit time, so issues get fixed quickly. All of the following tools work with pre-commit; see my pre-commit configuration for some examples, and the sketch after this list.
- pre-commit-hooks also has a grab-bag of useful hooks.
- EditorConfig is a simple way to tell all modern editors how to do the basics.
- check-jsonschema can verify conformance with both JSON and YAML schemas out of the box.
- gitlint checks that commit messages conform to your requirements.
- Prettier formats not just code, but also data files and markup.
- Vale checks prose rules.
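To make the pre-commit entry above concrete, here is a minimal .pre-commit-config.yaml wiring up a couple of the tools from this list. The rev values are examples only and should be pinned to current releases.

```yaml
# Minimal .pre-commit-config.yaml combining pre-commit-hooks and check-jsonschema.
# The rev values are examples; pin them to the releases you actually use.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: end-of-file-fixer     # enforce newline at end of file
      - id: trailing-whitespace   # strip trailing whitespace
      - id: mixed-line-ending     # catch wrong line endings
  - repo: https://github.com/python-jsonschema/check-jsonschema
    rev: 0.28.4
    hooks:
      - id: check-github-workflows  # validate workflow files against their schema
```

In CI, `pre-commit run --all-files` exercises every hook against the whole repository; locally, `pre-commit install` keeps commits fast by checking only the files being committed.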