Faux idempotence

Writing idempotent code is great. It should always result in the same final state, so it’s easy to test. And a failed run shouldn’t affect the next one, making it reliable and safe. But you may have noticed that I used “should” in both places, because code which looks idempotent may not be.

Update: Several Hacker News commenters corrected my use of “idempotence” in this article. I’m sorry for using the word incorrectly. Please ignore the literal use of “idempotent”, and instead consider this a cautionary tale that:

every operation in a script has to be idempotent for the script to be idempotent, and
even when a script is truly idempotent, Murphy’s law makes sure that almost every attempt at applying X twice instead ends up applying XYX, where Y is out of your control.

This article was inspired by “How to write idempotent Bash scripts” by Fatih Arslan. It is not at all my intention to dunk on that article in particular (there are lots of good tips in there), but rather to illustrate that actual idempotence is harder to achieve than the article suggests, and we should be careful not to declare some code idempotent when it isn’t.

The first example of idempotent-but-not-really code in the article is touch example.txt. The author takes care to mention one way this is not idempotent: it updates the file’s modification time. But there is a more subtle way this is not idempotent, because it depends on state which you are generally not in complete control over. For example, if someone changes the access rights of the file so that you no longer have access to modify it, touch will fail:

$ cd "$(mktemp --directory)"
$ touch example.txt
$ sudo chown nobody example.txt
$ touch example.txt
touch: cannot touch 'example.txt': Permission denied

The obvious objection to this is that of course the root user could sabotage your process, because it has full system access. But the same would happen if the filesystem is mounted read-only between the first and the second touch, which can happen automatically, for example if the system detects any issues with the storage medium.
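The same kind of failure can be reproduced without root access. The following is a minimal sketch, not taken from the original article: it assumes a scratch directory from mktemp, and it simulates the outside interference (the Y in XYX) by removing the parent directory between two identical touch calls. The data directory and the try_touch helper are made-up names for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run touch and report whether it succeeded.
try_touch() {
    if touch -- "$1" 2>/dev/null; then
        echo ok
    else
        echo failed
    fi
}

workdir="$(mktemp -d)"
mkdir "$workdir/data"

# X: the supposedly idempotent operation.
first="$(try_touch "$workdir/data/example.txt")"

# Y: something outside the script's control removes the directory.
rm -r "$workdir/data"

# X again: same command, different outcome.
second="$(try_touch "$workdir/data/example.txt")"

echo "first touch: $first"
echo "second touch: $second"
```

The first call reports ok and the second reports failed, even though the command itself never changed.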

Another way this is not idempotent is that “create the file if it doesn’t exist” isn’t the same as “create the file if it doesn’t exist, or empty it if it does exist” (> example.txt). touch example.txt leaves the contents of example.txt alone, so if your previous run added some contents to it your system is now in a completely different state from the last time you ran touch example.txt. This is a common problem, and is easy to demonstrate:

touch example.txt
while some_command
do
    echo foo >> example.txt
done

It’s not idempotent, because every run adds more content to example.txt. So touch example.txt might be idempotent in at least one sense on its own, but that’s not usually what you care about. There’s not much value in individual commands being idempotent in the way you care about (file existence in this case); rather, the entire process which could be restarted needs to be idempotent.
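One way to make the whole process, rather than just the touch, produce the same file contents on every run is to rebuild the file from scratch and rename it into place. This is a sketch under assumptions: write_lines is a hypothetical stand-in for the loop above with a fixed number of iterations, and regenerate is a made-up helper name. It is still subject to everything else in this article (a full disk or a read-only filesystem will break it just the same).

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical stand-in for the loop above: prints a fixed number of lines.
write_lines() {
    local count="$1"
    while [ "$count" -gt 0 ]; do
        echo foo
        count=$((count - 1))
    done
}

# Build the content in a temporary file next to the target, then rename it
# over the target, so earlier runs leave no leftover content behind.
# Caveat: mktemp creates the file with restrictive permissions, which the
# renamed file keeps.
regenerate() {
    local target="$1"
    local tmp
    tmp="$(mktemp "${target}.XXXXXX")"
    write_lines 3 > "$tmp"
    mv -- "$tmp" "$target"
}

cd "$(mktemp -d)"
regenerate example.txt
regenerate example.txt
wc -l < example.txt
```

After two runs the file still holds three lines, whereas the touch-and-append version would hold six.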

An example which keeps cropping up in test pipelines is that ideally you should be able to run as many pipelines as you like, simultaneously (for example, multiple branches) or repeatedly (for example, after a failure because of a resource out of your control). At the start of the project this usually works fine, but you might run into various issues:

A test system runs out of memory, disk space, inodes, or any other finite resource, not because of the code under test but because of all the resources used by other processes, previous and current. Unless you’re a sole developer in charge of your own test system there’s usually not much you can do about this once it becomes a problem. But at least this can be largely avoided by creating fast tests, small systems, and monitoring all sorts of finite resources during test runs to learn of anything about to run out.
Your cloud provider only allows you to create five frobnicators on your account, and each of your pipelines creates one frobnicator. Now you have to choose: ask your cloud provider to bump the limit (which will still be pretty low, and might cost extra), share frobnicators between runs (after which your tests are no longer independent), avoid creating frobnicators (usually not an option, could take a lot of effort redesigning your application, and might run into different limits), or just live with it, making sure never to run more than five pipelines simultaneously, possibly spending lots of time either implementing some sort of otherwise pointless limiter or re-running failed jobs.
You control frobnicator IDs, but frobnicator IDs are global per account¹. So now you need a naming scheme and some way to spread the word about the ID from the process generating the ID to the rest of the architecture.
Frobnicator IDs are global across accounts². Now you need to make sure to be even more clever about your IDs, to make sure nobody else, ever, across the whole provider, comes up with the same ID.
Frobnicator IDs are global in some way, but your customers need to know about them³. Now you might have an additional restriction that your production IDs should be memorable, so you might end up with a different scheme from your test systems (since you typically have N test systems per production system, so a human readable naming scheme won’t scale), and you need to make damn sure never to drop that name in case someone else scoops it up. At best, they’ll hold up your pipelines while you change the ID or negotiate the return with them. At worst, they use it to impersonate you⁴.
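As a rough illustration of the naming-scheme problem for test systems, here is one possible approach, sketched under assumptions: the test- prefix, the new_frobnicator_id helper, the example branch name, and the frobnicator-id file are all hypothetical, and a real provider will have its own rules for ID length and allowed characters.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical helper: build an ID from a fixed prefix, a sanitised branch
# name, and a random suffix.
new_frobnicator_id() {
    local branch="$1"
    local slug suffix
    # Lower-case the branch name and replace anything except a-z and 0-9
    # with '-'.
    slug="$(printf '%s' "$branch" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9' '-')"
    # Eight random bytes, printed as 16 hex digits.
    suffix="$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')"
    printf 'test-%s-%s\n' "$slug" "$suffix"
}

# Generate the ID once, then "spread the word" through a file that later
# pipeline steps read instead of recomputing the ID.
id_file="$(mktemp -d)/frobnicator-id"
new_frobnicator_id "feature/Login-Page" > "$id_file"

frobnicator_id="$(cat "$id_file")"
echo "$frobnicator_id"
```

The random suffix only makes collisions unlikely, not impossible, and because the ID differs on every run, a rerun has to either reuse the stored ID or clean up the resource created under the old one.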

The upshot is that a process being “idempotent” comes with a huge caveat: unless you’re working on extremely high quality code (I’m talking SQLite or Mars rover, not 100% test coverage and linted) there are probably many ways in which, and circumstances under which, your code isn’t idempotent, and as developers we should be honest with ourselves and stakeholders about the limitations of producing software with limited resources.

¹ This applies to a bunch of AWS resource names.
² See for example AWS S3 bucket names.
³ AWS S3 bucket names, role names, etc.
⁴ See for example domain drop catching.