tl;dr: Being open about what you’re open about is helpful to everyone involved, including maintainers.
Technically, very little is required to call something “free/open source
software”. The developer community consensus[1] seems to require
exactly two things: the source code and a license permitting sharing and
modification, together somewhere online. Back in the days of source tarballs,
./configure && make && make install, that was pretty much the norm.
However, there are many ways a project can be open beyond just providing source code and a license. Let’s look at a few.
Before you shoot me an angry email, I’m not saying every project should have all the features below. Some of them are overkill for small projects. Some of them involve a lot of work to set up, maintain, and interact with the users. In other cases it might be policy not to spend time interacting with the community. And having a badly implemented or zombie feature, like all those abandoned user forums out there, is worse than not having it in the first place. Make sure you have the necessary resources and buy-in for at least the medium term if you’re going to go beyond the basics.
Let’s start with the absolute baseline: the source code and license can be downloaded from anywhere and extracted with extremely basic tools. Actually, hold on. A lot of projects fail even these entry-level criteria:
- Some projects have no explicit license. In some jurisdictions this could make life difficult for both the original author and contributors.
- The source is hard to get to. Maybe it’s on an unreliable host or only available via an unreliable connection. Maybe the provider requires you to register before giving access. Such a registration process probably requires an email address, which needs to be verified, which means you have to provide a valid email address, their server has to accept it[2], they need to actually send you a verification email, your email provider has to accept the email, you have to react to the email within the time limit, and the system has to actually work when you click the link or respond to verify. Finally, some providers require you to explicitly accept the license before letting you download.
- The download is corrupted in some way. This is not something that only happens in oppressive regimes. SourceForge did it[3], and at that point they were still a big provider of open source software. Scratch that, a lot of projects are still on SourceForge[4], despite this appalling behaviour. Other package providers still support plain HTTP or even FTP, and FTP must die! Anyone between you and the server could corrupt the download, for fun, profit, or worse.
- Some projects use uncommon or proprietary tools to wrap their source code releases. Ever heard of WinRAR? It uses a proprietary compression algorithm. The decompression source code is available, but its license forbids using it to build a tool that produces RAR files.
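For the corrupted download case, here’s a minimal sketch of checking a file against a published SHA-256 checksum. It assumes the project publishes such a checksum and that you fetched it over a channel you trust more than the download itself (a checksum served over the same plain HTTP connection proves very little). The function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so large tarballs don't have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare a downloaded file against a checksum published by the project."""
    # Normalise case and whitespace, since checksums are often copy-pasted.
    return sha256_of_file(path) == published_hex.strip().lower()
```

A mismatch only tells you that something changed between the publisher and you, not what or why. A cryptographic signature is a stronger guarantee, if the project offers one.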
I’m not saying the problems above are common. Thankfully not, but they do exist. The fact that they are dying out means the world has moved on. But we can do so much better than this.
An obvious step up from a tarball is a complete, public, and detailed commit log. Potential and actual users can learn a lot from looking through the logs:
- How old is the project?
- Who was involved at what time?
- Is the project still active?
- What were the big pivot points?
- How much do the authors care about various aspects of the code, like speed, security, UX, maintainability, etc.?
Can your users figure this out from the publicly available commit log?
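The first three questions can be answered mechanically from a public log. Here’s a rough sketch, assuming `git` is installed and the repository is cloned locally. The function names are hypothetical, and the format string asks `git log` for the author timestamp and author name separated by a tab.

```python
import subprocess
from collections import Counter
from datetime import datetime, timezone


def read_commit_log(repo_path="."):
    """Return one 'unix timestamp<TAB>author name' line per commit."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%at%x09%an"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def summarise_commits(lines):
    """Summarise project age, latest activity, and who was involved."""
    dates, authors = [], Counter()
    for line in lines:
        stamp, _, author = line.partition("\t")
        dates.append(datetime.fromtimestamp(int(stamp), tz=timezone.utc))
        authors[author] += 1
    if not dates:
        # An empty log has no first or last commit to report.
        return {"commits": 0, "first_commit": None, "last_commit": None, "authors": []}
    return {
        "commits": len(dates),
        "first_commit": min(dates),
        "last_commit": max(dates),
        "authors": authors.most_common(),
    }
```

Calling `summarise_commits(read_commit_log("path/to/clone"))` gives a quick overview. The pivot points and what the authors care about still take a human reading the actual commit messages.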
Having publicly visible issues is probably the most obvious and ubiquitous open feature of modern repositories. The days when you’d need to scour a barely searchable email archive for duplicates are almost gone, and good riddance. It’s difficult enough to search even a good issue tracker, without having to simultaneously search through support requests.
(But wait, a lot of projects do use issues for support requests. Welp. That’s another one for the list: having somewhere separate from your work log where people can ask questions (and expect answers within a few days) about your software is really useful. Tags can help, but since every project implements and uses tags in different ways, it is far from obvious how to search through the right subset. In many cases it might be enough to simply tell people about Stack Overflow (or the relevant Stack Exchange site), and, if you’re keen, to monitor the relevant tag feed. Zero maintenance, and the community might even do a lot of the work.
How do you know when you’ve got this right? The vast majority of the entries in your issue tracker are at least actionable - they describe a specific problem which is solvable.)
Back to issues, it’s important for community building that they actually include all your (past, current, and future) work. If you hide away important tickets in some private system you’re sabotaging your chances of having healthy community interactions. That’s not to say every single thought or utterance about the project should be public. A good candidate for a publicly visible ticket might be something you intend to do in the near future, as opposed to just a weakly held opinion or far future vision of ultimate possibility.
On a similar note, any documentation which would be useful to new developers should probably be public. Architecture diagrams, engineering decisions, code review guidelines, release processes, development setup instructions, you name it. The gold standard here is that a new developer who is familiar with the technology but not this particular project should be able to start contributing within hours, not weeks.
It’s still not all that common to let end users see your full pipeline configuration and run logs. In some cases there might be legitimate security reasons for keeping them private, but it’s easy to see how hiding them could send the opposite signal. If you don’t trust your developers to keep secrets out of public configuration and logs, why should users expect them to understand security?
Some CI/CD systems make it almost impossible to be open in a useful fashion. Jenkins and other GUI-configured systems are the worst offenders. To independently reproduce a Jenkins pipeline, you basically have to install the same version the project is using, then manually configure it the same way. GitHub is better: at least you can fork a repo on GitHub and run the same jobs. But that’s the rub: you can’t run a GitHub pipeline on any other platform, including locally. There are third-party solutions, but like any third-party solution they’ll at best be playing catch-up with the closed-source GitHub implementation. GitLab is better than GitHub in this respect: a lot of the configuration can be copied and pasted locally or into another CI/CD system configuration with minimal work. Probably the worst part, which won’t apply to a lot of open source projects, is having to chase down and set up missing secrets, which can be time-consuming whack-a-mole unless the commands explicitly call out any missing variables.
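One way to avoid the whack-a-mole is to have the pipeline check for everything it needs up front and report all the gaps in one go. This is only a sketch: the function names and the variable names in the usage note are hypothetical, and it assumes secrets reach the job as environment variables.

```python
import os


def missing_variables(required, environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


def require_variables(required, environ=None):
    """Fail once, naming every missing variable, instead of one at a time."""
    missing = missing_variables(required, environ)
    if missing:
        raise SystemExit("Missing required variables: " + ", ".join(missing))
```

Calling something like `require_variables(["DEPLOY_TOKEN", "REGISTRY_PASSWORD"])` as the first step of a job means anyone reproducing the pipeline sees the full list of what to set up, rather than discovering the secrets one failed run at a time.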
Which other openness features would you want to see more of?
Most projects do a lot of the above, but it’s not obvious from looking at the front page. Why not mention what you do to be open front-and-center, in your project readme? Or if you have reasons not to be open in some way, why not just say so explicitly? If you don’t intend to fix issues, why not disable that feature? Let anyone interested know what they can expect, and don’t be surprised if it helps build a healthy community.
[1] As opposed to the Open Source Initiative or GNU definitions. I’m not a lawyer, so let’s just leave it at that.
[2] Throwaway email addresses are nowhere near as ubiquitous or simple to set up as they should be, and the website might refuse such addresses.
[3] I don’t know whether the SourceForge debacle only involved binary release files, but once they’ve seen fit to wrap installers in adware, what reason do you have to trust that they won’t modify source downloads?