Towards a healthy ecosystem
A healthy ecosystem around MariaDB Server involves an active community. Lots of happy code contributors cause fast development of new functionality, as well as increased adoption by users. Users see the vibrancy of the contributor space as a sign of health, rightfully so. Hence, preventive health care “with daily exercise and good eating habits” is high on the agenda of MariaDB Foundation.
But in practice, improving MariaDB’s habits around code development is about as easy as improving individual life habits in general, particularly if you are under public scrutiny. Let me here share a few thoughts on our progress, and solicit some input.
The background: Openness, Amazon and MariaDB plc
First, openness is one of the three key values in MariaDB Foundation’s mission statement. It’s also a commonly accepted value, which in principle has all of us working on MariaDB Server pulling in the same direction. That’s a good thing, but openness is also a fluffy word that people put different meanings into.
Second, Amazon AWS is a much appreciated newcomer into the MariaDB developer ecosystem. OK, “newcomer” is a relative concept, since AWS has been actively contributing for well over a year – as witnessed by our Chief Contribution Officer Andrew Hutchings’ quarterly overviews.
Third, MariaDB plc remains the largest code contributor, measured in nearly any dimension – commits, lines of code or number of committers. Well over 90% of the code committed to MariaDB Server in the last five years has come from MariaDB plc, often from people who have been with the project for over a decade.
Different expectations
Setting the stage: MariaDB Foundation is caring for an ecosystem with many code contributors. One major player (MariaDB plc) in principle supports openness but established its working practices during pre-GitHub times, and another major player (Amazon AWS) is in principle aware of the long roots of the old habits of the project but expects swift progress towards modern best Open Source practices – centred around GitHub.
The good part is that I see no conflict of interest here, just different expectations.
MariaDB Foundation’s role
MariaDB Foundation holds the keys to the MariaDB Server code repository and sets the rules of engagement for the community. However, there are just ten of us and we don’t have sufficient expertise in the code base to review and merge all contributions without the help of MariaDB plc employees. This means we are more mediators and coaches than we are regulators or judges.
As the CEO of MariaDB Foundation, I’m trying to represent contributor interests towards two types of cooperation partners within MariaDB plc. Somewhat simplifying, it’s the Managers and the Developers. I am trying to persuade the Managers to allocate resources and create practices that are contributor friendly, and I am trying to make it easy and rewarding for the Developers to work with contributors.
Contributor expectations
Contributors, not just those at Amazon, expect to be able to work with GitHub pull requests. They expect timely reviews, well documented at the Pull Request level. They also expect to be able to help out themselves by reviewing other people’s code. They expect the Buildbot tree to be green, meaning that the code base is free of build failures on every platform, not just the platform they are developing on themselves. They would like acknowledgement in the code base, i.e. code written by a contributor should be attributed to them.
Overall, code contributors from outside MariaDB plc expect to be treated similarly to code contributors from inside MariaDB plc.
As such, MariaDB Foundation is aligned with these contributor expectations. The expectations are reasonable.
Reviewer expectations
Reviewers have somewhat different expectations, but these are also reasonable. They want an efficient process, with little overhead. Historically, they have committed code directly to the development tree, with commits in certain cases being accepted even without a single review. They are happy to see Pull Requests being used, as long as they are not mandatory. They too expect the Buildbot tree to be green, but they may at times push code into a non-green tree. They are happy to review code, but have their own development deadlines set by their managers. And in the review process, they may find that only part of a Pull Request can be merged, causing problems in proper code attribution.
Overall, long-time reviewers have their own claim on development culture and process. The choice to use pull requests is theirs. We can encourage it in a patient and understanding way, by showing the value of community review and by pointing to technical gains like auto-merge.
MariaDB Foundation expectations
We at MariaDB Foundation are in between.
Overall, the reviewers who work for MariaDB Foundation are happy to work with exactly the same processes as external contributors, because we see value in making those processes as efficient as possible. We are happy to eat our own dog food, because we are incentivised to make sure it is tasty enough for ourselves.
In summary: We want contributors, and we want to make it meaningful as well as easy to contribute to MariaDB Server.
Forgive my arrogance, but MariaDB is different
At the same time, we have considerable understanding for some conservatism on the side of the reviewers.
Let me put it this way: MariaDB Server is a core part of the infrastructure of the Internet and the entire IT world.
Like many core parts of IT infrastructure, we struggle to find people willing to assist in testing our server pre-release, when possible defects can be detected and corrected before they end up in production.
Like PostgreSQL, MariaDB can be compared with, say, the Linux kernel or gcc. It shouldn’t be compared with Angular, Vue, or React, nor with any “young” Open Source project that does weekly releases and drops features left and right.
Backward compatibility is of utmost importance to MariaDB Server, and it doesn’t come for free. There are systems that are almost always breaking upon upgrades; we pride ourselves on avoiding such issues.
This means that our cost for accepting bad code is significantly higher than for a less fundamental project. We have to choose appropriate role models.
Finding a balance
Mind you, we have already taken several steps towards meeting the expectations of “modern” contributors. We do take Pull Requests (even if they aren’t mandatory). Pull Requests are also the chosen method for large pockets of users within MariaDB plc (but not all). Review comments are left publicly (even if sometimes in the commit messages, not Pull Requests). We use Buildbot for CI (even if the tree isn’t always green).
It is true that we’re not completely on GitHub, as our feature and bug tracking is on Jira. It is true that some reviews are made face-to-face, over the phone, or with screen sharing – with comments in some cases not being made persistent and visible for all.
But we are in the process of taking further steps:
First, we are making an effort to turn the red Buildbot tree green, and to establish processes such that it remains green. This should simplify life for all, and over time enable things such as automatically merging code that has passed reviews, according to settings to be defined in GitHub.
Second, we have set the goal of making commit messages more descriptive. This should help all individual contributors learn from each other.
Third, we are figuring out the processes to be able to have an intermediate level of reviewers, who provide suggestions, rather than the “final judgement” upon committing. This should over time grow the pool of full reviewers.
Best practices work both ways. The practices need to be good for the elders as well as for the new contributors. MariaDB Server is not your run-of-the-mill Open Source project, so developing the practices will take time. But we’re on it.
And there are the devils in the details
The above means we have to proceed with good judgement. On top of this, there are a number of peculiarities and details.
The trickiest detail is code attribution. I mentioned that a PR may consist of a mergeable part and a not-yet-mergeable part. What freedoms does the contributor want to give the reviewer when it comes to splitting the PR into chunks? If a reviewer can take a piece of a great contributor PR and use it in the reviewer’s own PR (or commit), is it OK to give attribution merely in the form of a “co-author” tag on GitHub?
Also, what should happen with edits done by the reviewer? Should the PR first be accepted (without reviewer edits), followed by another push containing the reviewer edits? Should the reviewer edits be entered into the original PR, which creates some ambiguity as to whose code it is? Or should the reviewer bounce the PR back to the contributor with review comments, until the contributor returns it in such a state that it can be pushed as-is? Here, swift progress and code attribution are at odds.
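For readers unfamiliar with the “co-author” tag: on GitHub it is a `Co-authored-by:` trailer at the end of the commit message, separated from the body by a blank line. Below is a minimal sketch of what a reviewer's commit message could look like when it reuses part of a contributor's PR. The helper name `with_co_authors`, the commit text, and the contributor's name and address are all hypothetical, chosen only for illustration; this is not a description of an agreed MariaDB process.

```python
def with_co_authors(message: str, co_authors: list[tuple[str, str]]) -> str:
    """Append GitHub-style Co-authored-by trailers to a commit message.

    `co_authors` is a list of (name, email) pairs. The helper itself is a
    hypothetical illustration; only the trailer format is GitHub's.
    """
    trailers = [f"Co-authored-by: {name} <{email}>" for name, email in co_authors]
    if not trailers:
        return message.rstrip() + "\n"
    # Trailers go at the very end, after one blank line.
    return message.rstrip() + "\n\n" + "\n".join(trailers) + "\n"


# Hypothetical commit text and contributor, for illustration only.
msg = with_co_authors(
    "Example subject line\n\nIncludes the mergeable part of a contributor's pull request.",
    [("Jane Contributor", "jane@example.org")],
)
print(msg)
```

Whether such a trailer is sufficient acknowledgement is exactly the open question; the sketch only shows what the mechanism looks like.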
Things may be further complicated by a requirement on the code to be always green even in intermediate stages.
And next?
Luckily, we don’t need to reinvent a lot of wheels. We need to find the right role models, and apply them to our situation. Suggestions are welcome, and not just from among fellow GitHub projects.
All in all, we believe we’re progressing towards healthier practices. We ask both our contributors and our reviewers to be patient and help us improve practices, for the benefit of all.