What we’ve developed in 2024

As Chief Development Officer of the MariaDB Foundation, I’ve worked to ensure that our development efforts focus where they matter most. On this final day of 2024, I want to reflect on the significant technical achievements we’ve accomplished and the collaborative processes that made them possible.

Our work this year has been driven by the goal of building a stronger, more engaged MariaDB community. By sharing our progress and learnings, I hope to provide insights that may inspire and support other open-source projects.

Finally, I’ll outline the Foundation’s vision for 2025 and how we plan to bring it to life.

Buildbot: Our CI/CD Pipeline Is a Core Pillar of Development

The first key project I want to highlight is the CI/CD pipeline for MariaDB Server, Galera and related ecosystem software such as WordPress, PHP and others. This pipeline resides at https://buildbot.mariadb.org and is what core developers use on a daily basis.

For those who may not know, whenever a developer pushes a change to the MariaDB Server GitHub repository or creates a pull request, that commit is tested on more than 100 different test and architecture combinations. We have to do this because MariaDB Server is what I call “foundational software”. We have a responsibility to our users to offer a stable database. And since we are nearing 3 billion downloads on Docker Hub alone, we cannot afford regressions.

We have a set of core test combinations (we call these protected builders) that enforce branch protection. No commit can be merged into the MariaDB Server repositories unless these tests pass, no exceptions. For any project looking to improve its code quality, I recommend checking out branch protection and status checks on GitHub.
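
To make the gating rule concrete, here is a minimal sketch of the decision a required status check encodes. It is not our Buildbot or GitHub configuration: the builder names and the can_merge helper are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

# Hypothetical protected builder names, for illustration only.
PROTECTED_BUILDERS = frozenset(
    {"amd64-debian-12", "amd64-fedora-40", "aarch64-ubuntu-2404"}
)


def can_merge(
    results: dict[str, bool],
    required: frozenset[str] = PROTECTED_BUILDERS,
) -> tuple[bool, list[str]]:
    """Decide whether a commit may be merged.

    `results` maps a builder name to True (passed) or False (failed).
    A required builder that failed, or has not reported at all, blocks
    the merge. Builders outside `required` never block it.
    """
    blocking = sorted(name for name in required if not results.get(name, False))
    return (not blocking, blocking)


if __name__ == "__main__":
    report = {
        "amd64-debian-12": True,
        "amd64-fedora-40": True,
        "aarch64-ubuntu-2404": False,
        "experimental-builder": False,  # not protected, so it is ignored
    }
    ok, blocking = can_merge(report)
    print("mergeable" if ok else f"blocked by: {', '.join(blocking)}")
```

The useful property is that a missing result is treated the same as a failure, so a builder that never ran cannot let a commit through.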

We’ve also started experimenting with GitHub’s Auto-Merge feature. So far, the results have been promising, though there are occasional hiccups where some PRs don’t auto-merge even if all tests pass. These cases, however, are rare.

We’re making strides to make contributors’ lives easier. By creating a pull request, contributors see their patches tested against our comprehensive CI/CD suite. As builders and their tests mature, we add them to this pipeline to provide broader coverage while minimizing false positives for new contributors.

Internally, we’ve done quite a few things:

  • Migrated the entire deployment to a Docker Compose stack. This streamlines Buildbot updates and includes our CrossReference plugin for tracking test failures. You can find the CrossReference plugin code on GitHub as well.
  • Migrated the master machine to newer hardware, with more storage space. This will allow us to store more build artifacts and help developers track down sporadic test failures.
  • Added a dedicated MinIO instance for S3 storage testing.
  • Ported PAM authentication tests from the legacy CI/CD system.
  • Leveraged GitHub Actions to build our testing images in a development environment. With the help of container tags, we can move them to production instantly: all we do is tag the dev image as its production equivalent, for example ubuntu2404_dev to ubuntu2404. This has reduced our downtime considerably, particularly when distributions ship larger updates.
  • Official MariaDB Docker images are now also available built on a Red Hat-based system; see Daniel’s post for more details.
Figure: A community pull request to MariaDB Server with all status checks passing.
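
The tag-based promotion described in the list above can be sketched as follows. This is an assumption-laden illustration rather than our actual tooling: the repository name registry.example.org/ci-images and the promotion_commands helper are hypothetical, and only the ubuntu2404_dev to ubuntu2404 naming comes from the text. The helper merely builds the standard docker pull, docker tag and docker push command lines; it does not run them.

```python
from __future__ import annotations

import shlex


def promotion_commands(
    dev_tag: str,
    repository: str,
    suffix: str = "_dev",
) -> list[list[str]]:
    """Build the commands that promote a dev image tag to production.

    The production tag is the dev tag with `suffix` stripped, so
    "ubuntu2404_dev" becomes "ubuntu2404".
    """
    if not dev_tag.endswith(suffix) or dev_tag == suffix:
        raise ValueError(f"{dev_tag!r} is not a dev tag ending in {suffix!r}")
    prod_tag = dev_tag[: -len(suffix)]
    source = f"{repository}:{dev_tag}"
    target = f"{repository}:{prod_tag}"
    return [
        ["docker", "pull", source],         # fetch the tested dev image
        ["docker", "tag", source, target],  # point the production tag at it
        ["docker", "push", target],         # publish the production tag
    ]


if __name__ == "__main__":
    # Hypothetical repository name, for illustration only.
    for cmd in promotion_commands("ubuntu2404_dev", "registry.example.org/ci-images"):
        print(shlex.join(cmd))
```

Because promotion only re-tags an image that has already been built and tested, no rebuild is needed at the moment of the switch.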

MariaDB Vector: A Milestone in Analytical Capabilities

It took significant effort, but MariaDB Vector is now part of a stable release (11.7). While there are numerous blog posts exploring its functionality, complete with RAG (retrieval-augmented generation) examples, performance benchmarks, and framework integrations, here I want to focus on the community-driven approach behind its development.

  1. Collaboration from the start: We established a project page outlining governance, making it clear that external contributions were welcome and the final syntax would be a community effort.
  2. Industry and academic research: We reviewed existing implementations and the latest research to avoid pitfalls and adopt best practices.
  3. Task management and scope: We outlined the project in JIRA, helping us to track and share progress, refine the specification, and distribute work across developers.
  4. Engaging contributors: We reached out to key players like PlanetScale, Google, and Amazon. Amazon, in particular, contributed performance benchmarks and critical bug fixes while working closely with us through periodic meetings and emails.
  5. Community feedback loop: Once the preview was ready, we engaged the community for testing and feedback. This interaction, including collaboration with SkySQL, led to bug reports and integrations like LlamaIndex.
  6. Wider outreach: Presentations at conferences such as FrOSCon, Open Source India, and our Berlin ServerFest provided valuable feedback, which now shapes future development plans.
  7. Bounty program: We launched a “Vector bounty” program to encourage more integrations between MariaDB Vector and various AI frameworks.

Catalogs: Multi-tenancy in MariaDB Server

The Catalogs project, started two years ago, made significant strides in 2024. Catalogs aim to enable more cost-effective and energy-efficient MariaDB Server deployments by consolidating multiple tenants within a single server.

While the feature is still in a separate repository, we’re optimistic about a preview release soon. Key milestones:

This is our first project ever funded through structural funds. NLnet is funding a portion of the development. We hope to have many more like it in the future.

Looking Ahead: 2025 and Beyond

As we wrap up 2024, we’re thrilled about what’s to come. MariaDB’s role in supporting AI-driven applications and delivering high-performance, scalable solutions for our community is clearer than ever.

In 2025, we will focus on the following key areas:

  • Expand MariaDB Vector’s capabilities to stay ahead of the AI wave. This means making sure MariaDB is supported in as many AI frameworks as possible. We will listen to community feedback and improve our vector search even more. There are many indexing algorithms still to explore.
  • Enhance our CI/CD infrastructure further to support faster, more reliable development. We will focus on development processes, streamline the workflow of core developers and enable community contributors to benefit from the same comprehensive testing. We’ll also focus on automating our releases as much as possible so we can reduce human error and allow those resources to be better spent elsewhere.
  • Deliver Catalogs as a fully complete feature to MariaDB Server. This one is pretty straightforward. We started a project and we intend to finish. Community contributors are welcome to start experimenting and building tools around catalogs. If you have a feature you need to make it better, let’s talk!
  • Foster greater collaboration within the open-source ecosystem, encouraging more community contributions. We learned there are many areas around contributing to MariaDB that can be improved, and this is what we’ll work on in 2025. The focus will be on timely reviews of community contributions, better documentation, easier-to-find bugs for new contributors to fix, and more conference appearances.

MariaDB Foundation is committed to transparency, innovation, and empowering our vibrant community. To everyone who contributed, tested, or even just provided feedback this year—thank you. Together, we’re building something remarkable.

Here’s to an even more successful 2025!