The Onion framework – a new approach to Buildbot configuration

A framework that doesn’t make you cry.

Stay tuned to learn how you can get involved and contribute to the project!

If you don’t know what Buildbot is and what benefits it brings to MariaDB, please check out this quick introduction.

🤔 Just, why?

Over the years, the Buildbot project within MariaDB has undergone numerous transformations, constantly shaped by the requirements it needed to meet.

When I joined MariaDB in 2024, I was initially a bit overwhelmed by the complexity of the codebase. That complexity, it later turned out, stemmed mainly from the intertwining of implementation details with the simple core concepts of Buildbot, such as builders, factories, builds, and workers.

As the implementation of new requirements became increasingly hacky, we realized it was time to put a stop to it and address the problem at its root. That’s how this framework came to life.

🎯 Core motivation

Beyond the initial motivation to create a framework that would make life easier for anyone wanting to contribute, we also set out to solve a problem that had become critical for our business needs: MDBF-989.

In short, using DockerLatentWorker allows you to install a buildbot-worker inside a container. You expose the Docker daemon over the network from a host, and Buildbot can then run workloads inside a container.

The challenge

Whenever you need to run a series of steps in a different image than the one you started with, you’re forced to do so in a separate build on a different builder, because changing the image within the same build is not possible.

In the image on the left, showing the initial approach, we need two separate builders to implement a feature that requires running certain steps inside two different Docker environments. It’s not practical to keep branching, as it becomes difficult for a user to track the progress of their server build.

If we need to share files between these two environments, we can’t rely on using the same host, because Buildbot doesn’t guarantee that Builder 2 will run on the same host within the configured worker pool. Sharing artifacts requires a third-party service, like a static file server (ci.mariadb.org), which can increase the build time if the network is slow.

Now

This leads us to the solution shown in the image on the right: here, the buildbot-worker is installed directly on the host, and the framework introduces, at its core, a container command wrapper. This wrapper ensures that commands can be executed inside a container whenever we instruct it to do so.

Moreover, a Docker volume mount is used to easily share the necessary artifacts between steps.

The core of the framework is responsible for managing the Docker environment throughout its lifetime, so that the Buildbot contributor can focus on building business value. If you want to know how this is implemented, see the InContainer class (command wrapper) and the Processor functions (managing the environment).

🗓️ Where We Stand Now

We are currently in the middle of migrating our builders to this new framework and defining new builder types directly within it. The most important thing is to ensure that we have the same builders as on buildbot.mariadb.net and that our test coverage is at least as good as in the old version.

That’s why the number of masters and configuration files in the MariaDB buildbot repository might still look overwhelming. In reality, many of them will gradually be removed and shouldn’t concern anyone interested in contributing. What we will discuss in “The Layers of Contribution” refers exclusively to this new framework.

Speaking of migrating builders, it comes with a significant infrastructure overhead, so I want to mention that this wouldn’t have been possible without the generosity of our hardware sponsors.
A heartfelt thank you for enabling us to maintain a fast and reliable CI tool! ❤️

🧅 The Layers of Contribution

The three layers you see in the image are based on the “need to know” principle. We’ve tried, as much as possible, to make the easiest changes on the surface also the most frequent ones.

Only two paths in the entire repository are used for configuring builders within the framework:

Level 1 – The master configuration

The master configuration has been kept as simple as possible, so that it provides a good overview of all defined builders and workers without cluttering it with implementation details.

Among the most common recurring operations is adding a new version of an operating system for which we want to release MariaDB Server packages. This is a good starting point when learning the framework.

Adding a new builder

Scenario

  • Debian 13 is out and we want to release MariaDB Server packages for it

Approach:

Buildbot holds its configuration in a dictionary named BuildmasterConfig. The builders list contains instances of BuilderConfig.
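For orientation, here is a minimal sketch of that layout. BuildmasterConfig and c["builders"] are standard Buildbot; the plain dict stands in for a real BuilderConfig (or, in this framework, GenericBuilder(...).get_config()), and the worker name is hypothetical:

```python
# Buildbot loads a dictionary named BuildmasterConfig from master.cfg;
# by convention it is aliased to `c`.
c = BuildmasterConfig = {}
c["builders"] = []

# Each entry in c["builders"] describes one builder. A plain dict is used
# here only to illustrate the shape; real configs use BuilderConfig objects.
c["builders"].append({
    "name": "amd64-debian-13-deb-autobake",
    "workernames": ["hypothetical-amd64-worker"],
})
```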

We will start with one architecture and append a new builder to c["builders"].

Questions we need to answer:

  1. Do we have a sequence that configures, compiles, tests, and packages the server for Debian?
  2. Do we have a containerized environment for building the server on Debian 13?

To answer [1], you can either look at the many existing examples or read the module containing the functions that generate release sequences at release.py.

Based on that, the most appropriate function seems to be deb_autobake(), which has the following inputs:

  • the number of CPUs (jobs) allocated to tasks that can run in parallel, e.g. make or mariadb-test
  • a DockerConfig object defining the Docker environment, i.e. the build environment
  • where to save the packages, i.e. ARTIFACTS_URL
  • whether to run some optional tests (S3, Galera, RocksDB)

For [2], you need to check whether a build environment for Debian 13 is configured here. If not, add it to the matrix and make sure that the associated Dockerfile contains all the necessary tools for compiling and testing the server. The build environment can be tested locally using docker build.

# Append a GenericBuilder instance to the c['builders'] list.
c["builders"].append(
    GenericBuilder(
        name="amd64-debian-13-deb-autobake",
        sequences=[
            deb_autobake(
                jobs=builder_jobs,
                config=DockerConfig(
                    repository=os.environ["CONTAINER_REGISTRY_URL"],
                    image_tag="debian13",
                    workdir=PurePath("/home/buildbot"),
                    bind_mounts=[
                        (f'{os.environ["MASTER_PACKAGES_DIR"]}/', "/packages"),
                        ("/srv/buildbot/ccache", "/mnt/ccache"),
                    ],
                    shm_size=shm_size,
                    env_vars=[
                        ("ARTIFACTS_URL", os.environ["ARTIFACTS_URL"]),
                        ("CCACHE_DIR", "/mnt/ccache"),
                    ],
                    memlock_limit=memlock_limit,
                ),
                artifacts_url=os.environ["ARTIFACTS_URL"],
                test_galera=True,
                test_rocksdb=True,
                test_s3=True,
            ),
        ],
    ).get_config(
        workers=WORKER_POOL.get_workers_for_arch(arch=arch),
        next_build=nextBuild,
        can_start_build=canStartBuild,
        tags=[],
        jobs=builder_jobs,
        properties={
            "save_packages": True,
        },
    )
)

Other things to watch out for:

  • WORKER_POOL.get_workers_for_arch allocates workers from the pool for a specific architecture. Make sure the architecture is correct.
  • Make sure jobs is set to the number of CPUs you want for compilation. For a release builder, 7 is usually sufficient.
  • shm_size is the amount of shared memory allocated to the container at runtime. Since tests run with the --mem option, this value should be at least jobs × 2.

The remaining step is to schedule the builder to run by adding it to the SUPPORTED_PLATFORMS dictionary.

In Buildbot, the server is built starting from the major version included in the default repositories of the distribution. Check which version of MariaDB is in Debian 13, and this will help you determine the key in SUPPORTED_PLATFORMS.
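A minimal sketch of that scheduling step, assuming SUPPORTED_PLATFORMS maps a MariaDB major version to the list of builder names that build it. The structure and the version keys below are illustrative, not the repository's real values — check the repository and Debian 13's default repositories for the actual key:

```python
# Hypothetical shape: MariaDB major version -> builder names for that version.
SUPPORTED_PLATFORMS = {
    "11.4": ["amd64-debian-12-deb-autobake"],
}

# Schedule the new builder under the version shipped by Debian 13.
# "11.x" is a placeholder key -- determine the real one from the distribution.
SUPPORTED_PLATFORMS.setdefault("11.x", []).append("amd64-debian-13-deb-autobake")
```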

That’s it, your patch is ready for public review.

Level 2 – sequence configuration

All the modules representing sequence libraries are defined here, grouped by their main purpose. Think of the sequence’s end goal when deciding which module is most appropriate.
For example, release.py is a library of sequence-generating functions that produce artifacts which either contribute to the release process or will eventually become public as part of that process.


The basic structure of a sequence-generating function is:

def seq_func(param1, param2):
    sequence = BuildSequence()
    sequence.add_step(...)
    sequence.add_step(...)
    return sequence

Basically, you:

  • get a BuildSequence instance
  • add steps to it
  • return it for usage in master.cfg
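To make the pattern concrete, here is a toy version. The BuildSequence stand-in and the step names are illustrative only; the real class lives in the framework and steps are BaseStep implementations, not strings:

```python
# Toy stand-in for the framework's BuildSequence, just to show the pattern.
class BuildSequence:
    def __init__(self):
        self.steps = []

    def add_step(self, step):
        self.steps.append(step)

# A sequence-generating function: create the sequence, add steps, return it
# so that master.cfg can pass it to a builder.
def print_env_sequence():
    sequence = BuildSequence()
    sequence.add_step("PrintEnvironmentDetails")  # placeholder step
    sequence.add_step("PrintKernelVersion")       # placeholder step
    return sequence
```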

Get to know your toolbelt

Let’s start from the simplest example: a command that prints details of the buildbot-worker host environment.

sequence.add_step(ShellStep(command=PrintEnvironmentDetails()))

A sequence expects each construct added to it to be an implementation of BaseStep.
In most cases we work with ShellStep, which runs a command on the host or inside a container.
As the name suggests, a ShellStep accepts a Command implementation as input.

Check the docstring of each class to understand what it does and what arguments it accepts.

Resources:

A more complex example – Run in container

Running a command inside a container is fairly straightforward: we wrap the step using InContainer and specify a Docker environment, i.e. an instance of DockerConfig.

sequence.add_step(
    InContainer(
        docker_environment=srpm_config,
        step=ShellStep(
            command=SRPMCompare(
                workdir=RPM_AUTOBAKE_BASE_WORKDIR,
            ),
            options=StepOptions(
                doStepIf=SRPM_RUN_CONDITION,
                description="SRPM - Compare",
                descriptionDone="SRPM - Compare done",
            ),
        ),
    )
)

Steps support options that control their behavior at runtime or how they are displayed in the GUI.
Similarly, commands support common options, such as the working directory in which the command is executed.

In practice, after loading the configuration, the sequence will appear as a series of IBuildStep objects, as in the example below:

I encourage you to explore the repository and read through each docstring to understand just how far you can go with configuration.

Level 3 – steps, commands, generators …

If you’ve made it this far, it means you have a rather unique requirement to implement.

There could be many examples here, but there’s no point in going into them. Instead, I’d prefer to provide just a few general guidelines, assuming that by this point you’re already quite familiar with the codebase.

  • When defining new commands: if another command already does something similar, try to see if you can generalize the problem and modify that command to handle multiple cases.
  • If a command has a large number of input options that can change its behavior, and these variations are frequently used across the project, then you should consider creating a generator. You can start by studying the MTR and CMAKE generators, and whatever you build should follow the foundation defined here.
  • All implementations of BaseStep must implement a generate() function that returns a concrete implementation of a buildbot step (IBuildStep). See the Build Steps documentation for details.
  • All implementations of Command must implement the as_cmd_arg function, typically returning a list of commands. This list is later joined into the final command, allowing InContainer wrapping of the command.
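As an illustration of the Command contract described in the last bullet: as_cmd_arg is the real method name from the framework, but the base class body and the example subclass below are hypothetical sketches, not the actual implementation:

```python
# Minimal sketch of the Command contract: as_cmd_arg() returns a list of
# argument strings, which is later joined into the final command line --
# this is what lets InContainer prefix it with a container invocation.
class Command:
    def as_cmd_arg(self):
        raise NotImplementedError

class PrintKernelVersion(Command):
    """Hypothetical command that prints the running kernel version."""
    def as_cmd_arg(self):
        return ["uname", "-r"]

# Joining the pieces yields the shell command to execute.
final_cmd = " ".join(PrintKernelVersion().as_cmd_arg())
```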

Glossary

  • Builder – the configuration that defines what and how to build; basically the build plan.
  • Build – an instance of a builder; represents the actual execution of a build.
  • Worker – the machine/agent that runs the build; executes the build steps.
  • Factory – the sequence of steps a builder uses to create a build; defines the exact procedure.
  • Sequence – like a Buildbot factory, but in a slightly more abstract way. It lets you chain together as many sequences as you want to compose the steps a build needs to execute.
  • Master Config – the configuration file loaded by Buildbot at runtime.
  • MTR (MariaDB Test Runner) – a command-line tool for running the test suites included with MariaDB.
  • Command generators – frequently used commands in the project, with type safety included and their most common options modeled as dataclasses.

👋 Wrapping Up

This was quite a long blog post, but I hope it gave you a solid introduction to how you can contribute to Buildbot within MariaDB.

I’ll leave you with two examples of beginner-friendly tasks:

If you need our help to contribute, you can write to us on Zulip. Looking forward to hearing from you soon, maybe even in the Pull Requests section of the Buildbot repository!