DuckDB Storage Engine for MariaDB. When the Sea Lion Learns to Quack.

An early look at the DuckDB storage engine for MariaDB — columnar, vectorized analytics that live right next to your transactional tables.

The problem

MariaDB’s InnoDB is excellent at what it was built for: transactions. Row-by-row inserts, updates, point lookups, strong consistency. But the moment you ask it to scan tens of millions of rows for a multi-way join with a few aggregations, a row store has to work hard.

The usual answer is to stand up a separate analytical system, then build ETL pipelines to copy data into it.

The Power Of The Community!

Inspired by some recent LinkedIn posts, I decided to take the AI into my own hands and do some stats on the MariaDB and MySQL repositories.

This graph is what I’ve got.

Not only has the number of distinct MariaDB Server contributors surpassed the number of distinct MySQL Server contributors, the external MariaDB contributors alone did it! *

This is what the Power Of the Community looks like!

  • You get to use a more functional, performant and error-free MariaDB Server.
  • You get a say in shaping the future of the MariaDB Server.

MariaDB Foundation: Bringing TPC-B Back To Life

When I joined Pervasive PSQL, one of the first performance test cases I was introduced to was TPC-B. It was already implemented inside Pervasive PSQL and it quickly became one of the most important tools in my daily work. A little later, another developer and I wrote Pervasive PSQL’s TPC-C implementation in C++. Between those two workloads, and a few others, I spent nearly five years performing change testing. (https://en.wikipedia.org/wiki/Pervasive_Software)

During that time, about seventy percent of all regressions came from TPC-B, with the remaining issues coming from ATOMICs and TPC-C.

Documented: The MariaDB Server (Community) Contribution Process

If you have ever considered contributing code to the MariaDB Server, you should know that this is an intricate process involving multiple steps and multiple actors. To help you see your contributions successfully merged into the MariaDB Server codebase, I’ve compiled a comprehensive description of the contribution process itself, the roles involved in it, the sequence of actions, and the conditions for transitioning from one action to the next. There’s even a diagram!

Please go to COMMUNITY_CONTRIBUTIONS.md.

This of course is going to be a moving target! I fully intend to keep the document up to date and enhance it with clarifications and process changes as they happen.

MariaDB Vector: How it works. Part III

In the previous parts of this series we’ve seen how MariaDB stores vector indexes in a table and how to implement HNSW for good performance. But MariaDB does not implement plain HNSW: it calls its vector search algorithm mHNSW, a modified HNSW. Let’s see how exactly it was modified.

Not so greedy!

HNSW, like many, if not most, graph-based vector search algorithms, is greedy. Think of it this way: when it needs to find just one nearest vector (ef=1), it walks the graph, always choosing the node that takes it closest to the target at that particular step.
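
The greedy walk described above can be sketched in a few lines of Python. This is only an illustration of the ef=1 case on a single graph layer, not MariaDB's code: the names `greedy_search`, `vectors`, and `neighbors`, the dictionary-based graph, and the choice of Euclidean distance are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_search(vectors, neighbors, entry, target):
    """Walk the graph from `entry`, always moving to the neighbor closest to `target`.

    vectors   -- hypothetical dict: node id -> vector
    neighbors -- hypothetical dict: node id -> list of adjacent node ids
    Returns (node id, distance) of the node where the walk stopped.
    """
    current = entry
    current_dist = euclidean(vectors[current], target)
    while True:
        best, best_dist = current, current_dist
        for node in neighbors[current]:
            dist = euclidean(vectors[node], target)
            if dist < best_dist:
                best, best_dist = node, dist
        if best == current:
            # No neighbor is closer than the current node: stop here.
            return current, current_dist
        current, current_dist = best, best_dist


# A tiny chain-shaped graph: 0 - 1 - 2 - 3
vectors = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
node, dist = greedy_search(vectors, neighbors, entry=0, target=(2.9, 0.0))
```

Because the walk only ever accepts a strictly closer neighbor, it stops at the first node that has none, even if a closer node exists elsewhere in the graph.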

MariaDB Vector: How it works. Part II

In the first post of this series, I’ve described how the vector index is stored in a table and how it achieves full transactional behavior and ACID properties compatible with the storage engine of the table the user created. But while the table provides persistent storage of the index, it’s the in-memory part that gives it the performance. This is how it works.

Distance calculations

This is the most performance-sensitive part of HNSW. According to various estimates, distance calculations account for 80–90% of search time. And the time of this operation grows linearly with the vector length.
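
To see why the cost is linear in the vector length, here is a minimal sketch of a squared Euclidean distance in plain Python. The function name `squared_euclidean` and the choice of metric are assumptions for illustration; a production implementation would typically be vectorized rather than written as a scalar loop like this one.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors.

    The loop body runs once per dimension (one subtraction, one
    multiplication, one addition), so the work grows linearly with len(a).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for x, y in zip(a, b):
        diff = x - y
        total += diff * diff
    return total


example = squared_euclidean([1.0, 2.0, 3.0], [4.0, 6.0, 3.0])
```

Skipping the final square root is a common shortcut when distances are only compared with each other, since it does not change their order.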

MariaDB 13.0 Preview Now Available

We are pleased to announce the availability of a preview of the MariaDB 13.0 series. MariaDB 13.0 is a preview rolling release, published on 23 March 2026, and it continues the work started in 12.3 while adding a solid set of entirely new features.

And this one is interesting.

This preview release brings a nice mix of new SQL capabilities, better optimizer insight, richer metadata, and practical engine improvements. Not every feature is flashy, but many of them are exactly the kind of changes that make daily work with MariaDB smoother, clearer, and just a bit more powerful.

MariaDB Vector: How it works

You might have seen that MariaDB Vector is fast. And is getting faster. But why? How does it achieve that? And why is it said to use the mHNSW (modified HNSW) algorithm? What did it modify in the conventional HNSW that all other databases are using? Let’s take it apart and analyze it piece by piece.

Introduction into HNSW

This post is not a full description of HNSW. There are many HNSW descriptions online, and they are good, better than what I could’ve written. I will only show the basic concepts behind HNSW, the ones that are crucial for the rest of the post.