MariaDB Vector: How it works
You might have seen that MariaDB Vector is fast, and getting faster. But why? How does it achieve that? And why is it said to use the mHNSW (modified HNSW) algorithm? What did it modify in the conventional HNSW that all other databases use? Let’s take it apart and analyze it piece by piece.
Introduction to HNSW
This post is not a full description of HNSW; there are many HNSW descriptions online, and they are good, better than what I could have written. I will only show the basic concepts behind HNSW, the concepts that are crucial for the rest of the post.
Vectors
A vector is a list of floating point numbers. In 2D a vector is a pair of (x,y) coordinates like (1, 3) or (2.5, -0.15), a point on a coordinate plane. In 1536D it is a longer list and no human can visualize that, but it’s still a list of floating point numbers.
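To make this concrete: the exact same few lines work for a 2D pair and for a 1536D list, because a vector is just a list of numbers of some length. A toy illustration in Python (not MariaDB’s code), using Euclidean distance:

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two vectors of any dimensionality."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# 2D: two points on a coordinate plane
print(euclidean_distance((1, 3), (2.5, -0.15)))

# 1536D: no human can visualize it, but the code is identical
print(euclidean_distance([0.0] * 1536, [1.0] * 1536))
```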
Neighbors
HNSW is essentially a graph algorithm. All vectors form a graph, with vectors being graph nodes, and they are linked by graph edges to other vectors. Something a little bit like this:
The search begins from some “starting” vector and then walks these blue lines until it gets to vectors that are close to the vector we are searching for. What’s important for us here is that every vector has neighbors, the vectors it is connected with. That is, an index stores not just vectors; for every vector it also needs to store the list of its neighbors.
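This “walk toward the query” idea can be sketched as a greedy search over a neighbor graph. The graph, vectors, and starting node below are all hypothetical toy data; real HNSW searches several layers and keeps a candidate list, but the core movement rule is the same:

```python
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def greedy_search(graph, vectors, start, query):
    """From `start`, repeatedly move to the neighbor closest to `query`;
    stop when no neighbor improves on the current node (a toy version
    of the search within one HNSW layer)."""
    current = start
    best = distance(vectors[current], query)
    improved = True
    while improved:
        improved = False
        for n in graph[current]:  # the stored neighbor list of `current`
            d = distance(vectors[n], query)
            if d < best:
                current, best, improved = n, d, True
    return current

# Hypothetical tiny graph: node id -> neighbor ids
vectors = {0: (0, 0), 1: (2, 0), 2: (4, 0), 3: (4, 3)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(greedy_search(graph, vectors, start=0, query=(4.1, 2.9)))  # walks 0 -> 1 -> 2 -> 3
```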
If you want to know more about HNSW, for example, why it’s called “hierarchical” or how to build the HNSW graph, see the original paper or, for example, the excellent Pinecone blog post.
Storage
HNSW is an in-memory algorithm; if one wants to store the index persistently, this has to be implemented separately. There are many ways to do it. The easiest one, perhaps, would be to use an existing hnsw library to build the graph and then dump it to a file as one big blob. This means no ACID whatsoever: all transactions modify the same graph and see each other’s changes, even uncommitted ones, and any crash means the index needs to be rebuilt from scratch, which can take days. There are projects that took this approach.
On the other end of the spectrum there is pgvector. It carefully lays out individual tuples on pages, writes WAL records, handles vacuum. This adds a ton of complexity compared to the first approach, but allows it to provide the full ACID guarantees that a user naturally expects from an index in a mature DBMS such as PostgreSQL.
MariaDB, as we all know, uses pluggable storage engines. In the “pgvector” approach, the vector index would need to know how to handle the page layout of every such engine separately (and MyISAM, for example, does not even use pages in the data file). Even if we limited the implementation to InnoDB, it would be a lot more complex than in PostgreSQL, because InnoDB does not do copy-on-write. That simplifies maintenance (there is no VACUUM) but makes adding a new page format much more difficult. Still, I believe some forks took this approach.
Anyway, MariaDB implements a middle-ground solution, a trade-off. For every vector index it creates a shadow table, invisible to the user. The table has the structure
CREATE TABLE (
  layer tinyint not null,
  tref varbinary(N),
  vec blob not null,
  neighbors blob not null,
  unique (tref),
  key (layer))
This table technically has no name, because it is not user-visible and cannot be accessed directly. tref stores a “reference” to the original row in the user table, a row id, which can be a primary key value or an internal row id; its length depends on the row id length of the particular user table. The vector itself and its neighbor list are packed into two blobs.
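The exact blob format is internal to MariaDB, but the general idea of packing a list of floats into a single binary value can be sketched like this (a hypothetical layout for illustration, not the real on-disk format):

```python
import struct

def pack_vector(values):
    # Hypothetical layout: a little-endian array of float32 values,
    # 4 bytes each, with no header
    return struct.pack(f"<{len(values)}f", *values)

def unpack_vector(blob):
    return struct.unpack(f"<{len(blob) // 4}f", blob)

vec_blob = pack_vector([1.0, 3.0])
print(len(vec_blob))           # 8 bytes: two float32 values
print(unpack_vector(vec_blob)) # (1.0, 3.0)
```

A neighbor list can be packed into its own blob the same way, as an array of fixed-width row references.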
This architecture of “high-level indexes” was inspired by the InnoDB implementation of full-text indexes, and it allows adding new indexes quickly in a storage-engine-independent way. Such an index naturally exhibits all the ACID properties that the engine supports: it is part of the same transaction, follows the same transaction isolation rules, supports crash recovery, locking, everything. The in-memory graph becomes merely a cache of the index. Vectors are read from the table and added to the in-memory graph on demand, and while the whole index must be cached for optimal performance, the search will still work correctly even if the index is not in the cache.
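The “graph as a cache” idea amounts to a read-through cache: look the node up in memory first, and fall back to the shadow table on a miss. A minimal sketch, with a plain dict standing in for the table (all names here are hypothetical, not MariaDB’s internals):

```python
class NodeCache:
    """Read-through cache: the in-memory graph is only a cache of the
    shadow table; on a miss, the node is loaded from storage."""

    def __init__(self, fetch_from_table):
        self._fetch = fetch_from_table  # e.g. a lookup against the shadow table
        self._nodes = {}                # node id -> cached node

    def get(self, node_id):
        if node_id not in self._nodes:          # cache miss:
            self._nodes[node_id] = self._fetch(node_id)  # read from the table
        return self._nodes[node_id]             # cache hit: no storage access

# A dict standing in for the persistent shadow table
table = {7: {"vec": (1.0, 3.0), "neighbors": [2, 5]}}
cache = NodeCache(table.__getitem__)
print(cache.get(7)["neighbors"])  # loaded on demand from "storage"
```

The search stays correct whether a node comes from the cache or from the table; caching only changes how fast it runs.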
This design is somewhat more expensive than a dedicated vector-specific page layout. On the other hand, it is much simpler to implement and to get right: it basically rules out the transaction isolation and recovery bugs that would have been notoriously difficult to discover. And in this particular case the storage efficiency does not matter much anyway. I once tried to disable all disk I/O completely, and the vector search got about 30% faster. Meaning, no matter how efficient and perfect the storage code and the page layout are, they will never improve the performance by more than 1.3x, and likely much less. Compare that with the 4x-10x speedup from algorithmic improvements.
To be continued
This post is getting too long already, so let’s take a break. The second part will talk about distance calculations and in-memory representations of vectors, the parts responsible for most of MariaDB Vector’s speed.