Author Archives: Sergei
This is the last post in the “MariaDB Vector: How it works” series. The first three covered storage, in-memory representation, and HNSW modifications, everything that was done in MariaDB 11.8. This post talks about a new feature in MariaDB 12.3: optimized distance calculation.
As I mentioned earlier, distance calculation is the most time-consuming part of vector search, taking 80–90% of the total search time. It is also linear in the number of dimensions: computing the distance between vectors of 1536 dimensions takes twice as long as between vectors of 768 dimensions.
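To make the linear cost concrete, here is a minimal Python sketch of a squared Euclidean distance that also counts its per-dimension steps. It is only an illustration: the function name `squared_euclidean` and the step counter are assumptions made for this example, not MariaDB code, and a real implementation would use SIMD instructions rather than a scalar loop.

```python
def squared_euclidean(a, b):
    """Return (squared Euclidean distance, number of per-dimension steps)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    total = 0.0
    steps = 0
    for x, y in zip(a, b):
        diff = x - y
        total += diff * diff  # one subtract, one multiply, one add per dimension
        steps += 1
    return total, steps


if __name__ == "__main__":
    _, steps_768 = squared_euclidean([0.0] * 768, [1.0] * 768)
    _, steps_1536 = squared_euclidean([0.0] * 1536, [1.0] * 1536)
    # Doubling the number of dimensions doubles the work.
    print(steps_768, steps_1536)
```

The loop body runs exactly once per dimension, which is why the cost scales with vector length.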
…
In the previous parts of this series we’ve seen how MariaDB stores vector indexes in a table and how to implement HNSW for good performance. But MariaDB does not implement HNSW as is; it calls its vector search algorithm mHNSW, a modified HNSW. Let’s see how exactly it was modified.
Not so greedy!
HNSW, like many, if not most, graph-based vector search algorithms, is greedy. Think of it this way: when it needs to find just one nearest vector (ef=1), it walks the graph, always choosing the node that takes it closest to the target at that particular step.
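The greedy walk for a single nearest vector can be sketched in a few lines of Python. This is a simplified illustration on one graph layer, not MariaDB's implementation: the names `greedy_search` and `sq_dist`, the adjacency-dict graph, and the sample data are all assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_search(graph, vectors, start, target):
    """Walk the graph, always moving to the neighbor closest to the target.

    graph:   dict mapping node id -> list of neighbor node ids
    vectors: dict mapping node id -> vector (sequence of floats)
    Returns (node id, squared distance) where the walk stopped.
    """
    current = start
    current_dist = sq_dist(vectors[current], target)
    while True:
        best, best_dist = current, current_dist
        for neighbor in graph[current]:
            d = sq_dist(vectors[neighbor], target)
            if d < best_dist:
                best, best_dist = neighbor, d
        if best == current:
            # No neighbor is closer: the walk stops here.
            return current, current_dist
        current, current_dist = best, best_dist


if __name__ == "__main__":
    # Four points on a line, connected as a chain 0 - 1 - 2 - 3.
    vectors = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
    graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(greedy_search(graph, vectors, start=0, target=(2.9, 0.0)))
```

Because the walk only ever moves to a strictly closer neighbor, it stops as soon as no neighbor improves on the current node, even if a closer node exists elsewhere in the graph.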
…
In the first post of this series, I described how the vector index is stored in a table and how it achieves full transactional behavior and ACID properties compatible with the storage engine of the table the user created. But while the table provides persistent storage of the index, it’s the in-memory part that gives it the performance. This is how it works.
Distance calculations
This is the most performance-sensitive part of HNSW. According to various estimates, distance calculations account for 80–90% of search time. And the time this operation takes grows linearly with the vector length.
…
You might have seen that MariaDB Vector is fast. And is getting faster. But why? How does it achieve that? And why is it said to use the mHNSW (modified HNSW) algorithm? What did it modify in the conventional HNSW that all other databases are using? Let’s take it apart and analyze it piece by piece.
Introduction into HNSW
This post is not a full description of HNSW. There are many HNSW descriptions online and they are good, better than what I could’ve written. I will only show the basic concepts behind HNSW, concepts that are crucial for the rest of the post.
…
I have benchmarked MariaDB Vector before, but it was a while ago. Users kept asking about Milvus. New pgvector alternatives were gaining popularity. And I simply wanted to see if MariaDB got any better. This benchmark round includes more databases, a larger dataset, and no irrelevant datasets that only add noise and don’t really help today, in 2026.
Dataset
Now is the time of AI. Vector search is used for embeddings generated by LLMs. Most ann-benchmarks datasets are pre-AI and use, for example, image transformations and filters to construct vectors. While useful for certain purposes, they are not the main use case for MariaDB Vector, and providing these results would be misleading and would distract from what matters to users.
…
Continue reading “Big Vector Search Benchmark: 10 databases comparison”
We are pleased to announce the availability of a preview of the MariaDB 12.3 series. MariaDB 12.3 will be a long-term release.
MariaDB 12.3 introduces numerous new features, in particular
Compatibility features
- Oracle TO_DATE() function (MDEV-19683)
- Support for cursors on prepared statements (MDEV-33830)
- SQL Standard SET PATH statement (MDEV-34391)
- SQL Standard Global Temporary Tables (MDEV-35915)
- SQL Standard IS JSON predicate (MDEV-37072)
- Allow UPDATE/DELETE to read from a CTE (MDEV-37220)
- XMLTYPE data type (MDEV-37261)
Replication
- Configurable defaults for MASTER_SSL_* settings for CHANGE MASTER (MDEV-28302)
- Fragment ROW replication events larger than max_packet_size (MDEV-32570)
- Improving performance of binary logging by removing the need of syncing it (MDEV-34705)
Miscellaneous
- New hash algorithms for PARTITION BY KEY (MDEV-9826)
- Hashicorp Plugin: Implement cache flush for forced key rotation (MDEV-30847)
- Optimise reorderable LEFT JOINs (MDEV-36055)
Thanks, and enjoy MariaDB!
…
We are pleased to announce the availability of a preview of the MariaDB 12.2 series. MariaDB 12.2 will be a rolling release.
MariaDB 12.2 introduces numerous new features, in particular
Compatibility features
- TO_NUMBER() function (MDEV-20022)
- TRUNC() function (MDEV-20023)
- Global Temporary Tables (MDEV-35915)
- Optimizer hints: [NO_]ROWID_FILTER (MDEV-36089)
- Optimizer hints: [NO_]INDEX_MERGE (MDEV-36125)
- INFORMATION_SCHEMA table TRIGGERED_UPDATE_COLUMNS (MDEV-36996)
- Optimizer hints support for named query blocks and views (MDEV-37511)
Miscellaneous
- INFORMATION_SCHEMA table PARAMETERS has a new column PARAMETER_DEFAULT (MDEV-37054)
- Improved support of replication between tables of different structure (MDEV-36290)
Thanks, and enjoy MariaDB!
…
We are pleased to announce the availability of a preview of the MariaDB 12.1 series. MariaDB 12.1 will be a rolling release.
MariaDB 12.1 introduces numerous new features, in particular
Performance improvements
- Segmented key cache for Aria (MDEV-24)
- MDL scalability improvements (MDEV-19749)
- Parallel replication for galera replicas (MDEV-20065)
- Buffered logging for audit plugin (MDEV-34680)
- Faster vector distance calculations via extrapolation (MDEV-36205)
Compatibility features
- caching_sha2_password plugin (MDEV-9804)
- ( + ) for outer join syntax (MDEV-13817)
- rpl_semi_sync_master_wait_for_slave_count (MDEV-18983)
- Associative arrays: DECLARE TYPE ..
…