Adaptive Query Optimizer for MariaDB Vector – Innovation Winner of MariaDB Python Hackathon 2025

We recently announced the winners of the MariaDB Python Hackathon. We sat down with the first-place winners of the Innovation track to learn more about the team and their submission.

Aakanksha Singh and Mihir Phalke developed an Adaptive Query Optimizer for MariaDB Vector, addressing performance challenges in vector similarity search operations. They were interviewed by Robert Silén, Community Advocate, and Kaj Arnö, Executive Chairman of MariaDB Foundation. You can watch the recorded interview on YouTube or read it below.

Introducing Mihir and Aakanksha

Mihir: My name is Mihir Phalke and alongside me is Aakanksha Singh.

How deep can a bug be?

Last year I filed a bug report, MDEV-33603, on what looked like a benign problem: an optimizer taking a different code path in a particular, trivial-looking test. Its benign-looking nature led to me not looking at it until last week. The “benign” bug, as it turned out, is a bug in an OpenSSL optimization on IBM POWER, which may not be the lowest level of “How deep”, but it’s certainly a long way from the high-level (above storage engines) optimizer decisions in MariaDB.

Image by: Valerie Hinojosa on WikiMedia Commons

I feel I need to start this story by justifying why it was left for so long.

MariaDB is part of Google Summer of Code 2023

We are excited to announce that this year MariaDB has once again been accepted as a Google Summer of Code organization. With this blog post I want to showcase the projects we’re taking on and wish good luck to our mentees for the summer!

At MariaDB we strongly believe in growing Open Source and we encourage new developers to contribute. Google Summer of Code allows us to have dedicated contributors focus on a project for a few months, knowing the costs are covered. We at MariaDB can then just focus on the core aspects – writing code and growing our community.

10.7 preview feature JSON Histograms

MariaDB has had support for histograms as part of Engine Independent Table Statistics since 10.0. As part of Google Summer of Code (MDEV-21130), Michael Okoko, together with his mentor Sergey Petrunia, has implemented a new format (using JSON) for histograms that significantly improves the accuracy and flexibility of histograms. If you are only interested in the feature details, you can skip to “New format”; if you are unfamiliar with the purpose of histograms, read on.

Why statistics are needed

Histograms are important for queries where the WHERE clause uses columns that are not indexed.
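
To make the idea concrete, here is a toy sketch of how an optimizer could use a histogram to guess what fraction of rows a condition such as `col <= x` matches when no index is available. It is only an illustration of the general equal-height technique, not MariaDB's implementation; the function names `build_equi_height` and `estimate_fraction_le` are made up for this example, and it assumes numeric values that are spread evenly within each bucket.

```python
from bisect import bisect_right


def build_equi_height(values, n_buckets):
    """Build a toy equal-height histogram (hypothetical helper).

    Returns (minimum value, list of bucket upper bounds). Each bucket
    covers roughly the same number of rows.
    """
    data = sorted(values)
    n = len(data)
    bounds = [data[min(n - 1, (i + 1) * n // n_buckets - 1)]
              for i in range(n_buckets)]
    return data[0], bounds


def estimate_fraction_le(lo, bounds, x):
    """Estimate the fraction of rows with value <= x.

    Buckets that lie entirely below x count fully; the bucket that
    contains x is linearly interpolated (an assumption of this sketch).
    """
    if x < lo:
        return 0.0
    if x >= bounds[-1]:
        return 1.0
    n = len(bounds)
    full = bisect_right(bounds, x)  # buckets whose upper bound is <= x
    left = bounds[full - 1] if full > 0 else lo
    right = bounds[full]
    partial = (x - left) / (right - left) if right > left else 0.0
    return (full + partial) / n


if __name__ == "__main__":
    lo, bounds = build_equi_height(range(1, 101), n_buckets=4)
    print(bounds)
    print(estimate_fraction_le(lo, bounds, 62.5))
```

With an estimate like this, an optimizer can tell a condition that keeps most of the table from one that keeps only a handful of rows, which matters when it chooses join order or access method.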