Multimodal-Metadata-Hub for MariaDB Vector – Innovation 2nd place at MariaDB BangPypers Hackathon 2025
Part of blog series on MariaDB BangPypers Hackathon, September – November 2025. Summary: Winners announced at BangPypers meetup | Blogs: Announcement at BangPypers Meetup | Quality Idea Submissions | Ideation phase closed | Impressive stats | Top ideas | How to Succeed | MariaDB Hackathon AMA slides | Winner interviews: Apache Airflow Integration #1 | Dagster Integration #2 | Query Optimizer for MariaDB Vector, Innovation #1 | Multimodal Embeddings for MariaDB Vector, Innovation #2 | Platform: mariadb-python.hackerearth.com | For more hackathons: MariaDB’s hackathon project page
We recently announced the winners of the MariaDB Python Hackathon. We sat down with the Innovation track second place winners to learn more about the team and their submission.
Sunny Kumar, Abhijeet Dhanotiya, Anuj Gupta, Tushan Kumar Sinha, and Anand Vyas developed Metadata-Hub, a multimodal semantic search application using MariaDB Vector and OpenAI’s CLIP model. They were interviewed by Robert Silén, Community Advocate, and Kaj Arnö, Executive Chairman of MariaDB Foundation. You can watch the recorded interview on YouTube or read it below.
Before we start, a short explanation: multimodal means embedding not only text but also other media types, like images in this case, allowing users to search for images using text descriptions.
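To make the idea concrete, here is a minimal sketch of text-to-image search in a shared embedding space. It is not the team's code: the file names and the three-dimensional vectors are made up for illustration, whereas a real multimodal model such as CLIP produces vectors with hundreds of dimensions. The point is only that a text query and the images live in the same vector space, so the closest image vectors are the best matches.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-made "image embeddings" (illustration only).
image_embeddings = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "city.jpg": [0.1, 0.9, 0.1],
    "forest.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a text query such as "a sunny beach".
query_embedding = [0.8, 0.2, 0.1]


def rank_images(query, images):
    """Sort image names from most to least similar to the query vector."""
    return sorted(
        images,
        key=lambda name: cosine_similarity(query, images[name]),
        reverse=True,
    )


ranking = rank_images(query_embedding, image_embeddings)
print(ranking)
```

With these toy vectors the beach image ranks first, because its vector points in nearly the same direction as the query vector.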
Introducing the Team
Sunny: Hi Robert and Kaj, and first of all I would like to thank both of you guys for this wonderful opportunity. We are really privileged for this. My name is Sunny Kumar and this is my team.
Abhijeet: Hi Robert. Hi Kaj. I’m Abhijeet Dhanotiya, second year B Tech student from Bhopal.
Anuj: Hi Robert. Hi Kaj. I’m Anuj Gupta, second year B Tech student at VIT Bhopal and it’s very nice to see you all here. I’m very happy for the opportunity.
Anand: Hello everyone, my name is Anand and I’m from third year.
Our fifth team member Tushan couldn’t join due to prior commitments, but he helped us a lot during this project as an anchor and oversight for everything.
How did you first hear about the MariaDB hackathon?
Sunny: We heard about this hackathon from one of our seniors at college. He himself was planning to participate, and when we got to know about it we were really excited to participate, so we just did that.
What did you know about MariaDB before?
Sunny: Personally I was not very much aware of MariaDB before this hackathon. So I would say this is a very good initiative to take a database with this much potential to the world through hackathons. When I got to know about it I was very intrigued to work on it. The features it has are phenomenal and the future that it holds is something we should all be looking forward to.
Anand: I had worked with Supabase before, so I know about that, and we know Postgres. So it was easy for us to work on this.
Why did you choose vectors and multimodal search?
Sunny: Both vectors and semantic search are new, popular, emerging technologies. We thought vectors really had some potential, especially in a powerful database like MariaDB. We didn’t want to make something which had already been made thousands of times. We wanted to make something simple yet powerful that could change how people search on the internet.
The goal of the hackathon was to show capabilities that MariaDB has. So we thought vector and semantic search is something we should go forward with.
Anand: We were also researching macOS Spotlight – it’s a privacy-centered multimodal search where you can give a description of what you want and it will find that photo or video for you. We thought we could do that in MariaDB, and so we did.
Kaj: Excellent. That’s exactly what the purpose of this hackathon was – to illustrate how to make existing features of MariaDB work. What makes us particularly interested in your submission is this multimodal thing, because textual search we’ve done many examples of, but now you have multimodal.
What were your expectations on MariaDB Vector and how did it turn out?
Sunny: I had earlier worked on a vector embedding project – a research paper recommendation system which had prominent use of vector embedding and Streamlit. So that was something I was already comfortable with. But in the case of MariaDB it was a lot more comfortable and better to work with because the whole database is designed around these features as a backbone, and the integrative features it comes with if you want to go with AWS or something like that.
Working with embeddings with MariaDB was a lot easier than we thought. It was a very great experience. We got to learn a lot about it. I was not having a very great time with the previous project I was working with, but this time I just got it right with MariaDB.
Did the OpenAI CLIP model work well with MariaDB Vector?
Anuj: CLIP worked very nicely with MariaDB Vector. The embeddings it created and the way they get stored in MariaDB were very seamless. It didn’t give us any problems. CLIP was also something we found to be kind of an industry standard, so that was one of the reasons why we chose CLIP over anything else.
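For readers who want to picture what storing CLIP embeddings in MariaDB Vector could look like, here is a hedged sketch. It is not taken from the Metadata-Hub repository. The table name `media_items`, its columns, and the helper `to_vector_text` are hypothetical, and the 512-dimension size assumes the CLIP ViT-B/32 variant, which may differ from what the team used. The SQL relies on MariaDB's `VECTOR` column type, `VECTOR INDEX`, `VEC_FromText()` and `VEC_DISTANCE_COSINE()`. The `?` placeholders assume a driver with qmark-style parameters.

```python
import json

# Assumption: CLIP ViT-B/32 produces 512-dimensional embeddings.
EMBEDDING_DIM = 512

# Hypothetical schema: one row per media file plus its embedding.
CREATE_TABLE_SQL = f"""
CREATE TABLE IF NOT EXISTS media_items (
    id INT AUTO_INCREMENT PRIMARY KEY,
    file_path VARCHAR(512) NOT NULL,
    embedding VECTOR({EMBEDDING_DIM}) NOT NULL,
    VECTOR INDEX (embedding) DISTANCE=cosine
)
"""

# Store one item; the embedding is passed as a JSON array string.
INSERT_SQL = (
    "INSERT INTO media_items (file_path, embedding) "
    "VALUES (?, VEC_FromText(?))"
)

# Return the five items whose embeddings are closest to the query embedding.
SEARCH_SQL = (
    "SELECT file_path FROM media_items "
    "ORDER BY VEC_DISTANCE_COSINE(embedding, VEC_FromText(?)) "
    "LIMIT 5"
)


def to_vector_text(embedding):
    """Serialize an embedding as the JSON array text VEC_FromText() accepts."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} values, got {len(embedding)}"
        )
    return json.dumps([float(x) for x in embedding])


# Example parameters for INSERT_SQL, using a placeholder embedding.
example_params = ("photos/beach.jpg", to_vector_text([0.5] * EMBEDDING_DIM))
```

In an application, the image embedding would come from the CLIP image encoder at upload time and the query embedding from the CLIP text encoder at search time; both would be passed through `to_vector_text` before being bound to the statements above.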
How was your experience participating in the hackathon?
Sunny: If I had to say it in a few words, it was a beautiful mess. We were totally immature and honestly we were out of our depth when we participated, but we had this contagious will to just do it – curiosity and almost a delusional belief that we could pull this off. This hackathon was not very aligned with our academic calendar – we were having exams at that time, and after that we were all in different parts of the country. That’s why we decided to work with Docker.
At the end, we had a very big belief in our idea – that it’s the kind of idea you guys are looking for and nothing that is overly repeated. We didn’t have much time to work further on it, but we really believed it could leave a mark on you.
On your side, there’s nothing I could complain about. It was great – the hackathon, the instructions, the deadlines, the 30-day timeline. We had time to work, test, and do everything. The hackathon support team was really great. We had a few confusions and questions, and even after the hackathon was over they were very speedy with the replies. I would even like to thank Robert – he was very good at his job, clearing our queries in seconds.
Kaj: Thank you, those are great compliments.
What are your future plans for the project?
Anand: Right now our main focus is to finish what we have started. We don’t want to leave this as a prototype – we want to polish it into something that works in the real world. We are facing challenges, but we are generally excited. After some time we will start making changes in that repo. I’ve posted on LinkedIn that anyone can contribute. We are ready to take requests from others. The goal is to always take that lesson, grow from there, and come back next year stronger and wiser to finally claim that first spot.
Sunny: We all are believers in the open source community. We believe that open source is something where we could create great stuff together. At the start we were thinking about developing this as an app where people can manage their stuff, but my teammates Abhijeet and Anuj had a great idea that we should pitch this as a service rather than an app, looking at a bigger picture. Any company or service that wants to use semantic searches and MariaDB to power their search engines or search facilities can just use the APIs or services that we provide.
That would be a very big project. I don’t know how much time it would take, but we are taking small steps and improving our repo. It would be open for everybody – we’re just second-year students, we can’t do it all alone. We would be looking forward to the open source community.
Kaj: That’s music to our ears. It’s really interesting, and we really appreciate both your contribution so far and your attitude towards open source and your willingness to take a further look at this.
Thank you!
Team: Thank you for this opportunity!
Robert: Thank you for your contribution and this interview, and best of luck to all of you!
Further reading
- YouTube: Recorded interview with Team Metadata-Hub
- GitHub repo: Metadata-Hub – Multimodal Embeddings with MariaDB Vector
- Multimodal model CLIP on OpenAI’s site and on Hugging Face
- Hackathon blog: The MariaDB BangPypers hackathon winner announcement