Intel improving the performance of MariaDB Vector
As you have probably seen in earlier posts, the preview version of MariaDB Vector is out and ready for you to play with. We have had input from several different places during the development of this feature. This, of course, includes hardware manufacturers such as Intel.
In the background, Intel have been prototyping the use of AVX-512 instructions for dot product and Bloom filter operations. Both of these functions are part of vector searches. If you haven’t heard of these terms, let me try to break them down.
AVX-512 – 512-bit extensions to the Intel Advanced Vector Extensions
The AVX-512 instructions are CPU-specific instructions designed to run the same calculation on many numbers simultaneously. They belong to a class of CPU instructions known as SIMD (Single Instruction, Multiple Data).
Remember MMX, which arrived with the Pentium CPUs of the late 1990s? That was the first widely deployed set of SIMD instructions on x86, and things have come a long way since then.
AVX stands for Advanced Vector Extensions, and it has been around since Intel's Sandy Bridge generation (around 2011). It widened the SIMD registers from the 128 bits (16 bytes) of the earlier SSE instructions to 256 bits (32 bytes), with 16 such registers available in 64-bit mode. This means a single instruction can work on 32 bytes of data, for example eight 32-bit floats, in one go. AVX2 came two years later and extended the 256-bit operations to most integer instructions. Then in 2016 AVX-512 came along. This not only expanded the width of the registers to 512 bits (64 bytes) but also doubled the number of registers to 32, giving 2 KB of register space, with each instruction able to operate on 64 bytes at a time.
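To put those register widths in perspective, the short calculation below works out how many 32-bit floats fit in one register of each width, and how many register-sized steps it takes to walk through a vector. The vector length of 1024 is an assumption chosen purely for illustration, not a MariaDB default.

```python
# Register widths mentioned above, in bits.
WIDTHS_BITS = (128, 256, 512)
FLOAT_BITS = 32      # one single-precision float
VECTOR_LEN = 1024    # hypothetical vector length, for illustration only


def lanes(width_bits: int, element_bits: int = FLOAT_BITS) -> int:
    """Number of elements that fit side by side in one register."""
    return width_bits // element_bits


def steps(vector_len: int, width_bits: int) -> int:
    """Register-sized chunks needed to cover the vector (rounded up)."""
    per_register = lanes(width_bits)
    return -(-vector_len // per_register)  # ceiling division


for width in WIDTHS_BITS:
    print(f"{width}-bit register: {lanes(width)} floats per instruction, "
          f"{steps(VECTOR_LEN, width)} steps for {VECTOR_LEN} floats")
```

Each doubling of the register width halves the number of steps, which is where the potential speed-up comes from.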
You could say that Intel have been ready for the current wave of vector processing for quite some time.
Dot product and Bloom filter are maths concepts
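Both of these are interesting if you love working with mathematics. A dot product takes two sequences of numbers of the same length (such as vectors), multiplies them together element by element, and adds up the results to produce a single number. In vector search, that number is a building block for measuring how close two vectors are to each other.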
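As a small worked example, here is a plain Python sketch of a dot product. The function name is made up for this post, and the code is only meant to show the arithmetic, not how any database computes it.

```python
def dot_product(a, b):
    """Multiply matching elements and sum the results into one number."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


# 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32
print(dot_product([1, 2, 3], [4, 5, 6]))
```

Every multiplication here is independent of the others, which is exactly the kind of work SIMD instructions are good at.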
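A Bloom filter is a probabilistic data structure which is often used in the database world. Given lots of buckets of numbers, you can use a Bloom filter to work out if a bucket might contain the number you are looking for. It can tell you that the number is definitely not there, or that it may be there, but it never misses a number that really is present. Engines such as RocksDB use this to skip over data that cannot contain the key being looked up.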
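To make the idea more concrete, below is a toy Bloom filter in Python. It is a minimal sketch under stated assumptions: the class name, the 1024-bit size and the three salted SHA-256 hashes are all choices made for this illustration, and real engines use much faster hash functions and carefully tuned sizes.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: answers 'definitely not present' or 'possibly present'."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive several bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False means the item was never added; True only means "maybe".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


bucket = BloomFilter()
for key in ("42", "1337", "2024"):
    bucket.add(key)

print(bucket.might_contain("42"))  # True: added items are always reported
print(bucket.might_contain("7"))   # normally False; True would be a false positive
```

The filter only stores bits, never the numbers themselves, so it stays small no matter how large the values in the bucket are.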
AVX-512, Intel and MariaDB Vector
Things are moving quite quickly in the MariaDB Vector world. Intel are currently running tests and benchmarks on the performance improvements AVX-512 could provide. Whilst they were testing things, MariaDB Vector gained AVX-512 support for the dot product calculation.
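To give a feel for why a dot product suits wide registers, the sketch below emulates the general lane-by-lane technique in plain Python. It is an illustration only and is not MariaDB's or Intel's actual code. The function name is hypothetical, and the 16 lanes assume 32-bit floats in a 512-bit register.

```python
LANES = 16  # assumed: 32-bit floats per 512-bit register


def dot_product_lanes(a, b, lanes=LANES):
    """Dot product accumulated lane by lane, the way a SIMD loop would."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    acc = [0.0] * lanes              # one running sum per lane
    full = len(a) - len(a) % lanes   # part that fills whole registers

    for start in range(0, full, lanes):
        # On SIMD hardware this inner loop is roughly one multiply-add instruction.
        for lane in range(lanes):
            acc[lane] += a[start + lane] * b[start + lane]

    total = sum(acc)                 # horizontal sum of the lanes
    for i in range(full, len(a)):    # scalar tail for leftover elements
        total += a[i] * b[i]
    return total


a = [float(i) for i in range(40)]
b = [2.0] * 40
print(dot_product_lanes(a, b))  # 2 * (0 + 1 + ... + 39) = 1560.0
```

In Python the inner loop still runs one element at a time. The point is the shape of the computation: independent per-lane sums, a final horizontal add, and a short scalar tail for whatever does not fill a whole register.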
We at the MariaDB Foundation look forward to the outcome of this testing and gaining even more performance from the vector indexing in MariaDB.