Making MariaDB understand MySQL JSON

Good news for MySQL users wishing to upgrade to MariaDB: MariaDB 10.5.7 onwards understands formerly-incompatible MySQL JSON fields! This blog entry explains how to install and use them.

Introducing MYSQL_JSON plugin

Starting from MariaDB 10.5.7 (commit f6549e), it is possible to upgrade from MySQL 5.7 tables containing JSON by loading the MYSQL_JSON data type plugin.

MariaDB and MySQL JSON formats are not the same. In MySQL, the JSON type is a native type, while in MariaDB JSON is just an alias for LONGTEXT. What this means in practice is that data in JSON format are not compatible with each other. Which one is better is irrelevant from the point of view of this blog. This blog is about how to work around this situation, by making it easy to port MySQL JSON fields to MariaDB JSON fields.

Monty suggested having an open door mechanism for MariaDB, so that MySQL JSON could be understood in the MariaDB domain. 

In order to achieve this proposed architecture, we introduce a plugin for 10.5 which will handle all hard work.

In this blog we examine the state of the problem before the MYSQL_JSON plugin, as well as the usage of MYSQL_JSON in practical examples with generated MySQL JSON data. We are going to see how to create JSON data in MySQL, how to use generated data and install MYSQL_JSON plugin manually as well as how to load the plugin using a configuration file. After that we will do an irreversible operation of altering the original MySQL JSON data and convert it to a format unique to MariaDB.

Creating JSON data in MySQL 5.7

mysql> select @@version;
+--------------+
| @@version    |
+--------------+
| 5.7.28 |
+--------------+
1 row in set (0,00 sec)

mysql> create table mysql_json (t json) engine=myisam;
Query OK, 0 rows affected (0,00 sec)

mysql> show create table mysql_json\G
*************************** 1. row ***************************
       Table: mysql_json
Create Table: CREATE TABLE `mysql_json` (
  `t` json DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0,01 sec

mysql> insert into mysql_json values 
('{
    "country":"Bosnia and Herzegovina",
    "city":["Zenica", "Sarajevo", "Tuzla", "Mostar", "Banja Luka"],
    "faculty":["Faculty of Mechanical Engineering in Zenica", "Faculty of Electrical Engineering Sarajevo", "Faculty of Electrical Engineering Tuzla"],
    "team": ["Čelik", "Željezničar", "Sarajevo", "Sloboda", "Velež", "Zrinjski", "Borac"]
   }');

mysql> select * from mysql_json\G
*************************** 1. row ***************************
t: {"city": ["Zenica", "Sarajevo", "Tuzla", "Mostar", "Banja Luka"], "team": ["Čelik", "Željezničar", "Sarajevo", "Sloboda", "Velež", "Zrinjski", "Borac"], "country": "Bosnia and Herzegovina", "faculty": ["Faculty of Mechanical Engineering in Zenica", "Faculty of Electrical Engineering Sarajevo", "Faculty of Electrical Engineering Tuzla"]}
1 row in set (0,00 sec)

Reading MySQL 5.7 JSON before MYSQL_JSON plugin

Before 10.5.7, MariaDB would return wrong information in the “.frm” file on MySQL JSON data type, as shown below

MariaDB [test]> show create table mysql_json;
ERROR 1033 (HY000): Incorrect information in file: './test/mysql_json.frm'
MariaDB [test]> select * from mysql_json;
ERROR 1033 (HY000): Incorrect information in file: './test/mysql_json.frm'

Installing the MYSQL_JSON plugin

Now let’s use the generated data directory (“/tmp/mysql-datadir“) from MySQL and use some system variables/command line (CLI) options for the MYSQL_JSON plugin using CLI and config file to start MariaDB.

Installing the plugin using CLI

$ mysqld --no-defaults --datadir=/tmp/mysql-datadir --lc-messages_dir=./sql/share --plugin-dir=<path to directory containing type_mysql_json.so/dll> --plugin-maturity=alpha

Version: '10.5.7-MariaDB'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution

Since we are going to use a plugin called “MYSQL_JSON” whose shared library “type_mysql_json.so” is located in “plugin/type_mysql_json” directory (when working from source), we should specify the path for the system variable plugin_dir (as an inherent security  feature of plugins) to point to where our plugin resides. Since our plugin currently has an alpha maturity level, we have to set the MariaDB plugin_maturity server system variable at least to this level in order to install our plugin. In the above example, we could also specify which plugin from the plugin library to load using the plugin_load_add mysqld option (as done below using a config file), but we will do it manually in this example (using the INSTALL PLUGIN statement).

Let’s start the mysql client on the test data to see what is happening inside our mysql tables:

MariaDB [(none)]> select @@version;
+----------------------+
| @@version            |
+----------------------+
| 10.5.7-MariaDB |
+----------------------+
1 row in set (0.000 sec)

MariaDB [(none)]> select @@plugin_dir;
+--------------------------------------------------------+
| @@plugin_dir                                           |
+--------------------------------------------------------+
|<build_dir>/plugin/type_mysql_json/ |
+--------------------------------------------------------+
1 row in set (0.000 sec)

MariaDB [(none)]> select @@plugin_maturity;
+-------------------+
| @@plugin_maturity |
+-------------------+
| alpha             |
+-------------------+
1 row in set (0.000 sec)

MariaDB [test]> show create table mysql_json;
ERROR 4161 (HY000): Unknown data type: 'MYSQL_JSON'

MariaDB [test]> select * from mysql_json;
ERROR 4161 (HY000): Unknown data type: 'MYSQL_JSON'

MariaDB [test]> select * from information_schema.plugins where plugin_name="mysql_json";
Empty set (0.002 sec)

MariaDB [test]> select * from mysql.plugin;
Empty set (0.001 sec)

Running the show create table or select on table with JSON column results in “Unknown data type: MYSQL_JSON” error message.

The reason for this is the missing plugin as we know.

Looking into the information_schema (IS) plugins table or the show plugins  statement results in no plugin being presented. Now let’s install and verify our plugin (using information_schema.plugins or the show plugins statement). Since we are installing the plugin manually using the INSTALL SONAME statement, verification of the plugin will also be visible in the mysql.plugin table, which is not the case when loading the plugin from a config file.

MariaDB [(none)]> install soname 'type_mysql_json';
Query OK, 0 rows affected (0.002 sec)

MariaDB [(none)]> select * from information_schema.plugins where plugin_name="mysql_json"\G
*************************** 1. row ***************************
           PLUGIN_NAME: MYSQL_JSON
        PLUGIN_VERSION: 0.1
         PLUGIN_STATUS: ACTIVE
           PLUGIN_TYPE: DATA TYPE
   PLUGIN_TYPE_VERSION: 100507.0
        PLUGIN_LIBRARY: type_mysql_json.so
PLUGIN_LIBRARY_VERSION: 1.14
         PLUGIN_AUTHOR: Anel Husaković, Vicențiu Ciorbaru
    PLUGIN_DESCRIPTION: Data type MYSQL_JSON
        PLUGIN_LICENSE: GPL
           LOAD_OPTION: ON
       PLUGIN_MATURITY: Alpha
   PLUGIN_AUTH_VERSION: 0.1
1 row in set (0.002 sec)

MariaDB [test]> select * from mysql.plugin;
+------------+--------------------+
| name       | dl                 |
+------------+--------------------+
| MYSQL_JSON | type_mysql_json.so |
+------------+--------------------+
1 row in set (0.001 sec)

Load the plugin using config file

In order to skip the above manual configuration of the plugin, we can use the following in the config file to load the plugin on startup of the server. Note that in this case mysql.plugin will be empty:

plugin_dir=<path to directory containing type_mysql_json.so/dll> 
plugin_load_add=mysql_json=type_mysql_json
plugin_maturity=alpha

The descriptive errors you get with the plug-in

If we try now to show create table or select the table, we will get a meaningful message, the same in both cases:

MariaDB [test]> show create table mysql_json;
ERROR 1707 (HY000): Table rebuild required. Please do "ALTER TABLE `test.mysql_json` FORCE" or dump/reload to fix it!
MariaDB [test]> select * from mysql_json;
ERROR 1707 (HY000): Table rebuild required. Please do "ALTER TABLE `test.mysql_json` FORCE" or dump/reload to fix it!

The error message shows two possible fixes:

– either fix it using ALTER TABLE with FORCE

– or fix it using mysqldump.
The same message will appear also when invoking mysqlcheck:

$ mysqlcheck -c test mysql_json 

Failed to SHOW CREATE TABLE `mysql_json`
Error: Table rebuild required. Please do "ALTER TABLE `test.mysql_json` FORCE" or dump/reload to fix it

Ways to fix the errors

ALTER TABLE FORCE

Let’s apply what the error message suggests:

MariaDB [test]> ALTER TABLE test.mysql_json FORCE;
Query OK, 1 row affected (0.011 sec)               
Records: 1  Duplicates: 0  Warnings: 0

MariaDB [test]> show create table mysql_json\G
*************************** 1. row ***************************
       Table: mysql_json
Create Table: CREATE TABLE `mysql_json` (
  `t` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.000 sec)

MariaDB [test]> select * from mysql_json\G
*************************** 1. row ***************************
t: {"city": ["Zenica", "Sarajevo", "Tuzla", "Mostar", "Banja Luka"], "team": ["Čelik", "Željezničar", "Sarajevo", "Sloboda", "Velež", "Zrinjski", "Borac"], "country": "Bosnia and Herzegovina", "faculty": ["Faculty of Mechanical Engineering in Zenica", "Faculty of Electrical Engineering Sarajevo", "Faculty of Electrical Engineering Tuzla"]}
1 row in set (0.000 sec)

As can be seen, data from MySQL is successfully converted to MariaDB, and the client is able to read valid data.

MySQL mysqldump

Start the MySQL server and invoke the mysqldump client:

$ mysqldump test>mysql_json_dump.sql

Start MariaDB server with an empty data directory and start mysql client:

MariaDB [test]> show tables;
Empty set (0.000 sec)

Restore original MySQL data in MariaDB:

$ mysql test<mysql_json_dump.sql 

Verification

$ mysql test -e "show create table mysql_json\G select * from mysql_json\G"
*************************** 1. row ***************************
       Table: mysql_json
Create Table: CREATE TABLE `mysql_json` (
  `t` json /* MySQL 5.7 */ CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
*************************** 1. row ***************************
t: {"city": ["Zenica", "Sarajevo", "Tuzla", "Mostar", "Banja Luka"], "team": ["Čelik", "Željezničar", "Sarajevo", "Sloboda", "Velež", "Zrinjski", "Borac"], "country": "Bosnia and Herzegovina", "faculty": ["Faculty of Mechanical Engineering in Zenica", "Faculty of Electrical Engineering Sarajevo", "Faculty of Electrical Engineering Tuzla"]}

Running mysql_upgrade force

The same as ALTER FORCE can be achieved with mysql_upgrade (below example performs upgrade on 3 MySQL tables):

$ mysql_upgrade test mysql_json
...
test.mysql_json                                    Needs upgrade
test.mysql_json_innodb
error    : Table rebuild required. Please do "ALTER TABLE `mysql_json_innodb` FORCE" or dump/reload to fix it!
test.tt
error    : Table rebuild required. Please do "ALTER TABLE `tt` FORCE" or dump/reload to fix it!

Repairing tables
test.mysql_json                                    OK
`test`.`mysql_json_innodb`                         FIXED
`test`.`tt`                                        FIXED

Exploring the JSON functions

After getting the table with JSON column in MariaDB, we can use a set of JSON functions to work with the JSON field.

Let’s create a temporary (session) table as a working table and use all data stored in the base table.

MariaDB [test]> create temporary table my_temp like mysql_json;
Query OK, 0 rows affected (0.000 sec)

MariaDB [test]> insert into my_temp(t) select * from mysql_json;
Query OK, 1 row affected (0.000 sec)
Records: 1  Duplicates: 0  Warnings: 0

Now let’s see JSON_DETAILED function in action as an pretty way to show JSON structure:

MariaDB [test]> select json_detailed(t) from my_temp\G
*************************** 1. row ***************************
json_detailed(t): {
    "city": 
    [
        "Zenica",
        "Sarajevo",
        "Tuzla",
        "Mostar",
        "Banja Luka"
    ],
    "team": 
    [
        "Čelik",
        "Željezničar",
        "Sarajevo",
        "Sloboda",
        "Velež",
        "Zrinjski",
        "Borac"
    ],
    "country": "Bosnia and Herzegovina",
    "faculty": 
    [
        "Faculty of Mechanical Engineering in Zenica",
        "Faculty of Electrical Engineering Sarajevo",
        "Faculty of Electrical Engineering Tuzla"
    ]
}
1 row in set (0.000 sec)

Let’s see how to use some data using JSON_EXTRACT function, where we get the data based on document path as well as applying indexing on the result set:

MariaDB [test]> select json_extract(t,'$.country'), json_extract(t,'$.city') from my_temp\G
*************************** 1. row ***************************
json_extract(t,'$.country'): "Bosnia and Herzegovina"
   json_extract(t,'$.city'): ["Zenica", "Sarajevo", "Tuzla", "Mostar", "Banja Luka"]

MariaDB [test]> select json_extract(t,'$.city[0]') from my_temp\G
*************************** 1. row ***************************
json_extract(t,'$.city[0]'): "Zenica"

Let’s try to insert the new row with valid JSON object and let’s use JSON_CONTAINS function to get that value:

MariaDB [test]> insert into my_temp values ('{"key1":"value1"}');
Query OK, 1 row affected (0.000 sec)
MariaDB [test]> select * from my_temp where json_contains(t, '"value1"', '$.key1');
+-------------------+
| t                 |
+-------------------+
| {"key1":"value1"} |
+-------------------+
1 row in set (0.000 sec)

Let’s try to insert the new row with not valid JSON object and let’s check with JSON_TYPE and JSON_KEYS function what have we inserted:

MariaDB [test]> insert into my_temp values ('{"key-wrong":}');
Query OK, 1 row affected (0.000 sec)

MariaDB [test]> select json_type(t) from my_temp;
+--------------+
| json_type(t) |
+--------------+
| OBJECT       |
| OBJECT       |
| OBJECT       |
+--------------+
3 rows in set (0.000 sec)

MariaDB [test]> select json_keys(t) from my_temp;
+----------------------------------------+
| json_keys(t)                           |
+----------------------------------------+
| ["city", "team", "country", "faculty"] |
| ["key1"]                               |
| NULL                                   |
+----------------------------------------+
3 rows in set, 1 warning (0.000 sec)

In order to prevent insertion of wrong JSON data we can use CHECK constraint for JSON field such that JSON_VALID function is implied.

Let’s first try to use CHECK constrained on the data set which already has polluted data, as show in the above example.

MariaDB [test]> alter table my_temp add check(json_valid(t));
ERROR 4025 (23000): CONSTRAINT `CONSTRAINT_1` failed for `test`.`my_temp`

As expected, the statement will fail since implied constraint is violated. In order to solve this we have to delete last row where JSON_KEY is NULL, since we know it is wrong row. After that alter table with CHECK constraint again and verify results.

MariaDB [test]> delete from my_temp where json_keys(t) is NULL;
Query OK, 1 row affected, 1 warning (0.000 sec)

MariaDB [test]> alter table my_temp add constraint c1 check(json_valid(t));
Query OK, 2 rows affected (0.001 sec)              
Records: 2  Duplicates: 0  Warnings: 0

MariaDB [test]> show create table my_temp\G
*************************** 1. row ***************************
       Table: my_temp
Create Table: CREATE TEMPORARY TABLE `my_temp` (
  `t` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
  CONSTRAINT `c1` CHECK (json_valid(`t`))
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.000 sec)

If we were using base table instead of temporary table, CHECK constraint could be visible in INFORMATION_SCHEMA.CHECK_CONSTRAINTS table.

Now if we try to insert a wrong data the CHECK constraint will be activated

MariaDB [test]> insert into my_temp values ('{"key-wrong":}');
ERROR 4025 (23000): CONSTRAINT `c1` failed for `test`.`my_temp`

Note that when adding a JSON field in MariaDB table CHECK constraint with JSON_VALID is automatically created.

MariaDB [test]> alter table x add column (j json);
Query OK, 0 rows affected (0.017 sec)

MariaDB [test]> show create table x\G
*************************** 1. row ***************************
       Table: x
Create Table: CREATE TABLE `x` (
  `x` int(11) DEFAULT NULL,
  `j` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`j`))
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.000 sec)

For more demo examples there is an video tutorial.

CONCLUSION

Practical examples are shown for a new mechanism to use with JSON types.

The purpose of such a mechanism is to provide MariaDB users an option for consistency of data and to decrease the difference in the confusing status of JSON between MySQL-MariaDB domains.

See also