MariaDB replication using containers

In this blog we are going to demonstrate how to replicate a MariaDB database that runs in a Docker container (let’s call it primary) to one or more MariaDB servers that run in a Docker container (let’s call them replicas) using binary logging, a method that creates binary log files and an index that contains the record of all changes to the database (both data and structure). You can find an overview of how replication works here and you can find how to setup replication here. An example can be found in this GitHub script.

Configure and start the primary

Before we start the primary container we first have to configure it.

Option file for primary

In my current directory I’m going to create a directory called config-files and inside create a directory called primarycnf that should have a file primary-1.cnf.

$ mkdir -p config-files/primarycnf
$ touch config-files/primarycnf/primary-1.cnf

Write the following into primary-1.cnf:

[mariadb]
log-bin                         # enable binary logging
server_id=3000                  # used to uniquely identify the server
log-basename=my-mariadb         # used to be independent of hostname changes 
                                # (otherwise name is <datadir>/mysql-bin)
#binlog-format=MIXED            #default 

Now let’s examine the actual config file: in order for replication to take place, one first needs to activate the binary log. This is controlled by the log-bin system variable that will create files that will be stored in the data directory path (datadir). Since the log-basename server variable is configured, the binary logs will be stored in files called my-mariadb-bin.<extension number> where the prefix -bin is added to the log-basename value, and the extension number is created (and incremented) when the server starts, logs are flushed, or the maximum binary log size is reached (max_binlog_size). Additionally, this option will create a binary log index my-mariadb-bin.index.

Each server must be uniquely represented in the replication architecture, and we use the server_id system variable for that.

You can change the binary log format if you want. Default is the mixed type, which offers the best of both ROW and STATEMENT-based. Safe statements are replicated via statement-based replication, saving space, while unsafe ones are replicated via ROW based, ensuring consistency across primary and replicas.

Initialization script for the primary container startup

We may execute custom SQL statements during container startup, and for that case we are going to create a folder called primaryinit with a file primaryinit.sql.

$ mkdir primaryinit
$ touch primaryinit/primaryinit.sql

The contents of primaryinit.sql:

CREATE USER 'repluser'@'%' IDENTIFIED BY 'replsecret';
GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'%';
CREATE DATABASE primary_db;

The above SQL will create a custom user with specific replication grants that our replicas are going to use to connect to the primary server. We will also create a sample database primary_db as a test database.

Start the primary container

Let’s write a command that will start the container that runs the primary MariaDB instance.

Below we are using a couple of volumes:

  1. configuration: The directory primarycnf, created above, will be mounted in the /etc/mysql/conf.d directory (all *.cnf files from this directory will be included by the global option file. See /etc/mysql/mariadb.cnf within the container for how that’s done).
  2. initial data: The initialization script that runs during container startup is passed in by mounting the primaryinit directory to the /docker-entrypoint-initdb.d directory
  3. data directory (optional): the data directory of the primary server will create a directory on the host log-files-primary/, corresponding to /var/lib/mysql within the container. It will be created during container startup on the host if it does not exist.

Additionally we added the -w option as a working directory in case you need to log into the container to check the binary logs, which is what we are going to do a bit later.

As environment variables, MARIADB_ROOT_PASSWORD is used to set the root user’s password as well as MYSQL_INITDB_SKIP_TZINFO environment variable is used to skip warnings in the logs. It is all based on the latest mariadb image and run in the background.

$ docker run -d --rm --name mariadb-primary \
-v $PWD/config-files/primarycnf:/etc/mysql/conf.d:z \
-v $PWD/primaryinit:/docker-entrypoint-initdb.d:z \
-v $PWD/log-files-primary:/var/lib/mysql \
-w /var/lib/mysql \
-e MARIADB_ROOT_PASSWORD=secret \
-e MYSQL_INITDB_SKIP_TZINFO=Y \
mariadb:latest

Check the primary logs (to see the entrypoint and that the replica is enabled)

$ docker logs mariadb-primary
...
022-04-05 15:16:06+00:00 [Note] [Entrypoint]: Temporary server started.
2022-04-05 15:16:06+00:00 [Note] [Entrypoint]: Securing system users (equivalent to running mysql_secure_installation)

2022-04-05 15:16:06+00:00 [Note] [Entrypoint]: /usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/primaryinit.sql
2022-04-05 15:16:06+00:00 [Note] [Entrypoint]: Stopping temporary server
...
2022-04-05 15:16:07 0 [Note] mariadbd: ready for connections.
Version: '10.7.3-MariaDB-1:10.7.3+maria~focal-log'  socket: '/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution

There are 2 binary logs created and an index.

# Log into the data directory and check for binary logs with log-basename
$ docker exec -it mariadb-primary bash
root@e591a4955306:/var/lib/mysql# ls |grep my-mariadb
my-mariadb-bin.000001
my-mariadb-bin.000002
my-mariadb-bin.index

Execute SQL statements on the container and check master status and binary logs.

$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "select version()"
+-----------------------------------------+
| version()                               |
+-----------------------------------------+
| 10.7.3-MariaDB-1:10.7.3+maria~focal-log |
+-----------------------------------------+
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "show databases like 'primary%'"
+---------------------+
| Database (primary%) |
+---------------------+
| primary_db          |
+---------------------+

# Check master status
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "show master status"
+-----------------------+----------+--------------+------------------+
| File                  | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-----------------------+----------+--------------+------------------+
| my-mariadb-bin.000002 |      347 |              |                  |
+-----------------------+----------+--------------+------------------+

# Show binary logs
$ docker exec mariadb-primary mariadb -uroot -psecret -e "show binary logs\G;"
*************************** 1. row ***************************
 Log_name: my-mariadb-bin.000001
File_size: 824
*************************** 2. row ***************************
 Log_name: my-mariadb-bin.000002
File_size: 347

Since we are running MariaDB 10.7, we can use the utility mariadb-binlog (previous mysqlbinlog) to check for binary logs (note that there is no need to set the path for the binary log since the working directory of the container is the data directory). You can read more about mariadb-binlog and the list of options.

# To check the binary log use mariadb-binlog utility
$ docker exec -it mariadb-primary mariadb-binlog my-mariadb-bin.000001

# Specify exact position of the binlog
$ docker exec -it mariadb-primary mariadb-binlog --start-position=702 --stop-position=703 --base64-output=never -d primary_db my-mariadb-bin.000001

Now we have completed the setup of our primary server!

Configure and start the replica

Option file for replica

As we have done with the primary, we need to configure secondary(ies) with option files. In the config-files directory, create the secondary-1 directory containing the file secondary-1.cnf.

$ mkdir -p config-files/secondary-1
$ touch config-files/secondary-1/secondary-1.cnf

Contents of secondary-1.cnf:

[mariadb]
server_id=3001                  # used to uniquely identify the server
log-basename=my-mariadb         # used to be independent of hostname changes 
                                # (otherwise name is <datadir>/mysql-bin)
replicate_do_db=primary_db      # replicate only this DB
#binlog-format=MIXED            #default 

Everything is the same as our primary configuration, except we are using a different server_id and we are pointing to a specific database we want to replicate with replicate_do_db system variable (this is an optional step – without it all databases will be replicated).

Initialization script for the replica container startup

Before proceeding with this step, one needs to find out the IP of the primary container, since we are using a virtual bridge network, where containers can be accessed only by their private IP. One needs to create a user-defined network to reference a container by its hostname. To find the IP, run following command

$ docker exec mariadb-primary cat /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.17.0.2	e591a4955306

Note IP 172.17.0.2 obtained.

Here we are going to create the script that will be an entrypoint for our replica container. Create the folder secondaryinit and inside it create a file replinit.sql that will have the replication command change master to (a full list of replication commands can be found here). Note that we used the IP of the primary container as a master_host option.

$ mkdir -p secondaryinit/
$ touch secondaryinit/replinit.sql

replinit.sql contents:

CHANGE MASTER TO
  MASTER_HOST='172.17.0.2',
  MASTER_USER='repluser',
  MASTER_PASSWORD='replsecret',
  MASTER_PORT=3306,
  MASTER_CONNECT_RETRY=10;

Start the replica container

To start the replica/secondary container, run

$ docker run -d --rm --name mariadb-secondary-1 \
-v $PWD/config-files/secondary-1:/etc/mysql/conf.d:z \
-v $PWD/secondaryinit:/docker-entrypoint-initdb.d:z \
-v $PWD/log-files-secondary-1:/var/lib/mysql \
-w /var/lib/mysql \
-e MARIADB_ROOT_PASSWORD=secret \
-e MYSQL_INITDB_SKIP_TZINFO=Y \
mariadb:latest

Check the volumes and compare with volumes for primary (similar, right?). Observe the logs created on secondary

$ docker logs mariadb-secondary-1
...
2022-04-05 15:34:20+00:00 [Note] [Entrypoint]: Temporary server started.
2022-04-05 15:34:20+00:00 [Note] [Entrypoint]: Securing system users (equivalent to running mysql_secure_installation)
2022-04-05 15:34:21+00:00 [Note] [Entrypoint]: /usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/replinit.sql
2022-04-05 15:34:21 5 [Note] Master connection name: ''  Master_info_file: 'master.info'  Relay_info_file: 'relay-log.info'
2022-04-05 15:34:21 5 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='', master_port='3306', master_log_file='', master_log_pos='4'. New state master_host='172.17.0.2', master_port='3306', master_log_file='', master_log_pos='4'.
2022-04-05 15:34:21+00:00 [Note] [Entrypoint]: Stopping temporary server
...
2022-04-05 15:34:22 4 [Note] Slave I/O thread: Start asynchronous replication to master 'repluser@172.17.0.2:3306' in log '' at position 4
2022-04-05 15:34:22 5 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './my-mariadb-relay-bin.000001' position: 4
2022-04-05 15:34:22 0 [Note] mariadbd: ready for connections.
Version: '10.7.3-MariaDB-1:10.7.3+maria~focal'  socket: '/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
2022-04-05 15:34:22 4 [Note] Slave I/O thread: connected to master 'repluser@172.17.0.2:3306',replication started in log 'FIRST' at position 4

So just from the logs we can find out about the status of our replication. We see that the replica is connected to the primary. We can also verify that from the primary’s log:

$ docker logs mariadb-primary
...
2022-04-05 15:34:22 7 [Note] Start binlog_dump to slave_server(3001), pos(, 4), using_gtid(0), gtid('')

To confirm that replication is indeed working, let’s check the existence of the primary_db database as well as the binary logs

# Check the database
$ docker exec -it mariadb-secondary-1 mariadb -uroot -psecret -e 'show databases like "primary%"'
+---------------------+
| Database (primary%) |
+---------------------+
| primary_db          |
+---------------------+

One can also run the command show slave status and get the following results:

$ docker exec -it mariadb-secondary-1 mariadb -uroot -psecret -e 'show slave status\G'
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 172.17.0.2
                   Master_User: repluser
                   Master_Port: 3306
                 Connect_Retry: 10
               Master_Log_File: my-mariadb-bin.000002
           Read_Master_Log_Pos: 347
                Relay_Log_File: my-mariadb-relay-bin.000004
                 Relay_Log_Pos: 651
         Relay_Master_Log_File: my-mariadb-bin.000002
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB: primary_db
                          [...]
           Exec_Master_Log_Pos: 347
                          [...]
       Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
                          [...]

In this case we don’t have binary logs on the replica, but the server can be primary and replica at the same time if we want. If we wanted to have binary logs on the replica, so the current replica may act as a primary for some other replica, we should set the log_bin option in the option file on our intermediate “primary” (secondary-1.cnf). If we wanted to have updates from the source primary saved in a replica binary log, we should configure the log_slave_updates system option for that.

Test the primary/replica

Let’s insert some data into the database on primary (names of those in the MariaDB Foundation team) and check the master status (note that our binlog position was 347).

# Insert data
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e 'USE primary_db; INSERT INTO primary_db.primary_tbl values  ("Anna"), ("Andreia"), ("Kaj"), ("Monty"), ("Ian"), ("Vicentiu"), ("Daniel"), ("Faustin"),("Vlad"),("Anel");'

# Check status 
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "show master status"
+-----------------------+----------+--------------+------------------+
| File                  | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-----------------------+----------+--------------+------------------+
| my-mariadb-bin.000002 |      817 |              |                  |
+-----------------------+----------+--------------+------------------+

Check the data on the replica

$ docker exec -it mariadb-secondary-1 mariadb -uroot -psecret -e 'select * from primary_db.primary_tbl;'
+----------+
| name     |
+----------+
| Anna     |
| Andreia  |
| Kaj      |
| Monty    |
| Ian      |
| Vicentiu |
| Daniel   |
| Faustin  |
| Vlad     |
| Anel     |
+----------+

You can find the replication example on GitHub.

That’s it. The same works if you add more than one replica. Actually, you can try it as a homework assignment 🙂 (hint: you have it in the script on GitHub).

Conclusion and future work

The blog showed how to perform Primary/Replica replication using binary logging locally with containers. In some of the later blogs we will show how to use Global transaction ID (GTID).

You are welcome to chat about it on Zulip.