MariaDB semi-sync replication using containers

In the blog MariaDB replication using containers, we showed how to properly replicate data in MariaDB using Docker containers.
We used standard or asynchronous or lazy replication.


The problem with this type of replication is potential data loss if the primary goes down. Transactions that are committed on the primary are not being sent to replicas, and the replica doesn’t commit the changes. Failover from primary to replica in this case may lead to missing transactions relative to the primary.


To overcome these type of errors, there is semi-sync replication that is integrated into MariaDB since 10.4 and fully sync replication, which we plan to implement eventually as MDEV-19140. Another option is to use Galera.
While in fully sync replication all replicas have to verify a single transaction before the primary returns to the session that executed the transaction, in semi-sync replication at least one replica needs to acknowledge that the transaction event is transmitted successfully. That gives faster replication compared to fully-sync replication.
Compared to async replication, semi-sync replication provides improved data integrity, because when the primary receives acknowledgment from at least one replica and commits the changes, we can be sure that the data exists in at least 2 places.


In this blog we first explore:

  1. how to setup semi-sync replication with demo example.

After that, for more curious readers, we will visualize the following in the form of graphs:

  1. Standard replication configuration
  2. Standard replication transaction example
  3. Semi-sync replication configuration
  4. Semi-sync replication transaction example

1. Semi-sync demo example with containers

  • In this example we will be using stateless application, just as a proof of concept.
  • There is a Docker compose file in our mariadb-docker GitHub repository.
    • The main point to enable semi-sync on the primary is to set --rpl_semi_sync_master_enabled and enable semi-sync on replicas to set --rpl_semi_sync_slave_enabled.
  • We will be using GTIDs (that are enabled automatically) and will show how they change in the example below.
    • This way replication will start at the position of the last GTID replicated to the replica (seen from the gtid_slave_pos system variable).
  • This section consists of the following:
    1.1 Create and check containers
    1.2 Check primary
    1.3 Check binary logs
    1.4 Check replica[s]
    1.5 Start replicating

1.1 Create and check containers

  • Start containers
$ docker compose -f compose-replication-semisync.yml up
  • Check container statuses (make sure that they are healthy)
$ docker ps
CONTAINER ID   IMAGE          COMMAND                  CREATED         STATUS                          PORTS      NAMES
1ddb1fa67fef   mariadb:lts    "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes (healthy)          3306/tcp   mariadb-replica-2
dca7a79526ca   mariadb:lts    "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes (healthy)          3306/tcp   mariadb-replica-1
addabd306bb4   mariadb:lts    "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes (healthy)          3306/tcp   mariadb-primary
  • Check logs

1.2 Check primary

  • Check that semi-sync is enabled using the mariadb client
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "select @@rpl_semi_sync_master_enabled;"
+--------------------------------+
| @@rpl_semi_sync_master_enabled |
+--------------------------------+
|                              1 |
+--------------------------------+
  • Check the rpl_semi_sync_master_timeout variable (default 10 [s]). When this time elapses and the primary doesn’t get any acknowledgment from the replica, it switches back to asynchronous replication.
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "select @@rpl_semi_sync_master_timeout;"
+--------------------------------+
| @@rpl_semi_sync_master_timeout |
+--------------------------------+
|                          10000 |
+--------------------------------+
  • Check primary status (check position in the binlog as well as the name of the binlog).
$ docker exec mariadb-primary mariadb -uroot -psecret -e "show master status\G;"
*************************** 1. row ***************************
            File: mariadb-bin.000002
        Position: 344
    Binlog_Do_DB: 
Binlog_Ignore_DB:
  • Check databases for replication
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "show databases like '%test%'";
+-------------------+
| Database (%test%) |
+-------------------+
| testdb            |
+-------------------+

1.3 Check binary logs

  • There are 2 binary logs and an index file
$ docker exec --workdir /var/lib/mysql mariadb-primary bash -c "ls maria*"
mariadb-bin.000001
mariadb-bin.000002
mariadb-bin.index
  • The same can be seen from the mariadb client
$ docker exec mariadb-primary mariadb -uroot -psecret -e "show binary logs\G;"
*************************** 1. row ***************************
 Log_name: mariadb-bin.000001
File_size: 1166
*************************** 2. row ***************************
 Log_name: mariadb-bin.000002
File_size: 344

You are free to inspect the binary logs with the mariadb-binlog client.

  • Example of fresh start (comments will be excluded later from example)
$ docker exec mariadb-primary mariadb-binlog /var/lib/mysql/mariadb-bin.000002
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#231123 12:21:47 server id 1  end_log_pos 256 CRC32 0x21866123 	Start: binlog v 4, server v 10.11.6-MariaDB-1:10.11.6+maria~ubu2204-log created 231123 12:21:47 at startup
# Warning: this binlog is either in use or was not closed properly.
ROLLBACK/*!*/;
BINLOG '
20NfZQ8BAAAA/AAAAAABAAABAAQAMTAuMTEuNi1NYXJpYURCLTE6MTAuMTEuNittYXJpYX51YnUy
MjA0LWxvZwAAAAAAAADbQ19lEzgNAAgAEgAEBAQEEgAA5AAEGggAAAAICAgCAAAACgoKAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAEEwQADQgICAoKCgEjYYYh
'/*!*/;
# at 256
#231123 12:21:47 server id 1  end_log_pos 299 CRC32 0xdd5f27d6 	Gtid list [0-1-5]
# at 299
#231123 12:21:47 server id 1  end_log_pos 344 CRC32 0xca6cbf75 	Binlog checkpoint mariadb-bin.000002
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
  • When creating the tables and inserting new values, there will be new changes in this file.

1.4 Check replica[s]

  • Check replica’s semi-sync status (let’s check just a single replica. For the second replica, the same should happen)
$ docker exec -it mariadb-replica-2 mariadb -uroot -psecret -e "select @@rpl_semi_sync_slave_enabled;"
+-------------------------------+
| @@rpl_semi_sync_slave_enabled |
+-------------------------------+
|                             1 |
+-------------------------------+
  • Check replicated database
$ docker exec -it mariadb-replica-1 mariadb -uroot -psecret -e "show databases like '%test%'";
+-------------------+
| Database (%test%) |
+-------------------+
| testdb            |
+-------------------+

$ docker exec -it mariadb-replica-2 mariadb -uroot -psecret -e "show databases like '%test%'";
+-------------------+
| Database (%test%) |
+-------------------+
| testdb            |
+-------------------+

1.5 Start replicating

  • We will perform 2 transactions.
  • For each transaction we will check
    • the primary status query (status below),
    • validate the binlog using the mariadb-binlog client (binlog below),
    • results of replication, validated on the replica (replica below).
  • Transaction 1: create table t(t int)
    • Status (we get the new name of the file and the current position in the binlog)
MariaDB [testdb]> show master status;
+--------------------+----------+--------------+------------------+
| File               | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000003 |      485 |              |                  |
+--------------------+----------+--------------+------------------+
1 row in set (0.000 sec)
  • Transaction 1: create table t(t int)
    • Binlog
$ docker exec mariadb-primary mariadb-binlog /var/lib/mysql/mariadb-bin.000003
#231123 12:57:18 server id 1  end_log_pos 299 CRC32 0x0f7f30b3 	Gtid list [0-1-5]
# at 299
#231123 12:57:18 server id 1  end_log_pos 344 CRC32 0x263247e6 	Binlog checkpoint mariadb-bin.000003
# at 344
#231123 14:05:49 server id 1  end_log_pos 386 CRC32 0x9f9ba209 	GTID 0-1-6 ddl
...
create table t(t int)
...
  • Transaction 1: create table t(t int)
    • Replica – we see the table is replicated and the IO/SQL threads are running (result from show replica hosts).
$ docker exec -it mariadb-replica-2 mariadb -uroot -psecret -e "use testdb; show tables; show create table t;"
+------------------+
| Tables_in_testdb |
+------------------+
| t                |
+------------------+
+-------+------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                     |
+-------+------------------------------------------------------------------------------------------------------------------+
| t     | CREATE TABLE `t` (
`t` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci |
+-------+------------------------------------------------------------------------------------------------------------------+
MariaDB [(none)]> show replica status \G
*************************** 1. row ***************************
            Slave_IO_State: Waiting for master to send event
            Master_Host: mariadb-primary
            Master_User: repl
            Master_Port: 3306
            Connect_Retry: 10
        Master_Log_File: mariadb-bin.000003
    Read_Master_Log_Pos: 485
            Relay_Log_File: mariadb-relay-bin.000002
            Relay_Log_Pos: 786
    Relay_Master_Log_File: mariadb-bin.000003
        Slave_IO_Running: Yes
        Slave_SQL_Running: Yes
    Replicate_Rewrite_DB: 
        Replicate_Do_DB: 
    Replicate_Ignore_DB: 
        Replicate_Do_Table: 
    Replicate_Ignore_Table: 
Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                Last_Errno: 0
                Last_Error: 
            Skip_Counter: 0
    Exec_Master_Log_Pos: 485
        Relay_Log_Space: 1097
        Until_Condition: None
            Until_Log_File: 
            Until_Log_Pos: 0
        Master_SSL_Allowed: No
        Master_SSL_CA_File: 
        Master_SSL_CA_Path: 
        Master_SSL_Cert: 
        Master_SSL_Cipher: 
            Master_SSL_Key: 
    Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
            Last_IO_Errno: 0
            Last_IO_Error: 
            Last_SQL_Errno: 0
            Last_SQL_Error: 
Replicate_Ignore_Server_Ids: 
        Master_Server_Id: 1
            Master_SSL_Crl: 
        Master_SSL_Crlpath: 
                Using_Gtid: Slave_Pos
            Gtid_IO_Pos: 0-1-6
Replicate_Do_Domain_Ids: 
Replicate_Ignore_Domain_Ids: 
            Parallel_Mode: optimistic
                SQL_Delay: 0
    SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
        Slave_DDL_Groups: 1
Slave_Non_Transactional_Groups: 0
Slave_Transactional_Groups: 0
1 row in set (0.000 sec)
  • Transaction 2: insert into t values (1),(2),(314),(1618);
    • Status
MariaDB [testdb]> show master status;
+--------------------+----------+--------------+------------------+
| File               | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000003 |      668 |              |                  |
+--------------------+----------+--------------+------------------+
1 row in set (0.000 sec)
  • Transaction 2: insert into t values (1),(2),(314),(1618);
    • Binlog (we see the incremented sequence number in the GTID data stream (<domain_id-<server_id>-<seq_no>) for the new transaction)
# at 485
#231123 14:46:19 server id 1  end_log_pos 527 CRC32 0x9ef73274 	GTID 0-1-7 trans
START TRANSACTION
..
insert into t values (1),(2),(314),(1618)
...
  • Transaction 2: insert into t values (1),(2),(314),(1618);
    • Replica (check replicated data)
$ docker exec -it mariadb-replica-2 mariadb -uroot -psecret -e "select * from testdb.t"
+------+
| t    |
+------+
|    1 |
|    2 |
|  314 |
| 1618 |
+------+
# Replica status changes
MariaDB [(none)]> show replica status \G
                Slave_IO_State: Waiting for master to send event
               Master_Log_File: mariadb-bin.000003
           Read_Master_Log_Pos: 668
                Relay_Log_File: mariadb-relay-bin.000002
                 Relay_Log_Pos: 969
           Exec_Master_Log_Pos: 668
               Relay_Log_Space: 1280
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-7
       Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

2. Standard replication configuration

  • Here we will visualize how standard replication is configured and later compare with semi-sync configuration.
  • To configure the standard replication implemented in the previous blog was straight forward

3. Standard replication transaction

  • Here we will visualize the flow of standard replication transactions and will later compare with semi-sync transactions.
  • On the thread level (see replication-threads),we can express the flow of active transactions as follows:
  1. Client makes a transaction
  2. IO threads notify binlog-dump threads that they are ready for events
  3. Binlog-dump thread dumps the event[s]
  4. IO threads write in the relay log and update the master.info file
  5. SQL threads read from the relay log and execute events.
  • The type of replication is asynchronous, which means that we don’t have any feedback from the replicas that the event has been successfully received by the replica, as can be seen from the picture.

4. Semi-sync replication configuration

Before confirming the client request, at least one replica just has to confirm receipt of the data changes (IO thread), not that the replica has actually applied those data changes. To configure semi-sync replication, we need to stop the replicas and set environment variables on the primary and replicas.

On the primary, set rpl_semi_sync_master_enabled and on the replicas, set rpl_semi_sync_slave_enabled. If things are correctly set, rpl_semi_sync_master_status and rpl_semi_sync_slave_status should be on for their respective servers.

We will just show updated configuration compared to section 2.

5.  Semi-sync replication transaction example

Semi-sync should overcome that problem by introducing an additional primary thread, called “ACK Receiver Thread”. Only one replica needs to confirm that it has received and logged the events, as shown on the following picture.

The process is as follows:

  1. The client makes a transaction, the connection is suspended until ACK is received [semi-sync]
  2. Binlog-dump threads set notification to the ACK thread to accept the ACK from the related replica [semi-sync]
  3. IO threads notify binlog-dump threads that they are ready for events
  4. Binlog-dump thread dumps the event[s]
  5. IO threads write in relay log and update master.info file
  6. IO threads notify ACK thread that the event is received. If there is no ACK before the configured timeout, semi-sync switches to async [semi-sync]
  7. ACK thread unsuspends the client connection.The client is ready for new transactions [semi-sync]
  8. SQL threads read from relay log and execute events.

Conclusion and further readings

Special thanks Brandon Nesterenko, Daniel Black and Ian Gilfillan for reviewing this blog post. To get more information about the state of the replicas, one can use show replica hosts (we are preparing MDEV-21322 to allow more insights for semi-sync states).

If you come across any problems in this blog, with the design, or edge cases that don’t work as expected, please let us know. You are welcome to chat about it on Zulip. As always you can use our JIRA bug/feature request in the MDEV project for any bug/feature request you may encounter.