MariaDB semi-sync replication using containers
In the blog MariaDB replication using containers, we showed how to properly replicate data in MariaDB using Docker containers.
We used standard or asynchronous or lazy replication.
The problem with this type of replication is potential data loss if the primary goes down. Transactions that are committed on the primary are not being sent to replicas, and the replica doesn’t commit the changes. Failover from primary to replica in this case may lead to missing transactions relative to the primary.
To overcome these type of errors, there is semi-sync replication that is integrated into MariaDB since 10.4 and fully sync replication, which we plan to implement eventually as MDEV-19140. Another option is to use Galera.
While in fully sync replication all replicas have to verify a single transaction before the primary returns to the session that executed the transaction, in semi-sync replication at least one replica needs to acknowledge that the transaction event is transmitted successfully. That gives faster replication compared to fully-sync replication.
Compared to async replication, semi-sync replication provides improved data integrity, because when the primary receives acknowledgment from at least one replica and commits the changes, we can be sure that the data exists in at least 2 places.
In this blog we first explore:
- how to setup semi-sync replication with demo example.
After that, for more curious readers, we will visualize the following in the form of graphs:
- Standard replication configuration
- Standard replication transaction example
- Semi-sync replication configuration
- Semi-sync replication transaction example
1. Semi-sync demo example with containers
- In this example we will be using stateless application, just as a proof of concept.
- There is a Docker compose file in our mariadb-docker GitHub repository.
- The main point to enable semi-sync on the primary is to set
--rpl_semi_sync_master_enabled
and enable semi-sync on replicas to set--rpl_semi_sync_slave_enabled
.
- The main point to enable semi-sync on the primary is to set
- We will be using GTIDs (that are enabled automatically) and will show how they change in the example below.
- This way replication will start at the position of the last GTID replicated to the replica (seen from the
gtid_slave_pos
system variable).
- This way replication will start at the position of the last GTID replicated to the replica (seen from the
- This section consists of the following:
1.1 Create and check containers
1.2 Check primary
1.3 Check binary logs
1.4 Check replica[s]
1.5 Start replicating
1.1 Create and check containers
- Start containers
$ docker compose -f compose-replication-semisync.yml up
- Check container statuses (make sure that they are healthy)
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1ddb1fa67fef mariadb:lts "docker-entrypoint.s…" 2 minutes ago Up 2 minutes (healthy) 3306/tcp mariadb-replica-2
dca7a79526ca mariadb:lts "docker-entrypoint.s…" 2 minutes ago Up 2 minutes (healthy) 3306/tcp mariadb-replica-1
addabd306bb4 mariadb:lts "docker-entrypoint.s…" 2 minutes ago Up 2 minutes (healthy) 3306/tcp mariadb-primary
- Check logs
1.2 Check primary
- Check that semi-sync is enabled using the
mariadb
client
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "select @@rpl_semi_sync_master_enabled;"
+--------------------------------+
| @@rpl_semi_sync_master_enabled |
+--------------------------------+
| 1 |
+--------------------------------+
- Check the rpl_semi_sync_master_timeout variable (default 10 [s]). When this time elapses and the primary doesn’t get any acknowledgment from the replica, it switches back to asynchronous replication.
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "select @@rpl_semi_sync_master_timeout;"
+--------------------------------+
| @@rpl_semi_sync_master_timeout |
+--------------------------------+
| 10000 |
+--------------------------------+
- Check primary status (check position in the binlog as well as the name of the binlog).
$ docker exec mariadb-primary mariadb -uroot -psecret -e "show master status\G;"
*************************** 1. row ***************************
File: mariadb-bin.000002
Position: 344
Binlog_Do_DB:
Binlog_Ignore_DB:
- Check databases for replication
$ docker exec -it mariadb-primary mariadb -uroot -psecret -e "show databases like '%test%'";
+-------------------+
| Database (%test%) |
+-------------------+
| testdb |
+-------------------+
1.3 Check binary logs
- There are 2 binary logs and an index file
$ docker exec --workdir /var/lib/mysql mariadb-primary bash -c "ls maria*"
mariadb-bin.000001
mariadb-bin.000002
mariadb-bin.index
- The same can be seen from the
mariadb
client
$ docker exec mariadb-primary mariadb -uroot -psecret -e "show binary logs\G;"
*************************** 1. row ***************************
Log_name: mariadb-bin.000001
File_size: 1166
*************************** 2. row ***************************
Log_name: mariadb-bin.000002
File_size: 344
You are free to inspect the binary logs with the mariadb-binlog
client.
- Example of fresh start (comments will be excluded later from example)
$ docker exec mariadb-primary mariadb-binlog /var/lib/mysql/mariadb-bin.000002
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#231123 12:21:47 server id 1 end_log_pos 256 CRC32 0x21866123 Start: binlog v 4, server v 10.11.6-MariaDB-1:10.11.6+maria~ubu2204-log created 231123 12:21:47 at startup
# Warning: this binlog is either in use or was not closed properly.
ROLLBACK/*!*/;
BINLOG '
20NfZQ8BAAAA/AAAAAABAAABAAQAMTAuMTEuNi1NYXJpYURCLTE6MTAuMTEuNittYXJpYX51YnUy
MjA0LWxvZwAAAAAAAADbQ19lEzgNAAgAEgAEBAQEEgAA5AAEGggAAAAICAgCAAAACgoKAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAEEwQADQgICAoKCgEjYYYh
'/*!*/;
# at 256
#231123 12:21:47 server id 1 end_log_pos 299 CRC32 0xdd5f27d6 Gtid list [0-1-5]
# at 299
#231123 12:21:47 server id 1 end_log_pos 344 CRC32 0xca6cbf75 Binlog checkpoint mariadb-bin.000002
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
- When creating the tables and inserting new values, there will be new changes in this file.
1.4 Check replica[s]
- Check replica’s semi-sync status (let’s check just a single replica. For the second replica, the same should happen)
$ docker exec -it mariadb-replica-2 mariadb -uroot -psecret -e "select @@rpl_semi_sync_slave_enabled;"
+-------------------------------+
| @@rpl_semi_sync_slave_enabled |
+-------------------------------+
| 1 |
+-------------------------------+
- Check replicated database
$ docker exec -it mariadb-replica-1 mariadb -uroot -psecret -e "show databases like '%test%'";
+-------------------+
| Database (%test%) |
+-------------------+
| testdb |
+-------------------+
$ docker exec -it mariadb-replica-2 mariadb -uroot -psecret -e "show databases like '%test%'";
+-------------------+
| Database (%test%) |
+-------------------+
| testdb |
+-------------------+
1.5 Start replicating
- We will perform 2 transactions.
- For each transaction we will check
- the primary status query (status below),
- validate the binlog using the
mariadb-binlog
client (binlog below), - results of replication, validated on the replica (replica below).
- Transaction 1:
create table t(t int)
- Status (we get the new name of the file and the current position in the binlog)
MariaDB [testdb]> show master status;
+--------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000003 | 485 | | |
+--------------------+----------+--------------+------------------+
1 row in set (0.000 sec)
- Transaction 1:
create table t(t int)
- Binlog
$ docker exec mariadb-primary mariadb-binlog /var/lib/mysql/mariadb-bin.000003
#231123 12:57:18 server id 1 end_log_pos 299 CRC32 0x0f7f30b3 Gtid list [0-1-5]
# at 299
#231123 12:57:18 server id 1 end_log_pos 344 CRC32 0x263247e6 Binlog checkpoint mariadb-bin.000003
# at 344
#231123 14:05:49 server id 1 end_log_pos 386 CRC32 0x9f9ba209 GTID 0-1-6 ddl
...
create table t(t int)
...
- Transaction 1:
create table t(t int)
- Replica – we see the table is replicated and the IO/SQL threads are running (result from
show replica hosts
).
- Replica – we see the table is replicated and the IO/SQL threads are running (result from
$ docker exec -it mariadb-replica-2 mariadb -uroot -psecret -e "use testdb; show tables; show create table t;"
+------------------+
| Tables_in_testdb |
+------------------+
| t |
+------------------+
+-------+------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+------------------------------------------------------------------------------------------------------------------+
| t | CREATE TABLE `t` (
`t` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci |
+-------+------------------------------------------------------------------------------------------------------------------+
MariaDB [(none)]> show replica status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: mariadb-primary
Master_User: repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mariadb-bin.000003
Read_Master_Log_Pos: 485
Relay_Log_File: mariadb-relay-bin.000002
Relay_Log_Pos: 786
Relay_Master_Log_File: mariadb-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Rewrite_DB:
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 485
Relay_Log_Space: 1097
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-1-6
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: optimistic
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Slave_DDL_Groups: 1
Slave_Non_Transactional_Groups: 0
Slave_Transactional_Groups: 0
1 row in set (0.000 sec)
- Transaction 2:
insert into t values (1),(2),(314),(1618);
- Status
MariaDB [testdb]> show master status;
+--------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000003 | 668 | | |
+--------------------+----------+--------------+------------------+
1 row in set (0.000 sec)
- Transaction 2:
insert into t values (1),(2),(314),(1618);
- Binlog (we see the incremented sequence number in the GTID data stream (<domain_id-<server_id>-<seq_no>) for the new transaction)
# at 485
#231123 14:46:19 server id 1 end_log_pos 527 CRC32 0x9ef73274 GTID 0-1-7 trans
START TRANSACTION
..
insert into t values (1),(2),(314),(1618)
...
- Transaction 2:
insert into t values (1),(2),(314),(1618);
- Replica (check replicated data)
$ docker exec -it mariadb-replica-2 mariadb -uroot -psecret -e "select * from testdb.t"
+------+
| t |
+------+
| 1 |
| 2 |
| 314 |
| 1618 |
+------+
# Replica status changes
MariaDB [(none)]> show replica status \G
Slave_IO_State: Waiting for master to send event
Master_Log_File: mariadb-bin.000003
Read_Master_Log_Pos: 668
Relay_Log_File: mariadb-relay-bin.000002
Relay_Log_Pos: 969
Exec_Master_Log_Pos: 668
Relay_Log_Space: 1280
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-1-7
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
2. Standard replication configuration
- Here we will visualize how standard replication is configured and later compare with semi-sync configuration.
- To configure the standard replication implemented in the previous blog was straight forward
3. Standard replication transaction
- Here we will visualize the flow of standard replication transactions and will later compare with semi-sync transactions.
- On the thread level (see replication-threads),we can express the flow of active transactions as follows:
- Client makes a transaction
- IO threads notify binlog-dump threads that they are ready for events
- Binlog-dump thread dumps the event[s]
- IO threads write in the relay log and update the master.info file
- SQL threads read from the relay log and execute events.
- The type of replication is asynchronous, which means that we don’t have any feedback from the replicas that the event has been successfully received by the replica, as can be seen from the picture.
4. Semi-sync replication configuration
Before confirming the client request, at least one replica just has to confirm receipt of the data changes (IO thread
), not that the replica has actually applied those data changes. To configure semi-sync replication, we need to stop the replicas and set environment variables on the primary and replicas.
On the primary, set rpl_semi_sync_master_enabled
and on the replicas, set rpl_semi_sync_slave_enabled
. If things are correctly set, rpl_semi_sync_master_status
and rpl_semi_sync_slave_status
should be on for their respective servers.
We will just show updated configuration compared to section 2.
5. Semi-sync replication transaction example
Semi-sync should overcome that problem by introducing an additional primary thread, called “ACK Receiver Thread”. Only one replica needs to confirm that it has received and logged the events, as shown on the following picture.
The process is as follows:
- The client makes a transaction, the connection is suspended until ACK is received [semi-sync]
- Binlog-dump threads set notification to the ACK thread to accept the ACK from the related replica [semi-sync]
- IO threads notify binlog-dump threads that they are ready for events
- Binlog-dump thread dumps the event[s]
- IO threads write in relay log and update master.info file
- IO threads notify ACK thread that the event is received. If there is no ACK before the configured timeout, semi-sync switches to async [semi-sync]
- ACK thread unsuspends the client connection.The client is ready for new transactions [semi-sync]
- SQL threads read from relay log and execute events.
Conclusion and further readings
Special thanks Brandon Nesterenko, Daniel Black and Ian Gilfillan for reviewing this blog post. To get more information about the state of the replicas, one can use show replica hosts
(we are preparing MDEV-21322 to allow more insights for semi-sync states).
If you come across any problems in this blog, with the design, or edge cases that don’t work as expected, please let us know. You are welcome to chat about it on Zulip. As always you can use our JIRA bug/feature request in the MDEV project for any bug/feature request you may encounter.
Hi, thanks for the detail information.
I have followed the instructions and am getting the below error; however, when I tried the isql command, it worked. Any suggestions?
ERROR 1105 (HY000): SQLDriverConnect: [unixODBC][Driver Manager]Can’t open lib ‘/usr/lib/oracle/12.1/client64/lib/libsqora.so.12.1’ : file not found