MariaDB Eventually Consistent?

Background

Eventual consistency is a consistency model used in many large distributed databases. It requires that all changes to a replicated piece of data eventually reach all affected replicas; conflict resolution is not handled by the database, and in the event of conflicting updates the responsibility is pushed up to the application author [13].

Eventual consistency is a specific form of weak consistency; the storage system guarantees that if no new updates are made to the object, eventually all accesses will return the last updated value [14]. If no failures occur, the maximum size of the inconsistency window can be determined based on factors such as communication delays, the load on the system, and the number of replicas involved in the replication scheme [3].

A few examples of eventually consistent systems:

  • DNS
  • Asynchronous master/slave replication on an RDBMS e.g. MariaDB
  • memcached in front of MariaDB, caching reads

Eventual consistency states that in an updatable replicated database, eventually all copies of each data item converge to the same value. The origin of eventual consistency can be traced back to Thomas’ majority consensus algorithm [12]. The term was coined by Terry et al. [11] and later popularized by Amazon in their Dynamo system, which supported only eventual consistency [7].

The CAP theorem, also called Brewer’s theorem after its author Dr. Eric A. Brewer, was introduced at PODC 2000 [4, 5]. The theorem was formally proven by Gilbert and Lynch [8]. Brewer introduced consistency, availability and partition tolerance as three desired properties of any shared-data system and made the conjecture that at most two of them can be guaranteed at any one time [6].

In general, this theorem matches the needs of today’s Internet systems well. Ideally we expect a service to be available for the whole time that it is connected to the network/Internet [1]; thus, if a network connection is available, the service should be available as well [9,10]. To achieve good performance, requests need to be processed by a distributed system. If the number of servers is increased, the probability of a server or network failure also increases. A system therefore needs to take this into account and be designed in such a way that these failures are transparent to the client and their impact is minimized [2]. The abbreviation of the theorem comes from these three properties:

  • Consistency: This property requires that each operation executed in a distributed system, where data is spread among many servers, ends with the same result as if it were executed on a single server holding all the data.
  • Availability: This property requires that sending a request to any functional node is enough for the requester to get a response. A system complying with this property tolerates the failure of individual nodes, caused, for instance, by network throughput issues.
  • Partition Tolerance: A distributed system consists of many servers interconnected by a network. A frequent requirement is to distribute the system across several data centers to eliminate the failure of any one of them. Since failures are frequent during network communication, a system needs to be resilient to an arbitrary number of lost messages between servers. A temporary communication interruption within a set of servers must not cause the whole system to respond incorrectly [9].

Eventual consistency is defined as follows:

Definition 1: Eventual consistency.

  • Eventual delivery: An update executed at one node eventually executes at all nodes.
  • Termination: All update executions terminate.
  • Convergence: Nodes that have executed the same updates eventually reach an equivalent state (and stay in it).

Example 1: Consider a case where the data item R=0 is stored on all three nodes. Assume that we have the following sequence of writes and commits on node0: W(R=3) C W(R=5) C W(R=7) C. Now reads from node1 could return R=3 and reads from node2 could return R=5. This is eventually consistent as long as reads from all nodes eventually return the same value. Note that this final value could be R=3; eventual consistency does not restrict the order in which the writes must be executed.
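For concreteness, Example 1 could be rendered in SQL roughly as follows. This is only an illustration of the abstract sequence; the table r and column val are hypothetical and not part of the demonstration below.

-- on node0: the sequence W(R=3) C W(R=5) C W(R=7) C
CREATE TABLE r (id INT PRIMARY KEY, val INT) ENGINE=InnoDB;
SET autocommit = 0;
INSERT INTO r VALUES (1, 0); COMMIT;          -- R=0 initially on all nodes
UPDATE r SET val = 3 WHERE id = 1; COMMIT;    -- W(R=3) C
UPDATE r SET val = 5 WHERE id = 1; COMMIT;    -- W(R=5) C
UPDATE r SET val = 7 WHERE id = 1; COMMIT;    -- W(R=7) C

-- meanwhile, before replication has caught up:
-- on node1: SELECT val FROM r WHERE id = 1;  -- may still return 3
-- on node2: SELECT val FROM r WHERE id = 1;  -- may still return 5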

MariaDB Demonstration

As already stated, a normal master/slave setup on MariaDB is eventually consistent. In this article we are interested in a multi-master setup. We will use MariaDB 10.0. There are several possible topologies that could be considered here, but we have selected a ring topology (see Figure 1).


Figure 1: MariaDB ring topology.

In this topology Node0 is a master and Node1 is a slave of Node0. Similarly, Node2 is a slave of Node1, and Node0 is in turn a slave of Node2, closing the ring. Let’s start the configuration of the nodes with Node0:

[mysqld]
port = 4000
socket=/home/jan/mysql/mm/node0/mysql.4000.sock
datadir=/home/jan/mysql/mm/node0
basedir=/usr/local/mysql
log_error=/home/jan/mysql/mm/node0/error.log
binlog_format=ROW
log-bin
log-slave-updates=1
gtid-strict-mode=1
server_id=1
gtid-domain-id=1

Similarly Node1:

[mysqld]
port = 5000
socket=/home/jan/mysql/mm/node1/mysql.5000.sock
datadir=/home/jan/mysql/mm/node1
basedir=/usr/local/mysql
log_error=/home/jan/mysql/mm/node1/error.log
binlog_format=ROW
log-bin
log-slave-updates=1
gtid-strict-mode=1
server_id=2
gtid-domain-id=2

And finally Node2:

[mysqld]
port = 6000
socket=/home/jan/mysql/mm/node2/mysql.6000.sock
datadir=/home/jan/mysql/mm/node2
basedir=/usr/local/mysql
log_error=/home/jan/mysql/mm/node2/error.log
binlog_format=ROW
log-bin
log-slave-updates=1
gtid-strict-mode=1
server_id=3
gtid-domain-id=3

After this is done we can install the MariaDB databases and start the servers.

/usr/local/mysql/scripts/mysql_install_db \
  --datadir=/home/jan/mysql/mm/node0 \
  --defaults-file=/home/jan/mysql/mm/node0/my.cnf \
  --user=jan --basedir=/usr/local/mysql

/usr/local/mysql/scripts/mysql_install_db \
  --datadir=/home/jan/mysql/mm/node1 \
  --defaults-file=/home/jan/mysql/mm/node1/my.cnf \
  --user=jan --basedir=/usr/local/mysql

/usr/local/mysql/scripts/mysql_install_db \
  --datadir=/home/jan/mysql/mm/node2 \
  --defaults-file=/home/jan/mysql/mm/node2/my.cnf \
  --user=jan --basedir=/usr/local/mysql

/usr/local/mysql/bin/mysqld --defaults-file=/home/jan/mysql/mm/node0/my.cnf &
/usr/local/mysql/bin/mysqld --defaults-file=/home/jan/mysql/mm/node1/my.cnf &
/usr/local/mysql/bin/mysqld --defaults-file=/home/jan/mysql/mm/node2/my.cnf &
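Before continuing, it is worth checking that all three servers answer on their sockets; for example (a quick sanity check, not part of the original procedure):

/usr/local/mysql/bin/mysqladmin -u root -S /home/jan/mysql/mm/node0/mysql.4000.sock ping
/usr/local/mysql/bin/mysqladmin -u root -S /home/jan/mysql/mm/node1/mysql.5000.sock ping
/usr/local/mysql/bin/mysqladmin -u root -S /home/jan/mysql/mm/node2/mysql.6000.sock ping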

Now that the servers are up and running, let’s set up Node0 as the first master node:

jan-GE70-0NC-0ND ~ $ /usr/local/mysql/bin/mysql -u root \
> -S /home/jan/mysql/mm/node0/mysql.4000.sock 

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.0.10-MariaDB-log Source distribution

Copyright (c) 2000, 2014, Oracle, SkySQL Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use test;
Database changed
MariaDB [test]> SHOW MASTER STATUS;
+-----------------------------+----------+--------------+------------------+
| File                        | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-----------------------------+----------+--------------+------------------+
| jan-GE70-0NC-0ND-bin.000001 |      323 |              |                  |
+-----------------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

Fine, now we need to set up Node1 as a slave of Node0:

jan@jan-GE70-0NC-0ND ~/Desktop $ /usr/local/mysql/bin/mysql \
> -u root -S /home/jan/mysql/mm/node1/mysql.5000.sock 

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.0.10-MariaDB-log Source distribution

Copyright (c) 2000, 2014, Oracle, SkySQL Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use test;
Database changed
MariaDB [test]> CHANGE MASTER TO
    -> MASTER_HOST='127.0.0.1',
    -> MASTER_USER='root',
    -> MASTER_LOG_FILE='jan-GE70-0NC-0ND-bin.000001',
    -> MASTER_LOG_POS=323,
    -> MASTER_PORT=4000;
Query OK, 0 rows affected (0.52 sec)

MariaDB [test]> START SLAVE;
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 4000
                Connect_Retry: 60
              Master_Log_File: jan-GE70-0NC-0ND-bin.000001
          Read_Master_Log_Pos: 323
               Relay_Log_File: jan-GE70-0NC-0ND-relay-bin.000002 
                Relay_Log_Pos: 546
        Relay_Master_Log_File: jan-GE70-0NC-0ND-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 323
              Relay_Log_Space: 854
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
1 row in set (0.00 sec)

MariaDB [test]> SHOW MASTER STATUS;
+-----------------------------+----------+--------------+------------------+
| File                        | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-----------------------------+----------+--------------+------------------+
| jan-GE70-0NC-0ND-bin.000001 |      323 |              |                  |
+-----------------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
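Note that although the configuration files set gtid-domain-id and gtid-strict-mode, the CHANGE MASTER above uses a binary log file and position, which is why Using_Gtid shows No. MariaDB 10.0 also allows attaching the slave with GTID positioning instead; a rough sketch (not what was done in this test) would be:

STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='127.0.0.1',
  MASTER_USER='root',
  MASTER_PORT=4000,
  MASTER_USE_GTID=slave_pos;
START SLAVE;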

Similarly, set up Node2 as a slave of Node1:

jan@jan-GE70-0NC-0ND ~/Desktop $ /usr/local/mysql/bin/mysql \
> -u root -S /home/jan/mysql/mm/node2/mysql.6000.sock 

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.0.10-MariaDB-log Source distribution

Copyright (c) 2000, 2014, Oracle, SkySQL Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use test;
Database changed
MariaDB [test]> CHANGE MASTER TO
    -> MASTER_HOST='127.0.0.1',
    -> MASTER_USER='root',
    -> MASTER_LOG_FILE='jan-GE70-0NC-0ND-bin.000001',
    -> MASTER_LOG_POS=323,
    -> MASTER_PORT=5000;
Query OK, 0 rows affected (0.55 sec)

MariaDB [test]> start slave;
Query OK, 0 rows affected (0.01 sec)

MariaDB [test]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 5000
                Connect_Retry: 60
              Master_Log_File: jan-GE70-0NC-0ND-bin.000001
          Read_Master_Log_Pos: 323
               Relay_Log_File: jan-GE70-0NC-0ND-relay-bin.000002
                Relay_Log_Pos: 546
        Relay_Master_Log_File: jan-GE70-0NC-0ND-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 323
              Relay_Log_Space: 854
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
1 row in set (0.00 sec)

MariaDB [test]> show master status;
+-----------------------------+----------+--------------+------------------+
| File                        | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-----------------------------+----------+--------------+------------------+
| jan-GE70-0NC-0ND-bin.000001 |      323 |              |                  |
+-----------------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

And finally, set up Node0 as a slave of Node2, closing the ring:

jan-GE70-0NC-0ND ~ $ /usr/local/mysql/bin/mysql \
> -u root -S /home/jan/mysql/mm/node0/mysql.4000.sock 

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.0.10-MariaDB-log Source distribution

Copyright (c) 2000, 2014, Oracle, SkySQL Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use test;
Database changed
MariaDB [test]> CHANGE MASTER TO 
    -> MASTER_HOST='127.0.0.1', 
    -> MASTER_USER='root', 
    -> MASTER_LOG_FILE='jan-GE70-0NC-0ND-bin.000001', 
    -> MASTER_LOG_POS=323,
    -> MASTER_PORT=6000;
Query OK, 0 rows affected (0.52 sec)

MariaDB [test]> start slave;
Query OK, 0 rows affected (0.01 sec)

MariaDB [test]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 6000
                Connect_Retry: 60
              Master_Log_File: jan-GE70-0NC-0ND-bin.000001
          Read_Master_Log_Pos: 323
               Relay_Log_File: jan-GE70-0NC-0ND-relay-bin.000002
                Relay_Log_Pos: 546
        Relay_Master_Log_File: jan-GE70-0NC-0ND-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 323
              Relay_Log_Space: 854
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
1 row in set (0.00 sec)

MariaDB [test]> show master status;
+-----------------------------+----------+--------------+------------------+
| File                        | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-----------------------------+----------+--------------+------------------+
| jan-GE70-0NC-0ND-bin.000001 |      323 |              |                  |
+-----------------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

Now let’s create one table and add some data to it from the different nodes:

/usr/local/mysql/bin/mysql test -u root \
  -S /home/jan/mysql/mm/node0/mysql.4000.sock \
  -e "create table a(a integer not null primary key) engine=innodb"

/usr/local/mysql/bin/mysql test -u root \
  -S /home/jan/mysql/mm/node0/mysql.4000.sock \
  -e "insert into a values(1)"

/usr/local/mysql/bin/mysql test -u root \
  -S /home/jan/mysql/mm/node1/mysql.5000.sock \
  -e "insert into a values(2)"

/usr/local/mysql/bin/mysql test -u root \
  -S /home/jan/mysql/mm/node2/mysql.6000.sock \
  -e "insert into a values(3)"

After this all nodes are eventually consistent and return the same result set, for example:

MariaDB [test]> select * from a;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set (0.00 sec)
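The output above is from a single node; the same query can be run against each socket to confirm that all three nodes return identical rows (a small shell check using the paths from earlier):

for sock in node0/mysql.4000.sock node1/mysql.5000.sock node2/mysql.6000.sock; do
  /usr/local/mysql/bin/mysql test -u root -S /home/jan/mysql/mm/$sock -e "select * from a"
done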

From this we can conclude that MariaDB is eventually consistent also with a multi-master setup, as long as no conflicting operations are performed.

But what happens if there is a conflict? We can test this scenario by trying to insert a duplicate key into the table a. We try to insert the value 5 on both node0 and node2 so that the final commit commands are issued at about the same time.

MariaDB [test]> set autocommit = 0;
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> insert into a values (5);
Query OK, 1 row affected (0.00 sec)

MariaDB [test]> commit;
Query OK, 0 rows affected (0.27 sec)

Because we used the InnoDB storage engine and autocommit was off, no error message is shown on either client connection at commit time. MariaDB does not support deferred constraint checks, and no error is possible in the following sequence (a shell sketch reproducing this race is given after the list):

  • You insert 5 on the server at node0; it succeeds.
  • Before the insert is replicated to the server at node2, you insert 5 on node2; that also succeeds, because the replication is asynchronous.
  • Then the second insert is replicated from node2 to node0. This causes a conflict due to the duplicate key value 5; the replication thread gets the error and rolls back.
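One way to reproduce this race from the shell is to fire the same insert at node0 and node2 almost simultaneously (a rough sketch; the original test used two interactive sessions with autocommit turned off):

/usr/local/mysql/bin/mysql test -u root \
  -S /home/jan/mysql/mm/node0/mysql.4000.sock \
  -e "insert into a values(5)" &
/usr/local/mysql/bin/mysql test -u root \
  -S /home/jan/mysql/mm/node2/mysql.6000.sock \
  -e "insert into a values(5)" &
wait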

Thus the result set is the following on all three nodes:

MariaDB [test]> select * from a;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 5 |
+---+
4 rows in set (0.00 sec)

This is also eventually consistent, because all servers return exactly the same values and they have executed exactly the same transactions. From the server log we can see the following:

140320 10:48:10 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on
table test.a; Duplicate entry '5' for key 'PRIMARY', Error_code: 1062; handler
error HA_ERR_FOUND_DUPP_KEY; the event's master log
jan-GE70-0NC-0ND-bin.000001, end_log_pos 1009, Internal MariaDB error code:
1062

140320 10:48:10 [Warning] Slave: Duplicate entry '5' for key 'PRIMARY'
Error_code: 1062

140320 10:48:10 [ERROR] Error running query, slave SQL thread aborted. Fix the
problem, and restart the slave SQL thread with "SLAVE START". We stopped at log
'jan-GE70-0NC-0ND-bin.000001' position 897

And on, for example, node0, we can see it with:

MariaDB [test]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 6000
                Connect_Retry: 60
              Master_Log_File: jan-GE70-0NC-0ND-bin.000001
          Read_Master_Log_Pos: 1036
               Relay_Log_File: jan-GE70-0NC-0ND-relay-bin.000002
                Relay_Log_Pos: 824
        Relay_Master_Log_File: jan-GE70-0NC-0ND-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1062
                   Last_Error: Could not execute Write_rows_v1 event on table
                               test.a; Duplicate entry '5' for key 'PRIMARY',
                               Error_code: 1062; handler error
                               HA_ERR_FOUND_DUPP_KEY; the event's master log
                               jan-GE70-0NC-0ND-bin.000001, end_log_pos 1009
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 897
              Relay_Log_Space: 1271
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1062
               Last_SQL_Error: Could not execute Write_rows_v1 event on table
                               test.a; Duplicate entry '5' for key 'PRIMARY',
                               Error_code: 1062; handler error
                               HA_ERR_FOUND_DUPP_KEY; the event's master log
                               jan-GE70-0NC-0ND-bin.000001, end_log_pos 1009
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
1 row in set (0.00 sec)
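At this point the slave SQL thread on node0 is stopped. If we simply wanted to resume replication and accept the data as it is, one manual option (a sketch, not part of the original test) is to skip the failing event and restart the SQL thread on the stopped node:

SET GLOBAL sql_slave_skip_counter = 1;  -- skip the event that caused the duplicate-key error
START SLAVE;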

As seen from the logs, the problem is that replication between the nodes has been stopped. However, there is a way to ignore replication errors caused by application errors, by configuring the servers with --slave-skip-errors=XXX and --slave-exec-mode=IDEMPOTENT. The --slave-exec-mode option controls whether IDEMPOTENT or STRICT mode is used for replication conflict resolution and error checking. IDEMPOTENT mode suppresses duplicate-key and no-key-found errors; this mode is needed for multi-master replication and circular replication. Other valid errors caused by the application can be skipped using --slave-skip-errors.
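Added to the [mysqld] section of each node's my.cnf, this could look roughly as follows (a sketch, not the exact configuration files used in the test):

# suppress replication errors and apply row events idempotently
slave_skip_errors=all
slave_exec_mode=IDEMPOTENT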

To demonstrate, let’s set --slave-skip-errors=all and --slave-exec-mode=IDEMPOTENT on all servers and restart them. We can now try to get the servers into different states (i.e. alternative futures). Execute the following on node0:

MariaDB [test]> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> update a set a = 12 where a = 3;
Query OK, 1 row affected (0.05 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MariaDB [test]> commit;
Query OK, 0 rows affected (0.09 sec)

MariaDB [test]> select * from a;
+----+
| a  |
+----+
|  1 |
|  2 |
|  5 |
|  9 |
| 12 |
+----+
5 rows in set (0.00 sec)

And the following on node1:

MariaDB [test]> update a set a = 15 where a = 3;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MariaDB [test]> commit;
Query OK, 0 rows affected (0.13 sec)

MariaDB [test]> select * from a;
+----+
| a  |
+----+
|  1 |
|  2 |
|  5 |
|  9 |
| 15 |
+----+
5 rows in set (0.00 sec)

From SHOW SLAVE STATUS we do not see any problems:

MariaDB [test]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 6000
                Connect_Retry: 60
              Master_Log_File: jan-GE70-0NC-0ND-bin.000002
          Read_Master_Log_Pos: 653
               Relay_Log_File: jan-GE70-0NC-0ND-relay-bin.000006
                Relay_Log_Pos: 812
        Relay_Master_Log_File: jan-GE70-0NC-0ND-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 653
              Relay_Log_Space: 1418
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
1 row in set (0.00 sec)

But in the server log there is a warning:

140320 13:33:40 [Warning] Slave SQL: Could not execute Update_rows_v1 event on
table test.a; Can't find record in 'a', Error_code: 1032; handler error
HA_ERR_KEY_NOT_FOUND; the event's master log jan-GE70-0NC-0ND-bin.000002,
end_log_pos 626, Internal MariaDB error code: 1032

This situation is not eventually consistent, and MariaDB cannot resolve it automatically. If the application needs eventual consistency, it has to resolve this conflict itself so that all databases are again in the same state, one that is correct according to the application’s rules.
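For this particular divergence, one possible application-level fix (a sketch; it assumes the application decides that 12 is the correct value) is to apply the correction locally on every node whose copy of the row still holds the other value, with binary logging disabled for the session so the fix does not travel around the ring again:

-- run on each node whose copy of the row still shows 15
SET SESSION sql_log_bin = 0;   -- keep this statement out of the binary log
UPDATE a SET a = 12 WHERE a = 15;
SET SESSION sql_log_bin = 1;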

Conclusions

Eventual consistency means that, given enough time during which no changes are performed, all successful updates will propagate through the system and all replicas will be synchronized. At any given moment there is no guarantee that the data accessed is consistent, so conflicts have to be resolved. Under this definition, MariaDB is eventually consistent when replication errors are not ignored: even if replication is stopped by a replication error, eventual consistency holds as long as replication is resumed within some bounded time and all servers then return the same state. If replication errors are ignored, the application must itself correct the cases where two or more servers end up in different states.

Our original question was: Is MariaDB eventually consistent?

Answer: For most master/slave setups where all data is replicated to the slaves, MariaDB is eventually consistent. For multi-master setups where only application-handled error cases are ignored, and where the application makes sure that servers cannot diverge into alternate futures, MariaDB is eventually consistent. However, there are replication configurations in which MariaDB is not eventually consistent.

References

[1] Bailis, P. and Ghodsi, A.: Eventual consistency today: limitations, extensions, and beyond. Communications of the ACM, vol. 56, no. 5, pp. 55-63, May 2013.

[2] Bernstein, P. A. and Das, S.: Rethinking Eventual Consistency. SIGMOD '13, June 22-27, 2013.

[3] Bermbach, D. and Tai, S.: Eventual Consistency: How soon is eventual? In Proceedings of the 6th Workshop on Middleware for Service Oriented Computing (MW4SOC '11), ACM, New York, December 2011.

[4] Brewer, E: PODC keynote. http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf, 2000.

[5] Brewer, E.: Towards Robust Distributed Systems (invited talk). ACM Symposium on Principles of Distributed Computing (PODC), Portland, Oregon, July 2000.

[6] Brewer, E.: CAP twelve years later: How the “rules” have changed. IEEE Computer, vol. 45, no. 2, pp. 23-29, February 2012.

[7] DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P. and Vogels, W.: Dynamo: Amazon’s highly available key-value store. In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP), pp. 205-220, 2007.

[8] Gilbert, S. and Lynch, N.: Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News, vol. 33, no. 2, pp. 51-59, 2002.

[9] Hale, C.: You can’t sacrifice partition tolerance. Available from http://codahale.com/you-cant-sacrifice-partition-tolerance.

[10] Shapiro, M. and Kemme, B.: Eventual Consistency. In Encyclopedia of Database Systems, pp. 1071-1072, 2009.

[11] Terry, D. B., Demers, A. J., Petersen, K., Spreitzer, M.J., Theimer, M.M., Welch, B. B.: Session guarantees for Weakly Consistent Replicated Data. In PDIS, pp. 140-149, 1994.

[12] Thomas, R. H.: A majority consensus approach to concurrency control for multiple copy databases. ACM Trans. on Database Systems, vol. 4, no. 2, pp. 180–209, June 1979.

[13] Vogels, W.: Scalable Web services: Eventually Consistent, ACM Queue, vol. 6, no. 6, pp. 14-16, October 2009.

[14] Vogels, W.: Eventually consistent, Communications of the ACM, vol. 52, no. 1, pp. 40–44, January 2009.