MariaDB & K8s: How to replicate MariaDB in K8s

In the previous blog we saw how to create a statefulset MariaDB application, and in this blog we learned how replication works in MariaDB. Now we will create a replicated statefulset application. For the references used in writing this blog, credit goes to the Kubernetes documentation and to an example from Alibaba Cloud.

Configure replication

To replicate a MariaDB application we are going to create a statefulset whose pods consist of a single init container and one application container. Both containers are based on the MariaDB image.

The init container runs before the application container, and we use it to handle all setup related to replication: it places specific files in specific directories, depending on the pod type (primary or replica). Those directories are volumes shared with the application container. The files come from a configuration map, for which we need to mount an additional volume. The configuration map holds the configuration files for the primary and the replicas, which end up in /etc/mysql/conf.d/, the global directory for configuration files. It also holds the SQL files, which end up in /docker-entrypoint-initdb.d and whose statements are executed during the first startup of each container.

The application MariaDB container is going to use those volumes with fine-tuned configuration for replication as well as a persistent volume for the data directory.
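The role selection described above can be sketched in isolation. This is a minimal, portable shell sketch and not the init container itself: the helper names `ordinal_of`, `role_for_host` and `server_id_for_host` are hypothetical, and the offset of 3000 is taken from the manifest in this blog.

```shell
#!/bin/sh
# Sketch: derive the replication role and server-id from a statefulset pod name.
# All function names here are hypothetical helpers for illustration.

# Print the ordinal suffix of a pod name such as mariadb-sts-2, or fail.
ordinal_of() {
  case $1 in
    *-*) ;;
    *) return 1 ;;
  esac
  suffix=${1##*-}
  case $suffix in
    ''|*[!0-9]*) return 1 ;;
  esac
  echo "$suffix"
}

# Ordinal 0 is treated as the primary, every other ordinal as a replica.
role_for_host() {
  ordinal=$(ordinal_of "$1") || return 1
  if [ "$ordinal" -eq 0 ]; then echo primary; else echo replica; fi
}

# Offset the ordinal so that no pod ends up with server-id=0.
server_id_for_host() {
  ordinal=$(ordinal_of "$1") || return 1
  echo $((3000 + ordinal))
}

role_for_host mariadb-sts-0        # prints: primary
server_id_for_host mariadb-sts-2   # prints: 3002
```

Because a statefulset gives every pod a stable name ending in its ordinal, the same script can run unchanged in every pod and still produce a different configuration for the primary and for each replica.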

Configuration files look like this:

# ConfigMap holding the configuration files for primary/replica and the SQL files for docker-entrypoint-initdb.d
apiVersion: v1
kind: ConfigMap
metadata:
  name: mariadb-configmap
data:

  primary.cnf: |
    [mariadb]
    log-bin                         # enable binary logging
    log-basename=my-mariadb         # used to be independent of hostname changes (otherwise is in datadir/mysql)

  replica.cnf: |
    [mariadb]
    log-basename=my-mariadb         # used to be independent of hostname changes (otherwise is in datadir/mysql)

  primary.sql: |
    CREATE USER 'repluser'@'%' IDENTIFIED BY 'replsecret';
    GRANT REPLICATION REPLICA ON *.* TO 'repluser'@'%';
    CREATE DATABASE primary_db;

  secondary.sql: |
    # We have to know the name of the sts (`mariadb-sts`) and of the
    # service (`mariadb-service`) in advance to build the FQDN.
    # No need to set MASTER_PORT (the default 3306 is used)
    CHANGE MASTER TO
    MASTER_HOST='mariadb-sts-0.mariadb-service.default.svc.cluster.local',
    MASTER_USER='repluser',
    MASTER_PASSWORD='replsecret',
    MASTER_CONNECT_RETRY=10;

# Secret holds information about root password
---
apiVersion: v1
kind: Secret
metadata:
  name: mariadb-secret
type: Opaque
data:
  mariadb-root-password: c2VjcmV0 # echo -n 'secret'|base64

# Headless service
---
apiVersion: v1
kind: Service
metadata:
  name: mariadb-service
  labels:
    app: mariadb
spec:
  ports:
  - port: 3306
    name: mariadb-port
  clusterIP: None
  selector:
    app: mariadb

# Statefulset
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mariadb-sts
spec:
  serviceName: "mariadb-service"
  replicas: 3
  selector:
    matchLabels:
      app: mariadb
  template:
    metadata:
      labels:
        app: mariadb
    spec:
      initContainers:
      - name: init-mariadb
        image: mariadb
        imagePullPolicy: Always
        command:
        - bash
        - "-c"
        - |
          set -ex
          echo 'Starting init-mariadb';
          # List the config map files mounted at /mnt/config-map
          # (they are copied to volumes shared with the main container)
          ls /mnt/config-map
          # Statefulset has sticky identity, number should be last
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          # Copy appropriate conf.d files from config-map to 
          # mariadb-config volume (emptyDir) depending on pod number
          if [[ $ordinal -eq 0 ]]; then
            # This file holds the server configuration for the primary
            cp /mnt/config-map/primary.cnf /etc/mysql/conf.d/server-id.cnf
            # Create the users needed for replication on primary on a volume
            # initdb (emptyDir)
            cp /mnt/config-map/primary.sql /docker-entrypoint-initdb.d
          else
            # This file holds the server configuration for replicas
            cp /mnt/config-map/replica.cnf /etc/mysql/conf.d/server-id.cnf
            # On replicas use secondary configuration on initdb volume
            cp /mnt/config-map/secondary.sql /docker-entrypoint-initdb.d
          fi
          # Add an offset to avoid reserved server-id=0 value.
          echo server-id=$((3000 + $ordinal)) >> /etc/mysql/conf.d/server-id.cnf
          ls /etc/mysql/conf.d/
          cat /etc/mysql/conf.d/server-id.cnf
        volumeMounts:
          - name: mariadb-config-map
            mountPath: /mnt/config-map
          - name: mariadb-config
            mountPath: /etc/mysql/conf.d/
          - name: initdb
            mountPath: /docker-entrypoint-initdb.d
      restartPolicy: Always
      containers:
      - name: mariadb
        image: mariadb
        ports:
        - containerPort: 3306
          name: mariadb-port
        env:
        # Using Secrets
        - name: MARIADB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mariadb-secret
              key: mariadb-root-password
        - name: MYSQL_INITDB_SKIP_TZINFO
          value: "1"
        # Mount volume from persistent volume claim
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/mysql/
        - name: mariadb-config
          mountPath: /etc/mysql/conf.d/
        - name: initdb
          mountPath: /docker-entrypoint-initdb.d
      volumes:
      - name: mariadb-config-map
        configMap:
          name: mariadb-configmap
          #defaultMode: 0544
      - name: mariadb-config
        emptyDir: {}
      - name: initdb
        emptyDir: {}

  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 300M
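
The MASTER_HOST value in secondary.sql follows the DNS naming that a headless service gives to statefulset pods. A small sketch of how that name is composed, assuming the default cluster domain `cluster.local` and the `default` namespace; `pod_fqdn` is a hypothetical helper, not a kubectl feature:

```shell
#!/bin/sh
# pod_fqdn STS ORDINAL SERVICE [NAMESPACE]   (hypothetical helper)
# Compose the stable DNS name of one statefulset pod behind a headless service.
# Assumes the cluster domain is cluster.local; NAMESPACE defaults to "default".
pod_fqdn() {
  echo "$1-$2.$3.${4:-default}.svc.cluster.local"
}

pod_fqdn mariadb-sts 0 mariadb-service
# prints: mariadb-sts-0.mariadb-service.default.svc.cluster.local
```

If the statefulset, the service, or the namespace is renamed, the MASTER_HOST in the configuration map has to change accordingly.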

Test the replication

Apply the configuration file and watch the pods being created:

$ kubectl apply -f mariadb-sts-replication.yaml
configmap/mariadb-configmap created
secret/mariadb-secret created
service/mariadb-service created
statefulset.apps/mariadb-sts created

$ kubectl get pod -w
NAME            READY   STATUS     RESTARTS   AGE
mariadb-sts-0   1/1     Running    0          14s
mariadb-sts-1   1/1     Running    0          8s
mariadb-sts-2   0/1     Init:0/1   0          2s
mariadb-sts-2   0/1     PodInitializing   0          4s
mariadb-sts-2   1/1     Running           0          6s

To debug a specific pod or container, use the following commands:

$ kubectl describe pod mariadb-sts-0
$ kubectl logs mariadb-sts-0 -c init-mariadb

Create data on primary

$ kubectl exec -it mariadb-sts-0 -- mariadb -uroot -psecret
Defaulted container "mariadb" out of: mariadb, init-mariadb (init)
MariaDB [primary_db]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| primary_db         |
| sys                |
+--------------------+
5 rows in set (0.000 sec)
MariaDB [primary_db]> create table my_table (t int); insert into my_table values (5),(15),(25);
Query OK, 0 rows affected (0.031 sec)

Query OK, 3 rows affected (0.004 sec)
Records: 3  Duplicates: 0  Warnings: 0

Check data on replicas

$ kubectl exec -it mariadb-sts-2 -- mariadb -uroot -psecret
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| primary_db         |
| sys                |
+--------------------+
MariaDB [(none)]> use primary_db;
Database changed
MariaDB [primary_db]> show tables;
+----------------------+
| Tables_in_primary_db |
+----------------------+
| my_table             |
+----------------------+
1 row in set (0.000 sec)

MariaDB [primary_db]> select * from my_table;
+------+
| t    |
+------+
|    5 |
|   15 |
|   25 |
+------+
3 rows in set (0.000 sec)
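
Besides comparing the data, the replica's own status can be inspected with `SHOW REPLICA STATUS`. The sketch below is one possible check and not part of the original setup: `replication_ok` is a hypothetical helper that reads the `\G` formatted output on standard input and succeeds only when both replication threads report `Yes`.

```shell
#!/bin/sh
# replication_ok   (hypothetical helper)
# Reads the output of SHOW REPLICA STATUS\G on stdin and returns success
# only if both the I/O thread and the SQL thread are running.
replication_ok() {
  status=$(cat)
  echo "$status" | grep -Eq '^ *Slave_IO_Running: Yes' &&
    echo "$status" | grep -Eq '^ *Slave_SQL_Running: Yes'
}

# Usage against the cluster from this blog (not executed here):
#   kubectl exec mariadb-sts-2 -- mariadb -uroot -psecret \
#     -e 'SHOW REPLICA STATUS\G' | replication_ok && echo "replica healthy"
```

A replica that cannot reach the primary typically shows a value other than `Yes` for the I/O thread, so the check fails in that case.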

Scale up

$ kubectl scale sts mariadb-sts --replicas=4
statefulset.apps/mariadb-sts scaled

$ kubectl get pod -w
NAME            READY   STATUS    RESTARTS   AGE
mariadb-sts-0   1/1     Running   0          2m52s
mariadb-sts-1   1/1     Running   0          2m46s
mariadb-sts-2   1/1     Running   0          2m40s
mariadb-sts-3   0/1     Pending   0          0s
mariadb-sts-3   0/1     Pending   0          0s
mariadb-sts-3   0/1     Pending   0          2s
mariadb-sts-3   0/1     Init:0/1   0          2s
mariadb-sts-3   0/1     PodInitializing   0          5s
mariadb-sts-3   1/1     Running           0          7s

Check newly created replica

$ kubectl exec -it mariadb-sts-3 -- mariadb -uroot -psecret
MariaDB [(none)]> use primary_db;
MariaDB [primary_db]> show tables;
+----------------------+
| Tables_in_primary_db |
+----------------------+
| my_table             |
+----------------------+
1 row in set (0.000 sec)
MariaDB [primary_db]> select * from my_table;
+------+
| t    |
+------+
|    5 |
|   15 |
|   25 |
+------+
3 rows in set (0.000 sec)

Try to insert new data on primary

MariaDB [primary_db]> insert into my_table values (40),(45);

Check replica mariadb-sts-3

MariaDB [primary_db]> select * from my_table;
+------+
| t    |
+------+
|    5 |
|   15 |
|   25 |
|   40 |
|   45 |
+------+
5 rows in set (0.000 sec)

We can also check the PVCs

$ kubectl get pvc
NAME                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datadir-mariadb-sts-0   Bound    pvc-c1676027-6c75-473b-9b46-3d9a7d370fdc   300M       RWO            standard       31m
datadir-mariadb-sts-1   Bound    pvc-7f969265-3d8f-4677-950b-271ea670321e   300M       RWO            standard       30m
datadir-mariadb-sts-2   Bound    pvc-d116e494-5078-46ec-abcb-864fa8ae6b59   300M       RWO            standard       30m
datadir-mariadb-sts-3   Bound    pvc-00aca08a-7a5e-459a-8ae1-a854c4171a27   300M       RWO            standard       28m

Scale down

$ kubectl scale sts mariadb-sts --replicas=2
statefulset.apps/mariadb-sts scaled

$ kubectl get pod -w
mariadb-sts-3   1/1     Terminating       0          78s
mariadb-sts-3   0/1     Terminating       0          79s
mariadb-sts-3   0/1     Terminating       0          79s
mariadb-sts-3   0/1     Terminating       0          79s
mariadb-sts-2   1/1     Terminating       0          4m4s
mariadb-sts-2   0/1     Terminating       0          4m5s
mariadb-sts-2   0/1     Terminating       0          4m5s
mariadb-sts-2   0/1     Terminating       0          4m5s

Note that scaling down does not remove the PVCs.
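
The PVC names follow the pattern seen in the `kubectl get pvc` output: claim template name, statefulset name, ordinal. As a sketch, the claims left behind by a scale-down can be listed like this; `leftover_pvcs` is a hypothetical helper, and deleting the claims is an optional step that is not part of the original walkthrough.

```shell
#!/bin/sh
# leftover_pvcs CLAIM STS OLD_REPLICAS NEW_REPLICAS   (hypothetical helper)
# Print the PVC names that belonged to the pods removed by a scale-down.
leftover_pvcs() {
  i=$4
  while [ "$i" -lt "$3" ]; do
    echo "$1-$2-$i"
    i=$((i + 1))
  done
}

leftover_pvcs datadir mariadb-sts 4 2
# prints:
#   datadir-mariadb-sts-2
#   datadir-mariadb-sts-3

# To delete them (this destroys the data of those replicas; not executed here):
#   kubectl delete pvc $(leftover_pvcs datadir mariadb-sts 4 2)
```

If the claims are kept, scaling up again reattaches each new pod to the existing claim with the same ordinal instead of starting from an empty data directory.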

Conclusion

This blog showed how to set up MariaDB replication in K8s. You are welcome to chat about it on Zulip.
