Per-table unique FOREIGN KEY constraint names – new feature in MariaDB 12.1

For historical reasons, the handling of FOREIGN KEY constraints in MariaDB Server is somewhat peculiar. Unlike CHECK constraints, which are being enforced at the query execution layer for all tables, FOREIGN KEY constraint definitions will simply be ignored unless the underlying storage engine supports them.

The storage engine is not only responsible for enforcing referential integrity between tables, but also storing the constraint definitions. Currently, InnoDB is the only storage engine that supports foreign key constraints, between persistent, non-partitioned tables.

Partitioning is implemented by a storage engine wrapper that makes each partition or subpartition appear as a separate table in the underlying storage engine. This is one of the reasons why FOREIGN KEY constraints do not work between partitioned tables. This design would technically allow constraints in a partitioned child table to refer to a non-partitioned parent table. Internally, the storage engine could manage this as constraints between pairs of individual tables like child#P#p1 and parent, child#P#p2 and parent, and so on. Implementing references to a partitioned parent table would be a lot trickier, because then a single child table or partition would refer to a non-overlapping union of records in a number of parent partitions.

Problem: Duplicate Constraint Names

There is an annoying implementation detail. Consider the following example:

create table t1(a int, b int, key(a),key(b))engine=innodb;

create table t2(a int, b int, key(a),key(b))engine=innodb;

alter table t2 add constraint c foreign key (b) references t1(a);

alter table t1 add constraint c foreign key (b) references t2(a);

The second ALTER TABLE statement would fail because of a duplicate constraint name c. The names of referential constraints must be unique within a SCHEMA or DATABASE.

We would hit this limitation also when allowing a REFERENCES clause in a partitioned table. Even if InnoDB automatically generated constraint names, these names would be displayed by SHOW CREATE TABLE. It would be very awkward if different names would be displayed for each partition, unless we came up with a (non-standard) way to define constraints for individual partitions.

We encountered this limitation also when making CREATE OR REPLACE TABLE atomic. Internally, this operation consists of creating, renaming, and dropping a table (if it existed). At one point, two copies of the table could exist. We could hit a duplicate key error in the InnoDB data dictionary if a named FOREIGN KEY constraint exists.

Let us try to understand how this problem was introduced and how it can be solved.

How InnoDB Stores Constraints

The InnoDB system tablespace includes a number of tables, which define the data dictionary from the InnoDB point of view. There is significant overlap between this and the common data dictionary that is defined in the tablename.frm files, in particular regarding column and index definitions, which InnoDB stores in the core dictionary tables SYS_TABLES, SYS_COLUMNS, SYS_INDEXES, and SYS_FIELDS. InnoDB is indifferent to any DEFAULT values, CHECK constraints or COMMENT, and the common dictionary does not reflect any hidden InnoDB columns or FOREIGN KEY constraints.

The FOREIGN KEY constraints are stored in the two tables SYS_FOREIGN and SYS_FOREIGN_COLS, which are defined in the InnoDB internal SQL dialect as follows:

CREATE TABLE SYS_FOREIGN(ID CHAR, FOR_NAME CHAR, REF_NAME CHAR, N_COLS INT);

CREATE UNIQUE CLUSTERED INDEX ID_IND ON SYS_FOREIGN (ID);

CREATE INDEX FOR_IND ON SYS_FOREIGN (FOR_NAME);

CREATE INDEX REF_IND ON SYS_FOREIGN (REF_NAME);

CREATE TABLE SYS_FOREIGN_COLS(ID CHAR, POS INT, FOR_COL_NAME CHAR, REF_COL_NAME CHAR);

CREATE UNIQUE CLUSTERED INDEX ID_IND ON SYS_FOREIGN_COLS (ID, POS);

For InnoDB, the general advice is to define the PRIMARY KEY as short as possible. Here, we have PRIMARY KEY(ID) on SYS_FOREIGN. The column name is nice and short, but the contents is a variable-length text string! In SYS_FOREIGN_COLS we have an even longer PRIMARY KEY that comprises a reference to SYS_FOREIGN.ID and a numeric counter.

Were we to design this from scratch, we’d better define the PRIMARY KEY of SYS_FOREIGN to start with the 64-bit SYS_TABLES.ID of the table that contains the REFERENCES clause. This would also make the column REF_NAME redundant and simplify any RENAME of the referencing table. It could have made sense to add a per-table 16-bit constraint identifier. The primary key could thus have been as short as 64+16 bits (10 bytes). In SYS_FOREIGN_COLS the key would additionally have contained the POS field, which could have been shortened to 16 bits. We could have defined a non-indexed NAME column in SYS_FOREIGN, just like we have one in SYS_INDEXES and SYS_COLUMNS. Yes, these names must be unique within a table, but that can easily be enforced in the data dictionary cache.

Changing file formats is always a challenge, because we need to carefully consider and test upgrade and downgrade operations. This is why there hasn’t been any significant change in the format of the InnoDB dictionary tables, and only small tweaks have been done.

MariaDB is derived from MySQL. Originally, the SYS_FOREIGN.ID was generated internally by InnoDB, and it was rather short: an ASCII representation of a global 64-bit counter in two parts: 0_0, 0_1, 0_2, …, 0_4294967295, 1_0, 1_1, …. One had to use this generated identifier when dropping constraints. As noted in MySQL Bug #2167, which was filed in December 2003, this obviously did not work with logical replication, because each replica could generate a different name for the same constraint. As discussed in that ticket, a simple solution would be to compose the SYS_FOREIGN.ID from the schema name, the table name, and the constraint name.

Unfortunately, some corners were cut, and no proper effort was made to separate the table and constraint names. Only the schema name is nicely separated from the table name by a forward slash character /. This character is not allowed in either schema or table names. For anonymous constraints, we have kind of per-table constraint names, which are constructed from schema/table_ibfk_1. These constraint names were not consistently adjusted on RENAME operations, and also a user could specify a ‘misleading’ constraint name T_ibfk_1 in a table U.

In retrospect, using a character that is not allowed for table or schema names, such as /, would have been a better choice. That is, the SYS_FOREIGN.ID could have been composed as schema/table/constraint. Generated constraint names could have been simple decimal numbers starting from 1.

This is almost what we ended up with in MDEV-28933. Because one could be upgrading from a database where a constraint name starts with /, we ended up choosing another impossible separator: the invalid UTF-8 sequence 0xFF. That is, the internal format of SYS_FOREIGN.ID will be one of the following:

  1. Generated names 0_0, 0_1, 0_2, … if the table was created before MySQL 4.0.18
  2. schema/constraint until MariaDB 12.0; generated names table_ibfk_1, …
  3. schema/table\xffconstraint starting with MariaDB 12.1; generated names 1, 2, …

When we display FOREIGN KEY constraint names, we always try to strip a prefix ending in \xff or /. Likewise, in DDL operations that involve constraints, the old internal names will be first stripped to this format, before schema/table\xff is prepended. This allows a smooth upgrade.

While we do not recommend or guarantee downgrades between major versions of MariaDB Server, it should not be a huge problem in this case. The constraint names would typically be wrongly displayed as schema/table; the invalid UTF-8 sequence would be treated as a string terminator. It should be impossible to drop individual constraints of a table, but adding constraints (using the old name format) should work. Any DDL operations such as DROP or RENAME operations should cover everything needed, based on the SYS_FOREIGN.REF_NAME column.

Conclusion and Future Work

Because of our goal to allow simple in-place upgrades of MariaDB Server, we are facing some old design limitations. With some creativity, it is possible to evolve things without forcing drastic measures, such as requiring users to perform a logical dump and restore.

In the long term, both the storage and the processing of the FOREIGN KEY constraints had better be lifted from the storage engine to the SQL layer. This would not only allow the constraints to work across more types of tables but also enable integration with the query optimizer. This would also allow DEFERRED constraints (MDEV-26097) and lift a limitation regarding indexes. InnoDB deliberately refuses constraints that cannot be enforced by a key lookup.

MariaDB 12.1 release is coming soon — your feedback directly shapes its future. Try the preview, test it, and help us make 12.1 even better!