Dynamic replication filters — our wheel will be square!
This is both hilarious and sad. The new MySQL 5.7 milestone release presents a new feature — replication filters are now dynamic. This is a great and long awaited feature, no doubt about it.
In short, for years MySQL slaves could filter the incoming stream of replication events based on the database or table name these events were applicable to. These filters were configured using the my.cnf
file (or command-line), in particular with the following variables:
replicate_do_db replicate_ignore_db replicate_do_table replicate_ignore_table replicate_wild_do_table replicate_wild_ignore_table
Naturally, users wanted to be able to change the values of these options without having to restart the server. And eventually Davi Arnaut implemented it. MariaDB got it as a contribution back in version 5.5.22. Since that release these options in MariaDB have been dynamic system variables and one could change them at will. For example,
set global replicate_do_db = "mydb1,mydb2"; set global replicate_wild_ignore_table = "mydb1.local%";
Now, I understand that sometimes we might implement a certain useful feature first. And I understand that in that case Oracle might want to have it, while not looking like they’re following MariaDB. Of course not! The feature must be totally original. With completely incompatible syntax. Most recent examples would be EXPLAIN FOR CONNECTION
in MySQL 5.7.2 about a year after SHOW EXPLAIN
was introduced in MariaDB, and multi-source replication using “channels”, while MariaDB has already been using named slaves for quite a while.
But really, there should be some logic in it. Admittedly, EXPLAIN FOR CONNECTION
isn’t worse than SHOW EXPLAIN
. And multi-source “channels” look just fine, syntax-wise. Still, if there is a command-line option, there (almost always) is a corresponding system variable, visible in SHOW VARIABLES
. One (almost always) changes the value of the option by assigning a new value to the corresponding variable. There is only one way of doing it.
Alas, we have taken this way in MariaDB. So, by strictly adhering to the NIH principle (and, I suspect, internal Oracle policies), MySQL developers simply had to invent a different way of changing server options. Lo and behold, CHANGE REPLICATION FILTER
statement. Examples are:
change replication filter replicate_do_db=(mydb1,mydb2); change replication filter replicate_wild_ignore_table=('mydb1.local%');
Well done. Common sense? Nope, not in this case.
I’ve already watched this movie and I didn’t
like it. It was called “the browsers war”. Please Oracle, don’t turn
databases into 90s years browsers. A war is never between 2 companies – it’s always against users.
Thanks for pointing that out Sergei, very useful. What do you think about being able to set filters for different repliaction filters individually for each master on a multi-source slave?
Sure. In MariaDB it works. The syntax is what you may expect (same as for named key caches):
set @@connection_name.replication_do_table=…
and the same from my.cnf or command line.
In MySQL it doesn’t seem to work. Admittedly, I didn’t try, but from looking at the source code I don’t see it being implemented. I suppose they’ll eventually do something like
change replication filter for channel xxx replication_do_table=(…)
This will inconsistent data…. in mysql.
So you think that Oracle looked at what MariaDB did and said “let’s do something completely different?” As opposed to just using the same code MariaDB did, which they’ve done before (MariaDB patches have made it into Oracle’s MySQL).
Your assumption is just that. Very likely it has more to do with the fact that replication filters are not your usual variable. If you want more than one value in MySQL, you have to set it more than once in the configuration file. For example, to replicate everything in the foo and bar databases:
replicate_wild_do_table=foo.%
replicate_wild_do_table=bar.%
Now let’s say you wanted to change this to replicate the foo and baz databases. In MariaDB, it looks like you can use commas. But not in MySQL. So they changed how dynamic setting of these were done, to accomodate the fact that this is a multi-value variable – instead of changing SET to try to parse out all the commas, etc.
As for EXPLAIN FOR CONNECTION vs. SHOW EXPLAIN, it seems obvious to me. Think of the command “SHOW EXPLAIN FOR 1234;” What does that mean? show explain for a number? “EXPLAIN FOR CONNECTION 1234;” is more obvious that it’s showing some kind of information for connection 1234.
I guess it’s lose-lose, though – if Oracle would have used MariaDB’s patch, people would say “Oracle’s not doing anything original!” In this case it looks like they’re trying to make things a bit better, and make more sense. If you think Oracle is a multi-million dollar company because they waste time trying to spite other companies, you’re wrong. If the MariaDB code was good and they wanted that feature, they would have used it, like they have with other features.
Nice try 🙂
There are other multi-value command-line options, they are still server variables (e.g. @@debug).
Of course, I don’t think that Oracle is a multi-million dollar company, *because* they waste time trying to spite other companies. They’re a multi-million dollar company for many different reasons. Which still doesn’t mean, they cannot waste time trying to spite other companies.
And Oracle can *not* use MariaDB code — they cannot just take a GPLv2 code and put it in MySQL, one needs to sign a Contributor Agreement first. You know that.
Once it’s GPLv2, they can take it. The point of the Contributor Agreement is to give Oracle the right to distribute it, which GPLv2 already does.
The replicate options aren’t exactly the same in MySQL and MariaDB. In Maria, they are proper server variables, in MySQL they are not. For example:
MariaDB [(none)]> select @@version; show variables like ‘replicate_wild%’;
+——————–+
| @@version |
+——————–+
| 5.5.29-MariaDB-log |
+——————–+
1 row in set (0.00 sec)
+—————————–+——-+
| Variable_name | Value |
+—————————–+——-+
| replicate_wild_do_table | |
| replicate_wild_ignore_table | |
+—————————–+——-+
2 rows in set (0.00 sec)
As expected, they show up as a true server variable. Now let’s look at MySQL 5.6 (although MySQL 5.1 and 5.5 act the same):
mysql> select @@version; show variables like ‘replicate_wild%’;
+————+
| @@version |
+————+
| 5.6.12-log |
+————+
1 row in set (0.00 sec)
Empty set (0.01 sec)
mysql>
——–
So you see, it’s different in MySQL. Not to spite MariaDB, just because it’s actually different.