When backups fail: A mysql binlog race condition

Today I ran into my first MySQL binlog race condition: The initial problem was quite simple: A typical MySQL master->slave setup with heavy load on the master and nearly no load on the slave, which only serves as a hot fallback and job machine, showed differences on the same table on both machines. The differences showed up from time to time: entries that have been deleted from the master were still on the slave.

After several investigations I started examining the MySQL binlog from the master – a file containing all queries that will be transferred to the slave (and executed there if they don’t match any ignore-db-pattern). I grepped for ids of rows that have not been deleted on the slave as I’s interested if the DELETE statement was in the binlog. In order to read a binlog file just use „mysqlbinlog“ and parse the output with grep, less or similar. To my surprise I found the following entries:

$ mysqlbinlog mysql-complete-bin.000335 | grep 1006974
DELETE FROM `tickets` WHERE `id` = 1006974
SET INSERT_ID=1006974/*!*/;

As „SET INSERT_ID“ is a result of an INSERT statement it was clear, that MySQL wrote the INSERT => DELETE statements in the wrong order. As INSERT/DELETE sometimes occur quite fast after each other and several MySQL  threads are open in the same MySQL server, you might run into a rare INSERT/DELETE race condition as the master successfully executes them, while the slave receives them in the wrong order.

As a comparision this is a normal order of INSERT and DELETE (please note that the actual INSERT is not displayed here):

$ mysqlbinlog mysql-complete-bin.000336 | grep 1007729<br />SET INSERT_ID=1007729/*!*/;<br />DELETE FROM `tickets` WHERE `id` = 1007729<br />

Actually this all so far. Lesson learned for me: A mysql binlog might get you into serious trouble when firing a MySQL server with INSERT and DELETE on the same rows as the linear binlog file can fail the correct statement order, which might be a result of different MySQL threads and an unclean log behavior. I have not yet found a generic solution for the problem but I am looking forward to it.

5 Gedanken zu “When backups fail: A mysql binlog race condition

  1. use of row based replication helps avoid this issue. statement based or mixed mode can be prone to this sort of thing, particularly during times of heavy load.

  2. Don’t DELETE 😉

    Serious, use a flag in your SQL table model, isDeleted or something, a bit field or boolean. When you do your normal „housecleaning“ (during the nights, while some importan cronjobs are running), you fire a „delete from foobar where isDeleted=true“…you can then avoid having „inserts and deletes“ in your binlog
    in the wrong order.

    @*8D:
    Row based replication can help you here, that’s right, but it can also be a performance killer.
    Let’s assume: You have a database with a table of 50mil or more entries…delete from where isDeleted=true, results in a statement based replication in exactly one statement..while in a row based replication it would hit hundreds of thousands of data in your binlog.

    The good thing of MySQL 5.1 is, that you can decide what you want to do…mixed mode can help..if you know that deletes should be avoided (This we’re telling as well to our developers, who don’t believe us anyways ;))

    Regards,

    \sh

  3. Pingback: GUS CLEMENS: Know what to get when planning for wedding | Wedding Planning Guide

  4. Joah hier sind wir nun… habe bei Google nach Kleinanzeigen gesucht und bin bei dir gelandet. Will einen Fernseher verkaufen (antik) und habe es bereits auf http://dimida.de probiert… aber da ist anscheinend nichts los. Jetzt gibt es noch ebay aber da kann man niemanden vertrauen. Daher meine Frage: Wo verkaufe ich am besten meine alten Sachen?

Schreib einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht. Erforderliche Felder sind markiert *