MariaDB and MySQL’s optimizer does not recognize tautologies and contradictions

In logic, a tautology is an expression which is always TRUE, no matter what values are assigned to variables (col = col). The opposite of tautologies are contradictions: a contradiction is an expression which is always FALSE (col col).

I wondered if MariaDB’s optimizer recognizes tautologies and contradictions, so I played with some stupid queries. The procedure I used to verify the tautologies is simple:


As you probably know, if you execute SHOW WARNINGS after EXPLAIN EXTENDED, it returns the query internally rewritten by MariaDB. This way, it’s easy to verify if a tautology/contradiction has been recognized: WHERE <tautology> should be replaced with: WHERE 1, while WHERE <contradiction> should be replaced with: WHERE 0. Just to be sure, I check if EXPLAIN says we are going to have a full table scan… just to verify a tautology.

It turns out that the optimizer recognize very few tautologies. For example:

col = col
col != col
NOT NOT col = col
(col = val) AND (col  val)

(col = val) OR NOT (col = val)

Tuatologies and contradictions that are NOT recognized: (some of them are partly optimized, however)

col IN (col, ...)
col IN (val1, val2) = col NOT IN (val1, val2)
col IN (val1, val2) = (col = val1 OR col = val2))
(col > val) AND (col < val)
(col = val) = (val = col)
(col AND val) = (val AND col) // etc
(col = val) AND NOT (col = val)
(col = val) OR (col <> val)
(col IS NULL) OR (col IS NOT NULL)
(col = val) XOR NOT (col = val)
(col = val1) = (col = val1) AND (col = val2)

I could have tried more tautologies/contadictions, but if these one are not recognized, I can reasonably say that more complex ones are not recognized.

Please, note that I didn't try these in any other DBMS. So I am NOT saying that other software projects has a superior optimizer. Probably they have not. However, I find these results quite amazing. It is probably rare for a DBMS to meet one of these tautologies in a query. But, I guess that sometimes it happens… and it's really hard for a user to guess that col IN (col, ...) is not properly optimized.

Henrik Ingo commented this post on Facebook, and I wish to share his thoughts with everyone:
It’s easy for us app developers to think that a RDBMS contains some form of AI or general purpose logic engine that will solve our “obvious” where clauses. I’ve had similar surprises, but in real world cases, too. It’s good for app developers to understand that the optimizations that exist in a RDBMS are there only because someone actually spent lot of time implementing them.