My 2019 Database Wishlist

Last year I published my 2018 Database Wishlist, which I recently revisited to check what happened and what didn’t. Time for a 2019 wishlist.

I am not going to list items from my 2018 list, even if they didn’t happen or they partially happened. Not because I changed my mind about their importance. Just because I wrote about them recently, and I don’t want to be more boring than I usually am.

External languages for MySQL and MariaDB

MariaDB 10.3 implemented a parser for PL/SQL stored procedures. This could be good for their business, as it facilitates migration from Oracle. But it isn’t an answer to the community’s request to support external languages, like C or Python.

Oracle theoretically allows the use of GraalVM-supported languages in MySQL. But unspecified legal problems seem to stop them from releasing this. In any case, unfortunately, this feature would only be available on GraalVM.

External languages are really desirable. Antony Curtis wrote a patch for this years ago, but neither MySQL nor MariaDB included it. Roland Bouman wrote mysqlv8udfs, a UDF to run JavaScript code via Google Chrome’s JavaScript engine, but it was never included in MySQL or MariaDB.

Please, Oracle and MariaDB: do it. Don’t just start another discussion in a mailing list, do it for real.

ClickHouse without ZooKeeper

ClickHouse needs ZooKeeper to set up a cluster. ZooKeeper is based on the JVM and it’s hard to use. Having to use it because there is no alternative is quite annoying.

In their repo, there is a feature request to support Consul instead of ZooKeeper. To recap: the main problem is that they use some unique ZooKeeper features, but they plan to refactor the ZooKeeper client library, and after that it is possible that they will implement this feature.

I wrote about Consul some time ago, when I was a Percona consultant. I consider it a very good solution. The other alternative is etcd. It’s worth noting that, despite it being widely used, the CNCF still considers it an incubating project.

Sphinx: implement JOINs

Sphinx seems to be forgotten by many. It is not trendy anymore. There have not been cool headlines about Sphinx on tech sites for years. So, is it slowly dying? No, it’s simply stable.

There are more trendy alternatives nowadays, yes. Like ElasticSearch and Solr. So why do I care about Sphinx? Well, first, it’s quite KISS (keep it simple, stupid!). It doesn’t have lots of features just because it was theoretically possible to implement them. Sorry for repeating myself, but basically… it’s stable. Second, it supports a subset of SQL, it’s kind of relational, and its language is compatible with MySQL. It could have invented a new model and a new language supported by nothing else in the world, but its developers (unlike many other modern database vendors) realised that it wouldn’t be a good thing.

But it lacks JOINs. I don’t want to run complex JOINs on it – it will never be the right tool for that. Still, JOINs would open more opportunities for Sphinx users. The developers have always mentioned JOINs as something “currently missing”, so I still hope to see them implemented.

PostgreSQL: setup a real bug tracker

As I already mentioned, PostgreSQL doesn’t have a bug tracker. There is a mailing list. One problem this surely causes is that structured searches are impossible – for example: get a list of confirmed bugs in version 11 involving foreign keys. I suspect there is another major problem: this could prevent some people from reporting bugs.

PostgreSQL is a great project, but please take bugs seriously.

Percona, be more conservative about your unique features

Every major version removes some features from the previous one. I understand why you do that, and I appreciate it. You want to keep the delta between Percona Server and MySQL as small as possible. If a feature is not used by enough customers, it is an unjustified difference to maintain. I know that this policy is important to keep your fork’s quality high.

Yet, I feel that you tend to remove too much stuff. Suppose I start to use a feature now and after some time I have to stop. If I think about it, I wish you hadn’t implemented it at all in the first place. So my point is not necessarily “maintain more features” – it could just as well be “develop fewer features”.

Open source databases and cloud providers

We have read about some open source databases going proprietary because they spend money to innovate, while cloud providers simply take their customers away without giving anything in return. In particular, Amazon modifies open source software to create its own products, and sells them without paying the original vendors.

What can I say about Amazon… nothing, just a sarcastic “thank you”.

But I have something to ask database vendors. Do you know what MaxScale is? If not, it’s because this strategy of going proprietary is not as smart as you may think. Users will simply move away and forget you. Some of those users are paying customers. But even many of those who never paid you did contribute to your product – and indirectly, to your company’s income. How? By reporting bugs, by writing technical content about your products, by talking about how good your software is, etc. I don’t know of any estimate of how much these things contribute to a company’s economy, but I would be very interested in reading one.

Federico

SQL Common Errors: confusing WHERE and ON

You probably know the ON clause of SQL JOINs. It is a condition that determines which rows from a table are related to a row from another table.

Conceptually, this is extremely different from the WHERE clause, which determines which row combinations will be returned to the client. However, sometimes a condition can be moved from the WHERE clause to the ON clause, because it will still exclude the same rows. But please keep on reading, because the TL;DR of this post is: don’t do that.

Actually, why would one move conditions from WHERE to ON? Because many developers think that this is an optimization. They believe that the ON clause is executed before the WHERE clause, so moving the conditions will cause them to be evaluated at an earlier stage, avoiding some useless work.

Now, one may show some examples where moving a condition actually speeds up a query. This can happen in some uncommon cases, and the root cause is that the optimizer is not able to find the optimal execution plan. Don’t focus on the ON clause. Just focus on building proper indexes, or on finding out why the optimizer is not using them. But this is a wider topic that will not be covered here.

ON means not WHERE

Let me show you an example. I used MariaDB here, but I expect the same results with all DBMSs I know.

SELECT a.full_name, b.title
    FROM author a
    LEFT JOIN book b
        ON a.id = b.author_id
    WHERE b.title LIKE 'the%'
;
+------------+-----------------------------+
| full_name  | title                       |
+------------+-----------------------------+
| H.G. Wells | The invisible man           |
| H.G. Wells | The Island of Doctor Moreau |
+------------+-----------------------------+
2 rows in set (0.000 sec)

We get a list of the books whose title starts with “the” (case-insensitive), and their authors. As you can see, the ON clause is used to join the rows correctly, and the WHERE clause determines which rows are to be filtered in.

In this database, I also inserted another author: Edgar Allan Poe. But he doesn’t appear, because none of his books has a title starting with “the”.

But what if I move the WHERE condition to ON?

SELECT a.full_name, b.title
    FROM author a
    LEFT JOIN book b
        ON a.id = b.author_id
        AND b.title LIKE 'The%'
;
+-----------------+-----------------------------+
| full_name       | title                       |
+-----------------+-----------------------------+
| H.G. Wells      | The invisible man           |
| H.G. Wells      | The Island of Doctor Moreau |
| Edgar Allan Poe | NULL                        |
+-----------------+-----------------------------+
3 rows in set (0.000 sec)

Edgar Allan Poe appears! Why? SQL is stupid!? MariaDB is buggy?! These are typical reactions, but… no, the query is wrong.

This is a LEFT JOIN. Rows from the left table are returned even if they never match the ON clause.

With an INNER JOIN, you would actually not notice any difference. But what if you are looking for books whose title starts with “the” OR whose author’s name starts with a letter < “M”? Try it yourself, as an exercise. You will see that you cannot always fix the problem by using an INNER JOIN: you need to use ON properly :-) One possible solution is sketched below.
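
If you want to check your work: here is one way to write that query, keeping the join condition in the ON clause and both filters in the WHERE clause (same tables as above; the exact output of course depends on your data):

SELECT a.full_name, b.title
    FROM author a
    LEFT JOIN book b
        ON a.id = b.author_id
    WHERE b.title LIKE 'the%'
        OR a.full_name < 'M'
;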

Federico

Revisiting my 2018 Database Wishlist

It is December and 2018 is going to end. In January, when it had just started, I wrote my 2018 Database Wishlist. Probably next January I’ll write another list. But first, it makes sense to review the last one. If some of my wishes actually happened, I really should know – I don’t want to miss something nice, or forget topics that I considered interesting less than one year ago. Well, let’s do some copy/paste and some googling…

More research on Learned Indexes

I’m not sure if more research actually happened – I hope so, and I hope that we’ll see its results at some point. At least, it seems that the topic was not forgotten: it was mentioned at the Artificial Intelligence conference in May, and at a Stanford Seminar in October.

It’s worth noting that Wikipedia still doesn’t have a page for Learned Index.

Progress on using ML for database tuning

The OtterTune website didn’t publish anything new – it just removed some previously available information. It’s now possible to register as a tester for a beta version, so it is reasonable to think that there has been some progress. The repository is actually active. No new public articles though, which is a pity. I’ll definitely stay tuned.

More research on stored functions transformation

I can’t find anything newer than the original paper, except for a post from Microsoft Research. I found no evidence that anyone not working at Microsoft considers this research interesting.

Galera 4

MariaDB 10.4 plans mention that Galera 4 will be included. But this could be just another optimistic hypothesis, so basically… I still see nothing new.

Transactional DDL in the MySQL ecosystem

MDEV-4259 – transactional DDL – is still open, and no fix version was set. They instead dedicated resources to MDEV-11424 – Instant ALTER TABLE of failure-free record format changes.

Oracle doesn’t say much about what will be in future MySQL versions. However, someone from Oracle said that atomic ALTER TABLE is a foundation for transactional DDL, which could mean that they’re closer to that than MariaDB. So, let’s hope we’ll see this feature in the next version – but there was no claim from them about that.

Engines, engines, engines

The storage engines I mentioned are still available for MySQL and MariaDB, or come with Percona Server. Oracle still hasn’t killed MyISAM. However, no, I didn’t see any big changes in SPIDER.

More progress on Tarantool and CockroachDB

Apparently, Tarantool 2 (still not stable) fixed a lot of bugs and improved its SQL support. This includes removing ON CONFLICT REPLACE for UNIQUE indexes, which is also problematic in MySQL.

CockroachDB actually added a lot of features. Amongst other things, I want to mention the cost-based optimizer and CDC (change data capture). The IMPORT command allows importing dumps from MySQL and PostgreSQL, as well as from CockroachDB itself and from CSV files.

Final thoughts

Some things simply didn’t happen.

Learned index structures and machine learning to tune database performance apparently weren’t forgotten, so hopefully we’ll see something interesting in the future.

Tarantool and CockroachDB show interesting enhancements. MySQL third-party storage engines didn’t introduce anything fancy, but keep on doing a good job.

Federico

On Percona Community Blog

I liked Percona Community Blog from the beginning. First of all, the idea is great. There is no other community blog for the MySQL ecosystem.

Well, Oracle has its own planet.mysql.com – and I have to say, they behave correctly: as far as I know, they never censored posts about MariaDB and Percona Server, nor opinions that they don’t like. I wrote harsh criticism of Oracle, sometimes using strong terms (“another dirty trick”), but they never censored me. I like to be fair regardless of who or what I’m talking about, so this is a good time to spend some good words about them. Not for the first time, anyway. That said, their blogroll only advertises a very small number of blogs. Very good ones of course (except for mine?), but this has the inevitable side effect of obscuring the rest of the world. If John Doe writes an enlightening post about MySQL, I’ll never read it, because everything I need to know appears on Planet MySQL.

Percona Community Blog may have the same side effect… or maybe not, or at least the effect could be weaker. I saw outstanding content by my well-known friend JF, yes. But I also saw articles by people I don’t know and never saw on Planet MySQL. So I believe PCB is proving itself quite inclusive.

I started to publish some content there. First, I used it to promote my talk at Percona Live Europe in Frankfurt, MariaDB System-Versioned Tables. Then I published an article on the same topic, Some notes on MariaDB system-versioned tables. Even if recently I haven’t been writing as much as I did some years ago, I believe that you will see more posts from me in the near future. PCB is a great place to publish stuff.

One could object that PCB contains the name of a private company and is hosted on its website, so it is not genuinely a community project. Which is absolutely true. But if you want to see something better in the MySQL ecosystem, you will have to create it, because currently it doesn’t exist.

So, is this blog going to die? Absolutely not. This is my personal space. Any third-party website, no matter how good, can disappear or delete our content, and there is nothing we can do about it. A personal space is there as long as you want it to be there. I don’t know yet how I will decide what goes here and what goes on PCB – I’ll have to think more about it.

Furthermore, being in several places is a form of redundancy, if we decide that our presence on the web is important to us. That is why I always keep my profiles on LinkedIn and Facebook a bit active, and some days ago I even created a YouTube playlist with my webinar recordings – only three, two of which are in Italian, but still.

Well, enough babbling. Just a final word: if you have something interesting to say about open source databases, you should definitely propose it to PCB. Making it even more interesting is up to us!

Federico

Open source databases and bug trackers

All software has bugs. I rarely feel scared because of a bug – even famous ones that sometimes make the news. People should manage software wisely, and be able to work around the bugs they hit.

Instead, I’m more concerned about how vendors treat their bugs. In particular, I care about transparency, and how projects process their bugs.

Here’s what I mean.

Transparency:

  1. Ability to search for new, active, and closed bugs.
  2. Public tests, so you can check for yourself whether bugs are fixed, and whether regression bugs made their way back into the code.
  3. Public version maintenance policy – if I use an old version, I should know if bugfixes are still backported.
  4. Optional: See developers’ comments and their activity on a specific bug.
  5. Optional: See which version should contain a fix (if planned) and when that version will be released (if planned).

I want to highlight the word Optional. Points 1, 2 and 3, in my opinion, should be in place for any project that is not a personal hobby. If they are missing (quite often, I have to say…) the project can still be useful and cool, but it is not being managed properly from a QA point of view.

Bug processing:

I am not able to write a detailed list here. I think that this part depends on the project’s characteristics: what the program does, whether it is led by a company or not, how active the community is in development, and so on. Still, common sense should apply. Users are affected by bugs, therefore bugs should be fixed whenever it is reasonable to do so.

When it comes to open source RDBMSs…

Yes, I’m only talking about relational databases. And I only consider the most widely known ones. With this, I don’t mean to disrespect any project. There can be great reasons to use NoSQL or lesser-known SQL databases, but NoSQL is a huge set of different products, and I only want to write about what I know (or “know”).

MySQL

MySQL has a public bug tracker, which is good. But they also have a private one, and this has some bad consequences:

  • If you read about a bug that is still active, you cannot be sure that the information is up to date.
  • Some bugs are private, for security reasons. Those reasons are often, if not always, quite obscure. Note that MySQL typically runs on open source layers whose security bugs are public.
  • For the same reasons, not all tests are public. This reduces community QA capabilities.

But you can read more detailed information about this topic in Blog of (former?) MySQL Entomologist.

Another flaw is that not all MySQL components have the same level of quality. If you make heavy use of certain features (XA transactions, stored procedures…) you can reasonably expect to hit bugs. This is partly understandable, because Oracle needs to improve the features that are most used and most useful. What is much less understandable is how an area of the product that was disappointing and buggy many years ago can still be disappointing and buggy today.

About the bug tracker… I’d like to highlight something that is rarely mentioned: its search capabilities are powerful. Maybe not so user-friendly – you need to learn how to use the tool. But once you do, you will be able to find everything you need.

Percona Server

Percona Server is a fork of MySQL that tries to keep the differences as small as possible. Thus, generally MySQL bugs are also Percona Server bugs; and of course the features Percona implements can have their own specific bugs.

If they fix MySQL bugs, you will find this in their documentation.

Specific bugs can be found in Percona Server JIRA.

You can usually find internal details and tests. You know who works on a bug and when. Generally you don’t know in advance which version will contain a fix.

I think they do very well – and no, the fact that I worked for Percona does not affect my opinion. Keep in mind, however, that Percona is a customer-centric company. This means that generally they fix bugs that affect their customers. And, obviously, they merge bug fixes that happen in the mainstream MySQL code.

MariaDB

They also have a JIRA to report and search bugs. In their JIRA you can also find tests and a lot of technical information written by the developers. More than once, while reading about a bug, I found interesting details about InnoDB internals.

In MariaDB JIRA you can also find some information that you can’t find in Percona JIRA. You can see the active sprint’s activity, so you know which bugs they are working on and what their status is (todo, in progress, in review, done). And you can see the next releases, including the next planned release dates (keep in mind that they are not commitments). This is a very high level of transparency and potentially a valuable source of information, for example if you are planning upgrades or if you have pressing problems with a specific bug. And of course it’s still interesting if you are just curious.

I also want to highlight that they discuss contributions and new features in the same way. Also, developers often interact with the community on their mailing lists, which makes the whole project much more transparent.

PostgreSQL

PostgreSQL does not have a bug tracker. Bugs can be reported to the dedicated mailing list or via a web form which sends a message to the mailing list. The archives are the only available tool to see information about Postgres bugs. In other words, there is no way to see the list of open bugs. This is sad.

However, Percona recently started to offer support for PostgreSQL. Some time ago, their Careers page indicated that they were looking for someone able to work with Postgres code (yes, I keep an eye on some companies’ career pages, because they are a source of information about their plans). Therefore my hope is that they intend to do the same great QA work with Postgres that they have always done with MySQL, and that they will push for a publicly available, professional bug tracker.

CockroachDB

CockroachDB uses GitHub issues to track bugs. The tool is quite poor compared to JIRA or bugs.mysql.com, but it has the most important features that you can reasonably expect.

Usually there are not many technical details about bugs. And you don’t know what is happening around a bug, unless there is a new comment.

Why do I always criticise?

Actually I don’t. But I’m aware that, if a person expresses 3 positive thoughts and 1 criticism, in many cases only the latter will be remembered. This is natural.

Anyway, my spirit is always: please do better. I use my time to criticise what I like, not what I dislike.

My criticism could also be wrong, in which case there could at least be a good discussion in the comments.

Federico

Hidden caches catch your data

This article is different from my usual posts. It explains things that may be obvious to many database professionals – not all of them though.

The idea came indirectly from my friend Francesco Allertsen. He has a weekly mailing list he uses to share links to interesting articles he reads on the web. One of them was The hidden components of Web caching. Its purpose is to list all caches that play some role when we interact with a web site. An interesting idea, even if I find it incomplete. So I thought it was a good idea to talk about caches that we hit whenever we interact with a database.

Why should we care?

But first, a note on why we should care:

  • Caches increase the speed of IO by orders of magnitude. Latency numbers that everyone should know gives you an idea of that. The first comment suggests using a solar system image to visualise the scale; curiously, I had the same idea and used such an image in some slides I made in the past.
  • When reliability is important, caches can get in the way dangerously, because they are volatile: they won’t survive a crash or other types of hardware/software failure (aka bugs and limitations). So when we write data, for certain use cases (financial transactions, etc.) at no time should data be cached but not written to disk. Or this can only happen for a short time, before the database says “Ok Mr. Application, I got your data, you can resume your job”.

What guarantees this requirement? Well, caches can be write-through, write-around or write-back. Write-through and write-around caches are reliable in this respect, because data is always written to disk before control is returned to the writer. Write-back caches are not reliable, because data is made persistent asynchronously, after control has been returned. But of course they are much faster, because they allow grouping more IO operations together (higher throughput), because latency is very low, and because there is no stall if IO capacity is currently saturated.

It’s all about tradeoffs

Depending on the use case, we will have to choose the proper tradeoff between reliability and performance. For example, InnoDB lets you (see the sketch after this list):

  • Flush changes to disk at every commit: even if mysqld crashes, no data loss is possible unless you hit a software/hardware bug;
  • Flush changes to the filesystem cache at every commit: a filesystem crash is the only event that can cause data loss, and it is not likely;
  • Flush data to disk once a second, or at even longer intervals.
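
In MariaDB and MySQL, these three strategies correspond to the values of the innodb_flush_log_at_trx_commit variable. A minimal sketch:

SET GLOBAL innodb_flush_log_at_trx_commit = 1;
-- 1 = flush the redo log to disk at every commit (the default, and the safest)
-- 2 = write to the filesystem cache at every commit, flush to disk about once a second
-- 0 = write and flush about once a second (fastest, least safe)
SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';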

Also, when we make this choice, we should take data redundancy into account. For example, if we run a Galera Cluster, we have at least 3 copies of the data on different servers. Or we could store data on a RAID array, which also guarantees multiple copies of the data. Failover guarantees that our services don’t break if one copy gets damaged, and we can restore it from another copy. In such cases, even if we deal with critical data, we don’t necessarily need InnoDB to store data in the most reliable way – which is also the slowest.

Types of caches

Enough blah blah. Here is the list of caches that could get in the way when we try to persist our data, starting from the lowest levels:

Disk caches – Both spinning disks and SSDs can have caches. In your laptop, this cache is most probably write-back. This can usually be changed with something like hdparm.

Drive controller and RAID caches – These can be write-through or write-back, and usually they are configurable. Notice that they can also be battery-backed, which means that they will survive a crash (unless the device has no power for a long time). Battery-backed caches can safely be write-back, but for other caches a write-through strategy could be necessary. Battery-backed RAIDs need a periodic learning cycle. A learning cycle slows down all operations noticeably, but it is necessary to make sure that the battery is fully charged. This operation should be scheduled carefully.

Filesystem cache – You can use it in a write-through or write-back fashion. This topic is amazingly complex (just like the previous ones), so I’ll just give you a link to a wonderful article: Files are Hard.

Virtual machines – Virtual machines have a disk interface cache. Its write strategy depends on the cache mode. There are several cache modes, but here we’ll only mention the reliable ones: none, which means that the VM doesn’t cache data (but the host system can), and writethrough, whose meaning should now be clear. Virtual machines also have a filesystem cache, of course. Note that having reliable settings on the VM guarantees that data changes will survive if the VM, or anything running in it, crashes; but if the host doesn’t have reliable settings and it crashes, the most recent changes could be lost. Still, in production, a hypervisor typically runs many VMs. If many VMs bypass the cache, the hypervisor’s IO capacity can easily be saturated. It could be better to build a database cluster with VMs running on separate physical hosts, to reduce the risk of data loss in case of crashes – in other words, it’s often better to rely on redundancy and failover rather than on the reliability of individual systems.

Transaction log buffer – Terminology differs from DBMS to DBMS (WAL, transaction logs…), but the idea is that changes are persistent when they hit these logs. They will also need to be written to data files, but once they are in the logs they are safe. These logs have a buffer, which contains data not yet flushed. This idea can be confusing for some, so I’ll make it clear: this speeds things up, but doesn’t cause any danger. If your flush strategy is a write-through one, the buffer will contain not-yet-committed changes, which are flushed on commit – and only after the flush does the DBMS report success. Regardless of your flush strategy, some changes are flushed if the buffer gets full.

Binary log buffer – There is not necessarily a binary log separate from the transaction logs. MySQL has one because its architecture requires it: the binary log contains all changes to data and is handled by the server, while the transaction logs contain the information necessary to replay or roll back transactions and are handled by InnoDB (actually, even non-transactional storage engines can have logs, but I’m not going to discuss that here). The considerations about the transaction logs apply to the binary log as well, keeping in mind that its purpose is different (incremental backups and replication, not crash recovery). In Postgres you have WAL files, which are used both for incremental backups/replication and for crash recovery.
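
In MySQL and MariaDB, the binary log has its own flush setting, independent from InnoDB’s. A minimal sketch:

SET GLOBAL sync_binlog = 1;
-- 1 = flush the binary log to disk at every commit group (safest)
-- 0 = leave the flush timing to the operating system
-- N = flush every N commit groups
-- for full durability, combine it with the InnoDB setting shown earlier:
SET GLOBAL innodb_flush_log_at_trx_commit = 1;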

Buffer pool – Most databases (Postgres is a famous exception) have a buffer pool to cache frequently accessed data and indexes. It can even contain dirty pages: changes that are not yet written to data files. This makes things much faster. And again: changes are persistent when they are written to transaction logs. Even after a crash, data files can be repaired using transaction logs.

Session buffers, work mem – These buffers speed up parts of query execution, like joins and sorting. However they have nothing to do with writes.

Query cache – Older MySQL versions, MariaDB, and maybe other DBMS’s (not sure, sorry) have a query cache. This can speed up reads when the very same query is run often. “Very same” means that hashes of the queries are compared, so any difference is relevant, including whitespace. Every time a table is written to, all cached queries mentioning that table are invalidated. This, together with its well-known scalability problems, usually makes it a bad idea, at least in MariaDB/MySQL (there are exceptions – for example, if you have low concurrency, a reasonable number of very slow queries and not many writes).
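
As a quick check, in MariaDB (and in MySQL versions that still have the query cache) you can inspect its state, and bypass it for a single query; t is a hypothetical table:

SHOW GLOBAL VARIABLES LIKE 'query_cache%';
SHOW GLOBAL STATUS LIKE 'Qcache%';
-- this query will neither use the cache nor be stored into it
SELECT SQL_NO_CACHE COUNT(*) FROM t;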

Proxy caches – Proxies, like ProxySQL, can also have a query cache. It can have problems and it is not necessarily useful, but at least it is supposed to be built with scalability in mind (because proxies are about scalability).

Redis & friends – This should be obvious: retrieving data from a cache system (like Redis or Memcached) is much faster than retrieving it from MySQL. Usually this data has a TTL (time to live), which determines when it will expire, and it can also be invalidated manually. Keep in mind that this makes response times unpredictable: if the data is cached the response time is X, if it has expired the time is Y – where X and Y could be very different. It is even more unpredictable if this cache is not big enough to contain all your hot data. So you should be careful about what to cache, unless your dataset is small. Note that these caches can also use disks: for example, older Redis versions had Virtual Memory (currently deprecated). But we will not dig into this, as our focus is the persistent database. The point is: these caches can avoid database queries, but not always.

Application – No matter how fast a proxy’s query cache, Redis and Memcached are: retrieving data from local RAM is much faster. No network round trip, no other server’s response time involved. Of course you shouldn’t cache a big amount of data locally, or your memory will not be enough and your application could suffer. And cache invalidation can be a very complex problem. But still, for the hottest small data, local memory is the fastest option. To avoid making the response time unpredictable, it’s better to keep application-level caches updated, instead of running queries when an entry expires. Writes to the database are still necessary, and they can be synchronous or asynchronous, depending on how critical the data is.

Trust no one

A famous fairy said that some lies have short legs and others have a long nose. If hard disks, controllers and even filesystems had noses, some of them would have a long nose.

I will not dig into this complex topic myself, but the takeaway of this paragraph is: don’t trust them. They sometimes lie about consistency, so benchmarks are more impressive and marketing people are happy. Instead, try diskchecker.pl. It will tell you if something in your system is lying. It will not tell you if it is the hard disk, or the controller, or something in the OS. But it will tell you whether the data it writes is actually persisted immediately.

If your data are on the cloud, you cannot use this tool – because it involves shutting down the physical server suddenly while a file is being written. I am not aware of any tool or procedure to check if your cloud provider is lying about persistence. If you know one, please write a comment to this post. That would be much appreciated.

Databases don’t lie – at least, I am not aware of any DBMS or object store lying about persistence. But they have bugs, just like any piece of software, so you should check them periodically. Here is a PostgreSQL example.

Federico

MySQL vs. MariaDB: WAIT, NOWAIT, SKIP LOCKED

NOWAIT, WAIT and SKIP LOCKED are syntaxes added in MySQL 8.0 and MariaDB 10.3. The idea came from AliSQL (a MySQL fork by Alibaba). It was revisited in MySQL, and I am not aware whether MariaDB used the original implementation. EDIT: As Morgan Tocker points out in a comment, Alibaba originally filed a feature request to MySQL.

While MySQL and MariaDB syntaxes are similar, there are important differences and the compatibility is only apparent. This article discusses these differences.

WAIT

This syntax is only available in MariaDB. It means that, if a row or table that we want to read is write-locked, we can wait up to the specified number of seconds. If the lock is not released within the timeout, the query will fail.
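
A minimal sketch of the MariaDB syntax, with a hypothetical table t:

-- wait at most 5 seconds for the row lock, then fail
SELECT * FROM t WHERE id = 42 FOR UPDATE WAIT 5;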

NOWAIT

If a table or row we need to read is write-locked, the query will not be queued; instead, it will fail immediately.
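
A minimal sketch, again with a hypothetical table t (this form works in both MySQL 8.0 and MariaDB 10.3):

-- fail immediately if the row is already write-locked
SELECT * FROM t WHERE id = 42 FOR UPDATE NOWAIT;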

Incompatibilities:

  • MariaDB supports this syntax for some DDL statements (ALTER TABLE and its shortcuts), LOCK TABLES, and SELECT. MySQL only supports it for SELECT.
  • MySQL only supports this syntax in combination with FOR UPDATE or FOR SHARE. In order to introduce an incompatibility, they sacrificed the support of this feature for SELECTs in SERIALIZABLE mode, which have an implicit LOCK IN SHARE MODE clause. Fortunately this is an edge case, but it is another case where Oracle’s marketing strategies affect users in a bad way.
  • MySQL implements FOR UPDATE OF and FOR SHARE OF. This is interesting, and not only for the NOWAIT feature, because it allows us to JOIN multiple tables without locking them all (see the sketch after this list). Thanks, Oracle engineers.
  • MySQL and MariaDB report different error codes and messages.
    MySQL says: ERROR 3572 (HY000): Statement aborted because lock(s) could not be acquired immediately and NOWAIT is set
    MariaDB says: ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
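
A sketch of FOR UPDATE OF, reusing the author and book tables from the earlier JOIN example (note that when tables are aliased, MySQL expects the alias after OF):

SELECT a.full_name, b.title
    FROM author a
    INNER JOIN book b
        ON a.id = b.author_id
    FOR UPDATE OF a NOWAIT;
-- only the matching rows of author are locked; book rows stay unlocked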

SKIP LOCKED

This is only in MySQL. It excludes locked rows from the resultset, making our queries fast. The documentation warns us that the resultset will be inconsistent. This is implicit in the feature, but it is worth emphasising. However, consistency is not always so important, and skipping rows seems to me a great way to solve some performance problems.
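
The classic use case is a job queue, where each worker should grab a row that no other worker is currently processing. A sketch, with a hypothetical job_queue table:

SELECT id, payload
    FROM job_queue
    WHERE processed = 0
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED;
-- locked rows are silently skipped instead of being waited for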

EDIT: Morgan’s comment points out that SKIP LOCKED is also non-deterministic. Again, I believe this is clear if you understand what this feature does, but still, maybe I should point it out. I think it could be compared to READ UNCOMMITTED: they are different optimizations, but in both cases the results you get depend on what other connections are doing. So the results are inconsistent and non-deterministic by nature.

My conclusions

MariaDB’s implementation of NOWAIT is clearly more complete. They have the WAIT syntax to set a timeout; they implemented this syntax in more SQL statements; and the syntax is compatible with the implicit LOCK IN SHARE MODE.

Despite this, IMHO, MySQL wins. They have SKIP LOCKED, which is very interesting. Also, the above-mentioned FOR UPDATE OF syntax is a nice feature.

Once again, Oracle spent some resources to add incompatibilities. This does not affect the quality of MySQL, but it’s still a damage to the community, which should be able to use both MySQL and MariaDB in the same environments, with the same tools and the same libraries. But the distance between these DBMS’s is growing constantly.

Federico