Hidden caches catch your data

This article is different from my usual posts. It explains things that may be obvious to many database professionals – not all of them though.

The idea came indirectly from my friend Francesco Allertsen. He has a weekly mailing list he uses to share links to interesting articles he reads on the web. One of them was The hidden components of Web caching. Its purpose is to list all caches that play some role when we interact with a web site. An interesting idea, even if I find it incomplete. So I thought it was a good idea to talk about caches that we hit whenever we interact with a database.

Why should we care?

But first, a note on why we should care:

  • Caches increase the speed of IO by orders of magnitude. Latency numbers that everyone should know gives you an idea of that. The first comment suggests using a solar system image to visualise the scale; curiously, I had the same idea, and I used such an image for some slides I made in the past.
  • When reliability is important, caches can get dangerously in the way. Caches are volatile, so they won’t survive a crash or other types of hardware/software failure (aka bugs and limitations). So when we write data, for certain use cases (financial transactions, etc.) at no point should data be cached without also being written to disk – or it should only happen temporarily, before the database says “Ok Mr. Application, I got your data, you can resume your job”.

What guarantees this requirement? Well, caches can be write-through, write-around or write-back. Write-through and write-around caches are reliable in this respect, because data is always written to disk before control is returned to the writer. Write-back caches are not reliable, because data is made persistent asynchronously, after control has been returned. But of course they are much faster, because they allow grouping more IO operations together (higher throughput), because latency is very low, and because there is no stall if the IO capacity is currently saturated.

It’s all about tradeoffs

Depending on the use case, we will have to choose the proper tradeoff between reliability and performance. For example, InnoDB allows us to (see the sketch after this list):

  • Flush changes to disk at every commit: even if mysqld crashes, no data loss is possible, as long as you don’t hit any software/hardware bug;
  • Flush changes to the filesystem cache at every commit: a filesystem crash is the only event that can cause data loss, and it is not likely;
  • Flush data to disk once a second, or even at longer intervals.
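
In InnoDB these three behaviours correspond to the values of innodb_flush_log_at_trx_commit. A minimal sketch of the relevant settings:

-- 1 (the default): write and flush the redo log to disk at every commit
-- 2: write to the filesystem cache at every commit, flush to disk about once a second
-- 0: write and flush about once a second, not at every commit
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
-- sync_binlog controls a similar tradeoff for the binary log:
-- 1 flushes it at every commit, 0 leaves the timing to the operating system
SET GLOBAL sync_binlog = 1;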

Also, when we make the choice, we should take data redundancy into account. For example, if we run a Galera Cluster, we have at least 3 copies of the data on different servers. Or we could store data on a RAID array, which also guarantees that we have multiple copies of the data. Failover guarantees that our services don’t break if one copy gets damaged, and we can restore it from another copy. In such cases, even if we deal with critical data, we don’t necessarily need InnoDB to store data in the most reliable way – which is also the slowest.

Types of caches

Enough blah blah. Here is the list of caches that could get in the way when we try to persist our data, starting from the lowest levels:

Disk caches – Both spinning disks and SSDs can have caches. In your laptop, this cache is most probably write-back. This can usually be changed with a tool like hdparm.

Drive controller and RAID caches – These can be write-through or write-back, and usually they are configurable. Notice that they can also be battery-backed, which means that they will survive a crash (unless the device has no power for a long time). Battery-backed caches can safely be write-back, but for other caches a write-through strategy could be necessary. Battery-backed RAIDs need a periodic learning cycle. A learning cycle slows down all operations considerably, but it is necessary to make sure that the battery is fully charged. This operation should be scheduled carefully.

Filesystem cache – You can use it in a write-through or write-back fashion. This topic is amazingly complex (just like the previous ones), so I’ll just give you a link to a wonderful article: Files are Hard.

Virtual machines – Virtual machines have a disk interface cache. Its write strategy depends on the cache mode. There are several cache modes, but here we’ll only mention the reliable ones: none, which means that the VM doesn’t cache data (but the host system can), and writethrough, whose meaning should now be clear. Virtual machines also have a filesystem cache, of course. Note that reliable settings on the VM guarantee that data changes will survive if the VM, or anything running in it, crashes; but if the host doesn’t have reliable settings and it crashes, the most recent changes could be lost. Still, in production a hypervisor typically runs many VMs. If many VMs bypass the cache, the hypervisor’s IO capacity can easily be saturated. It could be better to build a database cluster with VMs running on separate physical hosts, to reduce the risk of data loss in case of crashes – in other words, often it’s better to rely on redundancy and failover rather than on the reliability of individual systems.

Transaction log buffer – Terminology differs from DBMS to DBMS (WAL, transaction logs…), but the idea is that changes are persistent when they hit these logs. They will also need to be written to data files, but if they are in the logs they are safe. These logs have a buffer, which contains data not yet flushed. This idea can be confusing for some, so I’ll make it clear: the buffer speeds things up, but doesn’t cause any danger. If your flush strategy is a write-through one, the buffer will contain not-yet-committed changes, which are flushed on commit – and only after the flush does the DBMS report success. Regardless of your flush strategy, some changes are flushed if the buffer gets full.
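
To make this concrete with InnoDB: the redo log buffer has a configurable size, and a status counter tells us how often it was too small. A quick check, as a sketch:

-- size of the redo log buffer (flushed on commit or when it gets full)
SHOW GLOBAL VARIABLES LIKE 'innodb_log_buffer_size';
-- number of times a transaction had to wait because the buffer was too small
SHOW GLOBAL STATUS LIKE 'Innodb_log_waits';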

Binary log buffer – There is not necessarily a binary log separate from the transaction logs. MySQL has one because its architecture requires it: the binary log contains all changes to data and is handled by the server, while the transaction logs contain the information necessary to replay or roll back transactions and are handled by InnoDB (actually even non-transactional storage engines can have logs, but I’m not going to discuss this here). The considerations about the transaction logs apply to the binary log as well, keeping in mind that its purpose is different (incremental backups and replication, not crash recovery). In Postgres you have WAL files, which are used both for incremental backups/replication and for crash recovery.

Buffer pool – Most databases (Postgres is a famous exception) have a buffer pool to cache frequently accessed data and indexes. It can even contain dirty pages: changes that are not yet written to data files. This makes things much faster. And again: changes are persistent when they are written to transaction logs. Even after a crash, data files can be repaired using transaction logs.
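
For InnoDB, a quick way to see the pool size and how many dirty pages it currently holds – a sketch:

-- size of the buffer pool, in bytes
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';
-- pages modified in memory but not yet written to the data files
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';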

Session buffers, work mem – These buffers speed up parts of query execution, like joins and sorting. However they have nothing to do with writes.

Query cache – Older MySQL versions, MariaDB, and maybe other DBMS’s (not sure, sorry) have a query cache. This can speed up reads when the very same query is run often. “Very same” means that hashes of the queries are compared, so any difference is relevant, including whitespace. Every time a table is written to, all cached queries mentioning that table are invalidated. This, together with its well-known scalability problems, usually makes it a bad idea, at least in MariaDB/MySQL (there are exceptions – for example, if you have low concurrency, a reasonable number of very slow queries and not many writes).
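
To see the hash comparison at work (t is a hypothetical table here), note that these two statements occupy two distinct cache entries, and the Qcache status counters show hits, inserts and invalidations:

-- identical except for whitespace, yet cached separately:
SELECT id FROM t WHERE x = 1;
SELECT id  FROM t WHERE x = 1;
-- hits, inserts, low-memory prunes and so on:
SHOW GLOBAL STATUS LIKE 'Qcache%';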

Proxy caches – Proxies, like ProxySQL, can also have a query cache. It can have problems and it is not necessarily useful, but at least it is supposed to be built with scalability in mind (because proxies are about scalability).

Redis & friends – This should be obvious: retrieving data from a cache system (like Redis or Memcached) is much faster than retrieving it from MySQL. Usually those data have a TTL (time to live), which determines when they will expire, and they can also be invalidated manually. Keep in mind that this makes response times unpredictable: if data is cached response time is X, if it is expired time is Y – where X and Y could be very different. It is even more unpredictable if this cache is not enough to contain all your hot data. So you should be careful about what to cache, unless your dataset is small. Note that these caches could also use the disks: for example, older Redis versions had Virtual Memory (currently deprecated). But we will not dig into this, as our focus is the persistent database. The point is: these caches can avoid database queries, but not always.

Application – No matter how fast a proxy’s query cache, Redis and Memcached are: retrieving data from local RAM is much faster. No network round trip, no other server’s response time involved. Of course you shouldn’t cache a big amount of data locally, or there won’t be enough memory and your application could suffer. And cache invalidation can be a very complex problem. But still, for the hottest small data, local memory is the fastest option. To avoid making response times unpredictable, it’s better to keep application-level caches updated, instead of running queries when an entry expires. Writes to the database are still necessary, and they can be synchronous or asynchronous, depending on how critical the data are.

Trust no one

A famous fairy said that some lies have short legs and others have a long nose. If hard disks, controllers and even filesystems had noses, some of them would have a long nose.

I will not dig into this complex topic myself, but the takeaway of this paragraph is: don’t trust them. They sometimes lie about consistency, so that benchmarks are more impressive and marketing people are happy. Instead, try diskchecker.pl. It will tell you if something in your system is lying. It will not tell you whether it is the hard disk, or the controller, or something in the OS. But it will tell you whether the data it writes is actually persisted immediately.

If your data are on the cloud, you cannot use this tool – because it involves shutting down the physical server suddenly while a file is being written. I am not aware of any tool or procedure to check if your cloud provider is lying about persistence. If you know one, please write a comment to this post. That would be much appreciated.

Databases don’t lie – at least, I am not aware of any DBMS or object store lying about persistence. But they have bugs, just like any piece of software, so you should check them periodically. Here is a PostgreSQL example.

Federico


MySQL vs. MariaDB: WAIT, NOWAIT, SKIP LOCKED

NOWAIT, WAIT and SKIP LOCKED are syntaxes added in MySQL 8.0 and MariaDB 10.3. The idea came from AliSQL (a MySQL fork by Alibaba). It was revisited in MySQL, and I don’t know whether MariaDB used the original implementation. EDIT: As Morgan Tocker points out in a comment, Alibaba originally filed a feature request to MySQL.

While MySQL and MariaDB syntaxes are similar, there are important differences and the compatibility is only apparent. This article discusses these differences.

WAIT

This syntax is only available in MariaDB. It means that, if a row or table that we want to read is write-locked, we can wait up to the specified number of seconds. If the lock is not released before the timeout occurs, the query fails.
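
A sketch, with a hypothetical table t:

-- wait up to 5 seconds for the lock; fail with an error if it is not released in time
SELECT * FROM t WHERE id = 1 FOR UPDATE WAIT 5;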

NOWAIT

If a table or row we need to read is write-locked, the query will not be queued; instead, it will fail immediately.

Incompatibilities (a sketch of both dialects follows the list):

  • MariaDB supports this syntax for some DDL statements (ALTER TABLE and its shortcuts), LOCK TABLES, and SELECT. MySQL only supports it for SELECT.
  • MySQL only supports this syntax in combination with FOR UPDATE or FOR SHARE. In order to introduce an incompatibility, they sacrificed support for this feature in SELECTs in SERIALIZABLE mode, which have an implicit LOCK IN SHARE MODE clause. Fortunately this is an edge case, but it is another case where Oracle’s marketing strategies affect users in a bad way.
  • MySQL implements FOR UPDATE OF and FOR SHARE OF. This is interesting, and not only for the NOWAIT feature, because it allows us to JOIN multiple tables without locking them all. Thanks, Oracle engineers.
  • MySQL and MariaDB report different error codes and messages.
    MySQL says: ERROR 3572 (HY000): Statement aborted because lock(s) could not be acquired immediately and NOWAIT is set
    MariaDB says: ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
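
Here is the sketch, again on a hypothetical table t:

-- MySQL: NOWAIT must follow a locking clause
SELECT * FROM t WHERE id = 1 FOR UPDATE NOWAIT;
SELECT * FROM t WHERE id = 1 FOR SHARE NOWAIT;
-- MariaDB also accepts it in some DDL and in LOCK TABLES:
ALTER TABLE t NOWAIT ADD COLUMN c INT;
LOCK TABLES t WRITE NOWAIT;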

SKIP LOCKED

This is only in MySQL. It excludes locked rows from the resultset, making our queries fast. The documentation warns us that the resultset will be inconsistent. This is implicit in the feature, but it is worth emphasizing. However, consistency is not always so important, and skipping rows seems to me a great way to solve some performance problems.
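
A typical use is a job queue, where many workers grab rows concurrently (jobs is a hypothetical table here) – a sketch:

START TRANSACTION;
-- take one unclaimed job, skipping rows locked by other workers
SELECT id FROM jobs WHERE processed = 0 ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED;
-- ... process the job and mark it as done ...
COMMIT;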

EDIT: Morgan’s comment points out that SKIP LOCKED is also non-deterministic. Again, I believe this is clear if you understand what this feature does, but still, maybe I should point it out. I think it can be compared to READ UNCOMMITTED: they are different optimizations, but in both cases the results you get depend on what other connections are doing. So results are inconsistent and non-deterministic by nature.

My conclusions

MariaDB’s implementation of NOWAIT is clearly more complete. They have the WAIT syntax to set a timeout; they implemented this syntax in more SQL statements; and the syntax is compatible with implicit LOCK IN SHARE MODE.

Despite this, IMHO, MySQL wins. They have SKIP LOCKED, which is very interesting. Also the above mentioned FOR UPDATE OF syntax is a nice feature.

Once again, Oracle spent some resources to add incompatibilities. This does not affect the quality of MySQL, but it still damages the community, which should be able to use both MySQL and MariaDB in the same environments, with the same tools and the same libraries. But the distance between these DBMS’s is growing constantly.

Federico

My 2018 Databases Wishlist

Well, the most important wishes I have for 2018 are a bit out of topic for this blog: forms of organisation without a formal authority, schools not teaching religions, and so on. But in this post, I will write about databases… as usual.

So, here is my wishlist, for what it’s worth.

More research on Learned Indexes

If you don’t know what I’m talking about, see this paper. Having a data structure faster than B-Trees is exciting. But of course I’d also like to see considerations on write performance.

Progress on using ML for database tuning

See this article. I don’t think that Machine Learning will ever be able to replace (good) DBAs, but having a tool which suggests tuning based on the real workload sounds great. It can be a real help for DBAs.

More research on stored functions transformation

Stored functions are useful but slow. But see this paper. It seems it is possible to transform imperative programs into queries, improving their complexity by some orders of magnitude.

On a side note, MariaDB implemented a lot of syntax from Oracle for stored procedures. While this sounds like a deliberate commercial strategy, the technical improvement in this area is great. Still, what I’d like to see is better performance, as well as support for external languages.

Galera 4

Let me be clear, I didn’t read any announcement that Galera 4 will be released this year. But they announced exciting news over time, and still the new version isn’t here. At some point, it should be released (hopefully).

Transactional DDL in the MySQL ecosystem

MySQL 8.0 has support for atomic DDL statements. They did it in a good way: it’s engine-independent and, while the data dictionary is stored in InnoDB tables, any engine is free to add support for this feature. They stated that this is the basis for transactional DDL, but we are not there yet. MariaDB has a task for transactional DDL.
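
The canonical example is DROP TABLE with a list of tables (names here are hypothetical):

-- in MySQL 8.0 this either drops both tables or neither;
-- in 5.7, t1 would be dropped even though t_missing makes the statement fail
DROP TABLE t1, t_missing;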

EDIT: Thanks to Valerii Kravchuk for pointing me MDEV-11424 – Instant ALTER TABLE of failure-free record format changes. It is clearly worth adding it to my wishlist: please Maria, get it done!

Engines, engines, engines

RocksDB is great, please consolidate it. TokuDB can improve in many ways, please don’t stop investing in it. The next version of SPIDER will be in MariaDB 10.3; I hope that its development will be a bit more active in the future.

Don’t kill MyISAM. It is still useful in some cases. For Catawiki’s use cases, I find it better than InnoDB for temporary tables. Also, JFG has a great use case example.

More progress on Tarantool and CockroachDB

Tarantool is a great database, originally NoSQL. It is extremely scriptable (actually, it can be seen as a Lua application server) and its modules allow reading and writing data from a wide variety of data sources, including MySQL replication. Recently, SQL support has been added.

CockroachDB is an open source RDBMS designed to scale geographically. It uses distributed transactions. It also allows tuning the redundancy of data at the table level and defining replication zones.

Great conferences

I will be both at M18 (I’m not sponsored by my company, but I chose to go anyway) and at Percona Live. At M18 I will give a talk titled Somewhere between schema and schemaless. Of course I also submitted proposals for Percona Live; let’s see if they get accepted.

Random thoughts after Percona Live

Percona Live Europe 2017 has ended. As always, it’s been a great opportunity to hear great technical talks, ask questions, get in touch with people, drink beer, and collect nice & useless gadgets. And, on a personal note, to meet former colleagues and former customers, some of whom I had never met in person before. I even joked with a customer about my sleepy voice when I had to start all my Monday mornings with a Skype call with him, at 8, at home. I revealed to him that I had tried everything, including singing, but my voice didn’t improve. Oh well, if he messaged me after I left Percona, I probably did a decent job.

Here are some completely random thoughts in random order.

  • MariaDB wasn’t there. There was a single talk, officially from the Foundation (I can’t really distinguish the two entities, and I’m not alone). They have their own conference; M18 will be the second edition. So most people have to choose whether to attend Percona’s conference or Maria’s. The only consequence I care about is that they’re splitting the community and making each part weaker. As I wrote about the BSL some time ago, I can only hope they will change their mind.
  • Tarantool wasn’t there. Too bad, I believe in it.
  • Technologies that are growing fast or strongly promoted: ClickHouse, ProxySQL, Vitess, MyRocks storage engine. Wow.
    • But I’ve attended a very nice talk on a TokuDB real life story. Don’t underestimate it.
  • At Percona Live talks, you often hear questions about performance and quality. At the MariaDB talk, I was the only one to ask such things (and actually my questions were almost a DoS attack, sorry Vicentiu). I’m not saying that MariaDB lacks quality; I’m saying that its users probably have a different approach.
  • Sharding is becoming a hot topic.
  • Open source philosophy is consolidating and expanding. It goes far beyond licenses and marketing claims, it is the attitude and pleasure to collaborate and do things that, without a community, wouldn’t be as strong.
  • As remarked by PZ, the first criterion for choosing a technology is usability. There are very good reasons for this, given the current complexity of big or even medium infrastructures. But it’s still a bit dangerous.
  • If you don’t have a reason to be there next year, I’ll give you one: Irish red beer.

Enjoy.
Federico

FOREACH in MySQL/MariaDB stored procedures

One of the annoying limitations of MySQL/MariaDB stored procedures is the lack of a FOREACH construct, which loops on each row returned by a query.

In practice, this forces users to write a lot of code just to tell MySQL how to fetch rows and exit properly. Nesting 2 loops of this kind simply results in unmaintainable code (don’t trust me, just try).

Now, I’m writing a library of views and procedures that I’ll share as open source in the coming days, and I decided to finally write my foreach. Well, sort of. It is impossible to use the current stored procedures language to write a flexible foreach, because to loop over rows you need a cursor, and cursors are based on a hard-coded query. In this old post I proposed a verbose, ugly, but working solution, but it has a limitation: the number of columns returned by the query must still be fixed. So, I used that technique (based on views) in my procedure, but I had to write different procedures: foreach_1(), foreach_2(), foreach_3(). If you need to read more columns you can modify the code easily, but I think that for an open source library 3 columns is reasonable.

Here I decided to share an experimental prototype. If you have a better idea on how to achieve a similar result, I’ll be glad to trash this hack and use your idea instead.

Also note that for this procedure I used MariaDB 10.2’s improved PREPARE statement. If you want to run it on MySQL or older MariaDB versions, you’ll have to make some easy changes and test the procedure.

The code

CREATE PROCEDURE foreach_2(IN in_sql TEXT, IN in_callback_type VARCHAR(9), IN in_callback_body TEXT)
    NOT DETERMINISTIC
    READS SQL DATA
BEGIN
    DECLARE p_sql TEXT DEFAULT NULL;
    DECLARE p_eof BOOL DEFAULT FALSE;
    DECLARE v_p1, v_p2 TEXT DEFAULT NULL;

    -- cursor on the view that will wrap the user's query
    DECLARE crs_foreach CURSOR FOR
        SELECT p1, p2 FROM vw_foreach;

    -- when the cursor is exhausted, set the end-of-fetch flag
    DECLARE CONTINUE HANDLER
        FOR NOT FOUND
    BEGIN
        SET p_eof := TRUE;
    END;

    -- validate the callback type
    SET in_callback_type := UPPER(in_callback_type);
    IF NOT (in_callback_type IN ('SQL', 'PROCEDURE')) THEN
        SIGNAL SQLSTATE VALUE '45000'
            SET MESSAGE_TEXT = '[foreach_2] Invalid in_callback_type';
    END IF;

    -- serialise access to the shared view
    DO GET_LOCK('crs_foreach', 1);

    -- wrap the user's query in a view, so the cursor can read it
    SET p_sql := CONCAT('CREATE OR REPLACE VIEW vw_foreach AS ', in_sql, ';');
    PREPARE stmt_foreach_2 FROM p_sql;
    EXECUTE stmt_foreach_2;

    OPEN crs_foreach;

    DO RELEASE_LOCK('crs_foreach');

    -- loop over the rows and invoke the callback for each of them
    lp_while: WHILE TRUE DO
        FETCH crs_foreach INTO v_p1, v_p2;

        IF p_eof THEN
            LEAVE lp_while;
        END IF;

        IF in_callback_type IN ('SQL') THEN
            -- expose the values to the callback statement as user variables
            SET @p1 := v_p1;
            SET @p2 := v_p2;
            PREPARE stmt_foreach_2 FROM in_callback_body;
        ELSE
            -- call the callback procedure with the values as arguments
            PREPARE stmt_foreach_2 FROM CONCAT('CALL `', in_callback_body, '`(', QUOTE(v_p1), ', ', QUOTE(v_p2), ');');
        END IF;
        EXECUTE stmt_foreach_2;
    END WHILE;

    CLOSE crs_foreach;
    DEALLOCATE PREPARE stmt_foreach_2;
END

Now, the usage. There are 2 ways to use this procedure.

Using a callback procedure

First, create a callback procedure which will handle the rows returned by your query:

CREATE PROCEDURE p_echo(IN in_p1 TEXT, IN in_p2 TEXT)
BEGIN
    SELECT CONCAT_WS('.', in_p1, in_p2) AS r;
END

As you can see, the procedure must accept 2 parameters, because we are using foreach_2(). The names don’t really matter, but p1 and p2 are clear, at least in this example.

Now, invoke foreach_2():

CALL foreach_2(
    'SELECT TABLE_SCHEMA AS p1, TABLE_NAME AS p2 FROM information_schema.TABLES LIMIT 3;',
    'PROCEDURE',
    'p_echo'
);

We passed the query producing the results we want to loop over. Then we told foreach_2() that we want to use a callback procedure, and specified its name.

Using a callback SQL statement

Writing a procedure to perform a simple task would be annoying. That’s why I decided to also support callback statements:

CALL foreach_2(
    'SELECT TABLE_SCHEMA AS p1, TABLE_NAME AS p2 FROM information_schema.TABLES LIMIT 3;',
    'SQL',
    'SELECT @p1 AS p1, @p2 AS p2;'
);

As you can see, the callback query can access the values using user variables: @p1 and @p2.

These variables exist at the session level, so it is possible that we are overwriting something. But I think there is no reasonable way to avoid such collisions.

Enjoy!

Prepared Statements in MariaDB 10.2

Prepared statements are particularly useful in stored procedures. In fact, they are the only way to execute dynamic SQL, which means that, for example, the list of ORDER BY columns is in a variable. You can do this by composing an SQL string, as you do in other languages.
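
A minimal sketch of the idea, with hypothetical table and column names:

SET @order_cols := 'last_name, first_name';
SET @sql := CONCAT('SELECT * FROM customers ORDER BY ', @order_cols);
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;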

However, in MySQL and up to MariaDB 10.1, there are some annoying limitations:

  • Prepared statements exist at a connection level. Even if you declare them in a stored procedure, they are global. If you are writing a stored procedure which should be usable in many contexts (maybe it’s part of an open source library), there is a risk that you overwrite existing prepared statements. There is no way to get a list of existing prepared statements.
  • A prepared statement is prepared from a literal string or a user variable. A literal string cannot contain variables. A user variable exists at the session level, so again, there is a risk of overwriting something. Especially if you always call the variable @sql, because a lot of people tend to do so.
  • Prepared statements are verbose. You need to run 3 different statements (PREPARE, EXECUTE, DEALLOCATE PREPARE) just to execute a dynamic SQL statement.

First, let me say that the first problem is not solved. I hope it will be solved in a future version, so it will be easier to distribute stored procedures libraries. After all, MariaDB 10.3 will have several improvements to stored procedures.

The second problem is solved in 10.2. You can pass any SQL expression (or, at least, all expressions I tested) to PREPARE. Including a local variable, which means that there will be no conflicts with variables declared by the user. For example:

DECLARE v_query TEXT DEFAULT 'SELECT COUNT(DISTINCT user) FROM mysql.user';
PREPARE stmt FROM v_query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

But this example is verbose. It demonstrates what I stated above: you need at least 3 statements just to execute a dynamic query. But MariaDB 10.2 solves this problem by introducing EXECUTE IMMEDIATE, which is basically a shortcut:

DECLARE v_query TEXT DEFAULT 'SELECT COUNT(DISTINCT user) FROM mysql.user';
EXECUTE IMMEDIATE v_query;

As you can see, we don’t need PREPARE or DEALLOCATE PREPARE if we use EXECUTE IMMEDIATE. This is nice to have in stored procedures, but is not useful outside of procedures, because the prepared statement is always reprepared, even if we executed it before.
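
EXECUTE IMMEDIATE also accepts placeholders, with the values supplied by a USING clause – a sketch:

EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM mysql.user WHERE user = ?' USING 'root';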

Enjoy!

 

Thoughts on MaxScale’s new license

MaxScale has been open source until now, just like all MariaDB projects. But the 2.0 version is released under a new license called BSL, which basically makes the covered work non-free until the Change Date (in this case 2019-01-01), when the license will be converted to GPL.

Looks open source friendly, after all. The license will be GPL, just be patient. And the code is available. Right?

No. Completely wrong. For plenty of reasons.

Some reasons

It is a lock-in. No matter how many times Monty repeats that there is no lock-in, we have brains. If you don’t allow anyone to fix bugs except yourself, it is a lock-in. If you force your users to buy your support, they won’t buy your competitors’ support.

MariaDB’s business moves to a non-free product. Yes, 1.4 is free and this won’t change. And yes, when 3.0 is out, 2.0 will be free. But why should they maintain a free version, if money comes from the non-free versions? Monty says that open source religion doesn’t put bread on the table. I suppose that maintaining free branches also doesn’t put bread on the table.

I wasn’t able to find any official EOL date for any MaxScale version – if there is one, please comment below.

MariaDB moves innovation to the non-free world. New features are non-free. When they become old, they will be free. Monty also stated that this is the correct way to make money for a lot of projects. And he seems to advise this model to start-ups that use his venture capital, OpenOcean. Suddenly, BSL seems to be the only way for projects to survive. Is he protecting other projects’ interests, or using them for his own marketing?

MariaDB accused Oracle several times. When Oracle implemented a couple of features and only distributed them in a non-GPL edition (thread pool, PAM authentication), MariaDB advertised that they had the same features as open source. Which was great. Except that… now MySQL Router is open source, and MaxScale 2.0 is not. Monty has several justifications for this. But I fail to understand why open core is evil and BSL is good.

I mentioned Monty too many times. Is this an attack against Monty? Definitely not. But all the articles I could find express Monty’s opinion, not MariaDB Corporation’s or anyone else’s. I cannot answer the silence.

What is the MariaDB Foundation?

MariaDB Corporation has the legal right to make MaxScale non-free. They own it. They sometimes call it MariaDB MaxScale. They can: they also own the MariaDB trademark.

So, what’s the role of MariaDB Foundation?

They claim they safeguard MariaDB. They don’t mention the ecosystem, the community, or other tools. They don’t mention, of course, MaxScale. Which is quite strange: they claimed that their model is the Apache Foundation, which supports an entire ecosystem in many ways, and owns the trademarks.

Also, the board of directors has 6 members, 3 of whom are from MariaDB Corporation. In this situation, the Foundation cannot have an independent opinion on MariaDB Corporation’s actions.

A curious aspect is that they declare they follow the Ubuntu Code of Conduct. Please read its last paragraph and draw your own conclusions.

My position on MariaDB and MaxScale

I am still grateful to MariaDB Corporation for creating and maintaining MariaDB (and to some of their engineers for creating MySQL).

From a technical point of view, they have many interesting features that are not in MySQL. Some of them come from the community, for example the CONNECT engine and their implementation of encryption. And the reason is that MariaDB is very open to the community.

Which brings us to a less technical point of view: MariaDB openness. Their JIRA account allows us to see the bugs (including their current status…). You can also see who is working on what, when next versions will be released, and what they will have. The team is active on the mailing lists and IRC. The documentation is a wiki and the license is free.

I have been a MariaDB supporter for years. I wrote Mastering MariaDB and I am one of their Community Ambassadors, chosen by Colin Charles (who recently left MariaDB). Will my position about the MariaDB project change? I don’t know, it’s too early to answer. For sure, I won’t deny that its openness is amazing and should be a model for everyone. (And I hope this won’t change.)

Has my position about MaxScale changed? Of course it has. I wouldn’t use it for personal projects. Of course I could provide support for it but, given the license change, that seems unlikely to me. There are free alternatives: ProxySQL, MySQL Router, HAProxy. ProxySQL is by far the most interesting, if you ask me.

Has my position changed forever? The answer depends on another question: will MariaDB fix its big mistake? I have no logical reason to be optimistic, but I still hope it will. In the past they have apparently been open to criticism. After a complaint on this blog, they made MaxScale binaries freely available, and I wrote a thank-you post. What I couldn’t know is that they were preparing to close MaxScale’s next versions.