Random thoughts after Percona Live

Percona Live Europe 2017 has ended. As always it’s been a great opportunity to hear great technical talks, ask questions, get in touch with people, drink beer, collect nice & useless gadgets. And, on a personal note, to meet former colleagues and former customers, some of which I never met before. And even joke with a customer about my sleepy voice when I had to start all my Monday mornings with a Skype call with him, at 8, at home. I revealed him I tried everything, including singing, but my voice didn’t improve. Oh, well, if he messaged me after I left Percona, I probably did a decent job.

Here are some completely random thoughts in random order.

  • MariaDB wasn’t there. There was a single talk, officially from the Foundation (I can’t really distinguish the two entities, and I’m not alone). They have their own conference, M18 will be the second edition. So most people have to choose if to attend Percona or Maria. The only consequence I care about is that they’re splitting the community and making every part weaker. As I wrote for the BSL time ago, I can only hope they will change their mind.
  • Tarantool wasn’t there. So bad, I believe in it.
  • Technologies that are growing fast or strongly promoted: ClickHouse, ProxySQL, Vitess, MyRocks storage engine. Wow.
    • But I’ve attended a very nice talk on a TokuDB real life story. Don’t underestimate it.
  • At Percona Live talks, you often hear questions about performance and quality. With MariaDB talk, I was the only one to ask these things (and actually my questions were almost a DoS attack, sorry Vicentiu). I’m not saying that MariaDB has not quality, I’m saying that users have probably a different approach.
  • Sharding is becoming a hot topic.
  • Open source philosophy is consolidating and expanding. It goes far beyond licenses and marketing claims, it is the attitude and pleasure to collaborate and do things that, without a community, wouldn’t be as strong.
  • As remarked by PZ, the first criteria for choosing a technology is usability. There are very good reasons for this, given the current complexity of big or even medium infrastructures. But it’s still a bit dangerous.
  • If you don’t have a reason to be there next year, I’ll give you one: Irish red beer.

Enjoy.
Federico

Advertisements

FOREACH in MySQL/MariaDB stored procedures

One of the annoying limitations of MySQL/MariaDB stored procedures is the lack of a FOREACH construct, which loops on each row returned by a query.

In practice, this forces users to write a lot of code just to tell MySQL how to fetch rows and exit properly. Nesting 2 loops of this kind simply results in unmaintenable code (don’t trust me, just try).

Now, I’m writing a library of views and procedures that I’ll share as open source in the next days, and I decided to finally write my foreach. Well, sort of. It is impossible to use the current stored procedures language to write a flexible foreach, because to loop rows you need a cursor. And cursors are based on a hard-coded query. In this old post I proposed a verbose, ugly, but working solution, but it has a limitation: the number of columns returned by the query must still be fixed. So, I used that technique (based on views) in my procedure, but I had to write different procedures: foreach_1(), foreach_2(), foreach_3(). If you need to read more rows you can modify the code easily, but I think that for an open source library 3 columns is reasonable.

Here I decided to share an experimental prototype. If you have a better idea on how to achieve a similar result, I’ll be glad to trash this hack and use your idea instead.

Also note that for this procedure I used MariaDB 10.2 improved PREPARE statement. If you want to run it on MySQL or older MariaDB versions, you’ll have to make some easy changes and test the procedure.

The code

CREATE PROCEDURE foreach_2(IN in_sql TEXT, IN in_callback_type VARCHAR(9), IN in_callback_body TEXT)
    NOT DETERMINISTIC
    READS SQL DATA
BEGIN
    DECLARE p_sql TEXT DEFAULT NULL;
    DECLARE p_eof BOOL DEFAULT FALSE;
    DECLARE v_p1, v_p2 TEXT DEFAULT NULL;
 
    DECLARE crs_foreach CURSOR FOR
    SELECT p1, p2 FROM vw_foreach;

    DECLARE CONTINUE HANDLER
        FOR NOT FOUND
    BEGIN
        SET p_eof := TRUE;
    END;

    SET in_callback_type := UPPER(in_callback_type);
     IF NOT (in_callback_type IN ('SQL', 'PROCEDURE')) THEN
         SIGNAL SQLSTATE VALUE '45000'
        SET MESSAGE_TEXT = '[foreach_2] Invalid in_callback_type';
    END IF;

    DO GET_LOCK('crs_foreach', 1);

    SET p_sql := CONCAT('CREATE OR REPLACE VIEW vw_foreach AS ', in_sql, ';');
    PREPARE stmt_foreach_2 FROM p_sql;
    EXECUTE stmt_foreach_2;

    OPEN crs_foreach;

    DO RELEASE_LOCK('crs_foreach');
 
    lp_while: WHILE TRUE DO
        FETCH crs_foreach INTO v_p1, v_p2;
 
        IF p_eof THEN
            LEAVE lp_while;
        END IF;

        IF in_callback_type IN ('SQL') THEN
            SET @p1 := v_p1;
            SET @p2 := v_p2;
            PREPARE stmt_foreach_2 FROM in_callback_body;
        ELSE
            PREPARE stmt_foreach_2 FROM CONCAT('CALL `', in_callback_body, '`(', QUOTE(v_p1), ', ', QUOTE(v_p2), ');');
        END IF;
        EXECUTE stmt_foreach_2;
    END WHILE;

    CLOSE crs_foreach;
    DEALLOCATE PREPARE stmt_foreach_2;
END

Now, the usage. There are 2 ways to use this procedure.

Using a callback procedure

First, create a callback procedure which will handle the rows returned by your query:

CREATE PROCEDURE p_echo(IN in_p1 TEXT, IN in_p2 TEXT)
BEGIN
    SELECT CONCAT_WS('.', in_p1, in_p2) AS r;
END

As you can see, the procedure must accept 2 values (foreach_2). The names don’t really matter, but p1 and p2 are clear, at least in this example.

Now, invoke foreach_2():

CALL foreach_2(
    'SELECT TABLE_SCHEMA AS p1, TABLE_NAME AS p2 FROM information_schema.TABLES LIMIT 3;',
    'PROCEDURE',
    'p_echo'
);

We passed the query producing the results we want to loop. Then we told foeach_2() that we want to use a callback procedure, and specified its name.

Using a callback SQL statement

Writing a procedure to perform a simple tasks would be annoying. That’s why I decided to also support callback statements:

CALL foreach_2(
    'SELECT TABLE_SCHEMA AS p1, TABLE_NAME AS p2 FROM information_schema.TABLES LIMIT 3;',
    'SQL',
    'SELECT @p1 AS p1, @p2 AS p2;'
);

As you can see, the callback query can access the values using user variables: @p1 and @p2.

These variables exist at session level, so it is possible that we are overwriting something. But I think there are no reasonable ways to avoid these collisions.

Enjoy!

Prepared Statements in MariaDB 10.2

Prepared statements are particularly useful in stored procedures. In fact, they are the only way to execute dynamic SQL, which means that, for example, the list of ORDER BY columns is in a variable. You can do this by composing an SQL string, as you do in other languages.

However, in MySQL and up to MariaDB 10.1, there are some annoying limitations:

  • Prepared statements exist at a connection level. Even if you declare them in a stored procedure, they are global. If you are writing a stored procedure which should be usable in many contexts (maybe it’s part of an open source library), there is a risk that you overwrite existing prepared statements. There is no way to get a list of existing prepared statements.
  • A prepared statement is prepared from a literal string or a user variable. A literal string cannot contain variables. A user variable exists at a session level, so again, there is a risk to overwrite something. Especially if you always call the variable @sql, because a lot of people tends to do so.
  • Prepared statements are verbose. You need to run 3 different statements (PREPARE, EXECUTE, DEALLOCATE PREPARE) just to execute a dynamic SQL statement.

First, let me say that the first problem is not solved. I hope it will be solved in a future version, so it will be easier to distribute stored procedures libraries. After all, MariaDB 10.3 will have several improvements to stored procedures.

The second problem is solved in 10.2. You can pass any SQL expression (or, at least, all expressions I tested) to PREPARE. Including a local variable, which means that there will be no conflicts with variables declared by the user. For example:

DECLARE v_query TEXT DEFAULT 'SELECT COUNT(DISTINCT user) FROM mysql.user';
PREPARE stmt FROM v_query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

But this example is verbose. It demonstrates what I stated above: you need at least 3 statements just to execute a dynamic query. But MariaDB 10.2 solves this problem introducing EXECUTE IMMEDIATE, which is basically a shortcut:

DECLARE v_query TEXT DEFAULT 'SELECT COUNT(DISTINCT user) FROM mysql.user';
EXECUTE IMMEDIATE v_query;

As you can see, we don’t need PREPARE or DEALLOCATE PREPARE if we use EXECUTE IMMEDIATE. This is nice to have in stored procedures, but is not useful outside of procedures, because the prepared statement is always reprepared, even if we executed it before.

Enjoy!

 

Thoughts on MaxScale new license

MaxScale has been open source until now, just like all MariaDB projects. But the 2.0 version is released under a new license called BSL, which basically makes the covered work non-free until the Change Date (in this case 2019-01-01), when the license will be converted to GPL.

Looks like open source friendly, after all. The license will be GPL, just be patient. And the code is available. Right?

No. Cmpletely wrong. For plenty of reasons.

Some reasons

It is a lock-in. No matter how many times Monty repeats that there is no lock-in, we have a brain. If you don’t allow anyone to fix bugs except for yourself, it is a lock-in. If you force your users to buy your support, they won’t buy your competitors support.

MariaDB business moves to a non-free product. Yes, 1.4 is free an this won’t change. And yes, when 3.0 will be out, 2.0 will be free. But why should they maintain a free version, if money comes from non-free versions? Monty says that open source religion doesn’t put bread on the table. I suppose that maintaining free branches also doesn’t put bread on the table.

I wasn’t able to find any official EOL date for any MaxScale version – if there is one, please comment below.

MariaDB moves innovation to the non-free world. New features are non-free. When they will be old, they will be free. Monty also stated that this is the correct way to make money for a lot of projects. And he seems to advice this model to start-ups that use his venture capital, OpenOcean. Suddenly, BSL seems to be the only way for projects to survive. Is he protecting others projects interests, or using them for his own marketing?

MariaDB accused Oracle several times. When Oracle implemented a couple features and only distributed them in a non-GPL edition (threadpool, PAM authentication), MariaDB told that they had the same features as open source. Which was great. Except that… now MySQL Router is open source, MaxScale 2.0 is not. Now Monty has several justifications for this. But I fail to understand why open core is evil and BSL is good.

I mentioned Monty too many times. Is this an attack against Monty? Definitely not, but all articles I could find express Monty’s opinion, not MariaDB Corporation or anyone else’s opinion. I cannot answer the silence.

What is the MariaDB Foundation?

MariaDB Corporation has the legal right to make MaxScale non-free. They own it. They sometimes call it MariaDB MaxScale. They can: they also own MariaDB trademark.

So, what’s the role of MariaDB Foundation?

They claim they safeguard MariaDB. They don’t mention the ecosystem, the community, or other tools. They don’t mention, of course, MaxScale. Which is quite strange: they claimed that their model is Apache Foundation, which supports an entire ecosystem in many ways, and owns the trademarks.

Also, the board of directors has 6 members. 3 are from MariaDB Foundation. In this situation, they cannot have an independent opinion on MariaDB Corporation actions.

A curious aspect is that they declare they follow Ubuntu Code of Conduct. Please read its last paragraph and drawn your own conclusions.

My position on MariaDB and MaxScale

I am still grateful to MariaDB Corporation for creating and maintaining MariaDB (and to some of their engineers for creating MySQL).

From a technical point of view, they have many interesting features that are not in MySQL. Some of them come from the community, for example the CONNECT engine and their implementation of encryption. And the reason is that MariaDB is very open to the community.

Which brings us to a less technical point of view: MariaDB openness. Their JIRA account allows us to see the bugs (including their current status…). You can also see who is working on what, when next versions will be released, and what they will have. The team is active on the mailing lists and IRC. The documentation is a wiki and the license is free.

I have been a MariaDB supporter for years. I wrote Mastering MariaDB and I am one of their Community Ambassadors chosen by Colin Charles (who recently left MariaDB). Will my position about MariaDB project change? I don’t know, it’s too early to answer. For sure, I won’t deny that its openness is amazing and should be a model for everyone. (And I hope this won’t change)

And my position about MaxScale has changed? Of course it did. I wouldn’t use it for personal projects. Of course I could provide support but, given the license change, it seems to me unlikely. There are free alternatives: ProxySQL, MySQL Router, HAProxy. PoxySQL is by far the most interesting, if you ask me.

My position has changed forever? The answer depends on another question: will MariaDB fix its big mistake? I have no logic reasons to be optimistic, but I still hope it will. In the past they have apparently been open to criticism. After a complain in this blog, they made MaxScale binaries freely available, and I wrote a thank you post. What I couldn’t know is that they were preparing to close MaxScale next versions.

 

MySQL/MariaDB cursors and temp tables

In MariaDB and MySQL, cursors create a temporary table.

Does this statement deserve a whole blog post? Apparently not. However, in some cases one does not expect a temporary table to be created:

  • SELECT ... FOR UPDATE: An exclusive lock is created, yes, but you still read data from a temporary table.
  • SELECT FROM a temporary table: you are reading from a temporary tables, yes, but an internal temporary table is created anyway.
  • Impossible WHERE and LIMIT 0.

A quick example:

CREATE TEMPORARY TABLE t
(
        id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY
);

CREATE PROCEDURE p()
BEGIN
        DECLARE c CURSOR FOR
                SELECT id FROM t WHERE 0 LIMIT 0 FOR UPDATE;
        OPEN c;
        CLOSE c;
END;

MySQL [test]> SHOW STATUS LIKE 'Created_tmp_tables';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Created_tmp_tables | 31    |
+--------------------+-------+
1 row in set (0.00 sec)

MySQL [test]> CALL p();
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> SHOW STATUS LIKE 'Created_tmp_tables';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Created_tmp_tables | 32    |
+--------------------+-------+
1 row in set (0.00 sec)

I am not complaining, and I don’t even know if this behavior can be changed. But one should certainly be aware of this behavior. For example, one could think that creating a temporary table one time and then loop on that table with cursors several times is an optimization – but that’s not the case.

Federico

MaxScale binaries are now available

UPDATE: Short story: ignore the next update and read the post. Long story: the original post was a mistake, as explained in the next update. But then, MariaDB released free MaxScale binaries and everything I wrote in the post is now correct.

UPDATE 2016-04-14: It seems that I was mistaken. MaxScale download page is a bit different from MariaDB Enterprise page, and does not explicitly require us to accept terms of use before download. But we accept those terms while creating an account.

So, MaxScale binaries cannot be used in production without paying for MariaDB enterprise. Thanks to the persons who commented this post and pointed my mistake. My apologies to my readers.

I won’t delete this post because I don’t want the comments to disappear, as they express opinions of some community members.

My jestarday’s post Comments on MaxScale binaries followed up a post from Percona’s blog. It had much more visits than any other post I wrote before. It was linked by Peter Zaitsev and Oli Senhauser on social networks. No, this is not a self-advertisement, I’m just saying that the problem I’ve talked about is considered important by the community.

Today, MaxScale binaries are available! Not because of me (obviously), but because MariaDB must have found out that the community badly wants those binaries.

MaxScale 1.4.1 was released today, and it is available from the Database Downloads page on MariaDB.com. You can click on MaxScale and then you can select the version (1.4.1, 1.3.0, 1.2.1) and the system (Debian, Ubuntu, RHEL/CentOS, SLES, both available as deb/rpm or tarball). Registration is required for download, but this is acceptable, as long as the binaries are freely available.

There are no restrictive terms of use. Here is how the copyright note starts:

This source code is distributed as part of MariaDB Corporation MaxScale. It is free
software: you can redistribute it and/or modify it under the terms of the
GNU General Public License as published by the Free Software Foundation,
version 2.

The only problem is the lack for a repository – but now that the binaries are freely available, I expect most Linux distros to provide their packages.

I downloaded the Ubuntu 14.04 version on my Mint machine, and everything worked as expected:

fede-mint-0 ~ # dpkg -i /home/federico/Downloads/maxscale-1.4.1-1.ubuntu_trusty.x86_64.deb 
Selecting previously unselected package maxscale.
(Reading database ... 184280 files and directories currently installed.)
Preparing to unpack .../maxscale-1.4.1-1.ubuntu_trusty.x86_64.deb ...
Unpacking maxscale (1.4.1) ...
Setting up maxscale (1.4.1) ...
Processing triggers for man-db (2.6.7.1-1ubuntu1) ...
Processing triggers for libc-bin (2.19-0ubuntu6.7) ...
fede-mint-0 ~ # maxadmin -uadmin -pmariadb
MaxScale> show servers
Server 0x1b7a310 (server1)
	Server:                              127.0.0.1
	Status:                              Auth Error, Down
	Protocol:                    MySQLBackend
	Port:                                3306
	Node Id:                     -1
	Master Id:                   -1
	Slave Ids:                   
	Repl Depth:                  -1
	Number of connections:               0
	Current no. of conns:                0
	Current no. of operations:   0

So, thanks MariaDB! I love software projects that listen to their community needs. This should be a lesson for another company – we all know who I am talking about.

Federico

Comments on MaxScale binaries

I’m writing this post after reading Downloading MariaDB MaxScale binaries, from Percona’s MySQL Performance Blog.

I was already aware about the problem: MaxScale is open source, but the binaries are not free. You can download them and use them for testing, but if you want to use them in production, you’ll need to buy MariaDB Enterprise.

Note that MaxScale Docker images seem to have the same problem. I’ve tried some of them, but all those I’ve tried were running MariaDB Enterprise binaries, so using them in production is illegal (unless you pay).

The alternative is… compiling MaxScale. I had problems in doing so and couldn’t solve those problems myself. From the MaxScale Google Group, I see that Justin Swanhart had the same problems… so I don’t feel particularly stupid for that.

After some questions on that group, 2 posts were published on MariaDB blog:

When I tried them, the first one worked for me, while the former didn’t.

But in any case, even if you are able to compile MaxScale on your Linux/BSD of choice, updates will be a problem. You will need to compile all next releases of MaxScale, which is simply not a viable solution for many companies.

This prevents MaxScale from being as widely used as it could be. The lack of free binaries is a problem. I understand their commercial choice – it is legit, but I don’t agree. First, because open source shouldn’t work in this way. Second, because the lack of free binaries could bring some customers to them, but… most probably, it simply prevents lots of people from trying MaxScale at all.

This would be negative for any software project, in my opinion. But I think that it is particularly negative for MaxScale. Why? Because it is amazingly versatile and writing a new module is amazingly simple. This combination of characteristics would make it greatly benefit from being widely adopted – people could write new modules and distribute them.

Please, MariaDB, reconsider this choice.

Federico