MariaDB and Stored Procedures: errata & repository

In the article How MariaDB makes Stored Procedures usable I explained how to use the MariaDB CONNECT Storage Engine to workaround Stored Procedures limitations. Please read that article, if you didn’t, for details.

Since the technique to workaround such limitations is verbose and error-proof, I also proposed a procedure which makes it easy and convenient. Even thought this is a simple country blog, I have immediately received positive feedback after that article. But then, people started to write me: “Hey, doesn’t work!”. The reason is… MariaDB/MySQL bugs. While investigating these problems, I was able to report the following bugs:

I’m sure there are others, but isolating such bugs to produce decent test cases can be difficult. I will do, if I can.

Basically, while developing the procedures I needed, was that some combinations of (correct) statements could misteryously trigger bugs that are not triggered by other combinations of statements. Despite those procedures being trivial, testing their behavior was frustrating.

I was able to develop a version of these procedures which does not trigger any bug, and I put them in the materialize repository, on GitHub.

I can finally suggest people to try them. If you have any problems (becuase there are bugs in the procedures, on in MariaDB, or you don’t understand something) feel free to contact me at info [At, federico - razzoli /Dot, com (replace [At, with @ and /Dot, with .). I will help. But be very patient or pay me: remember that 3 days are not a long time for a free service. :)

Enjoy!

How MariaDB makes Stored Procedures usable

I already wrote how MariaDB makes the debug of stored procedures much easier via the SQL Error Log. But I recently found out that MariaDB provides a usable workaround for some big limitations of their procedural SQL.

First, SELECT is not the only SQL statement which returns a resultset. Other examples are DELETE RETURNING, CALL, SHOW, EXPLAIN and administrative commands like ANALYZE TABLE or CHECK TABLE. But these commands cannot be used in place of SELECT in the following contexts:

  • Subqueries, derived tables, JOINs, UNIONs
  • CREATE TABLE ... SELECT
  • INSERT/REPLACE INTO ... SELECT
  • CREATE VIEW ... AS SELECT
  • DECLARE CURSOR ... SELECT
  • SET @my_var := SELECT

This means that there is no way to read the output of these statements within a stored procedure or trigger.

Also, some statements cannot be executed in a stored procedures and/or in prepared statements. An example is CHECK TABLES: even if you don’t care about the resultset (which would be a serious limitation), you cannot execute it in a stored procedure or in a prepared statements.

MariaDB and MySQL support the DO statement, which is identical to SELECT except that it doesn’t return any result. However, stored procedures could return one or more resultsets too, and you cannot invoke them with DO. This means that in some contexts you may have to deal to annoying resultsets, or perhaps you cannot call those procedures (within triggers or functions).

I faced all these problems while developing STK/Unit and in other projects.

But MariaDB lets us workaround these limitations – and the funny thing is that probably its developers are not aware about this! Here is an example:

CREATE OR REPLACE TABLE show_master_status
	ENGINE = CONNECT
	TABLE_TYPE = MYSQL
	SRCDEF = 'SHOW MASTER STATUS'
	CONNECTION = '_'
;

DELIMITER ||
CREATE PROCEDURE test()
BEGIN
	DECLARE eof BOOL DEFAULT FALSE;
	DECLARE v_file, v_position TEXT;
	
	DECLARE cur CURSOR FOR
		SELECT `File`, `Position` FROM show_master_status;
	
	DECLARE CONTINUE HANDLER
		FOR NOT FOUND
	BEGIN
		SET eof := TRUE;
	END;
	
	OPEN cur;
	`lbl_loop`:
	LOOP
		FETCH cur INTO v_file, v_position;
		IF eof IS TRUE THEN
			LEAVE lbl_loop;
		END IF;
		
		SELECT CONCAT('I''ll do something with ', v_file, v_position);
	END LOOP;
	CLOSE cur;
END ||
DELIMITER ;

MariaDB [_]> CALL test(); 
+--------------------------------------------------------+ 
| CONCAT('I''ll do something with ', v_file, v_position) | 
+--------------------------------------------------------+ 
| I'll do something with binlog.00000242965              | 
+--------------------------------------------------------+ 
1 row in set (0.01 sec) 

Query OK, 0 rows affected (0.01 sec)

As you can see, with this trick I was able to use a cursor with SHOW MASTER STATUS.

How does the trick work? If you know CONNECT, probably you already guessed. CONNECT is a storage engine which allows the users to read data from several kind of data sources as if they were regular SQL tables. Supported data sources include several file formats, remote DBMS’s and more. In this case I defined a table which connects to a MariaDB/MySQL server (TABLE_TYPE=MYSQL). The server is… localhost. Since I specified a SRCDEF table option, the data source is not a table in the server, it is the resultset of the specified query (SHOW MASTER STATUS). This means that, when the table is queried, the CONNECT engine opens a connection to the local server, it executes SHOW MASTER STATUS, and it returns the results of that statement. But since this work is done by the CONNECT engine, from the MariaDB point of view we are just querying a table. So, we can use that query in all contexts where SELECT works.

Also note that we don’t have to specify the table structure. CONNECT supports table discovery, which means that it automatically knows which columns and types are needed.

Confused? I hope not. Look again at the example – everything will be clear.

Of course, having to CREATE a TABLE each time is not very comfortable, even with table discovery. So I wrote a stored procedure which does the boring work for us:

UPDATE:
I removed the code because it was buggy triggered MariaDB bugs. See this post and this GitHub repository for working code.

Enjoy!

GNU Affero GPL license: (trying to) debunking myths

There are a lot of misconceptions about the GNU AGPL license. I’m writing this brief article to try debunking them and explain what this license really is.

Let’s start from the GNU GPL. It makes a licensed piece of code (the covered work) forever free. If you download/receive a copy of a GPL’d software, you can use it in any way, study the code, modify the code, redistribute the software as is or a modified version. There are some limitations: if you redistribute the software, you can’t do things which limit the freedom of your users. For example, you can’t distribute the software as closed source.

This is considered as a strong limitations by non-free software producers and even by some free software producers. I completely disagree. Whoever can produce non-free software; but why should free software developers help them in doing so? Do non-free software developers help us in any way? No, they don’t.

However, GPL has a deficiency. The hey is the concept of distribution. If Jenny gives to Butch an USB stick containing a software, she’s distributing it. The same applies if she sends the software via email. In these cases, the GPL grants the basic rights of the end user. When the GPL was written, this was enough.

Nowadays, a software is often used via a network. No distribution takes place: Butch connects to Jessie’s server, and uses her copy of a software. He doesn’t receive the software itself, so the GPL cannot grant him any right. But… he is the end user! Most software users today don’t even know if the software they use is running on their hardware or somewhere else. In many cases, you just download a client. And no one even tells you it is just a client.

The AGPL addresses this problem. It is basically a GPL license with an additional clause: the section 13. This section is brief and very easy to understand (like most parts of the license) and the key part is: if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (…) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge. This includes non-system libraries used by the software.

So, the AGPL simply does what the GPL cannot do anymore because we don’t run programs on our computers.

Of course you can use it on non-network software. It is generally a good idea: someday it might become network software, or a piece of your code could be used in a network software.

I can’t imagine possible reasons why one can honestly think that GPL is ethically right and AGPL is not. Of course there may be commercial reasons to pretend you think it. It depends on your company’s business model. If your company modifies GNU GPL software on your server, and you don’t want users to know this, then yes, you probably must pretend that AGPL violates your basic human rights.

License compatibility

There is however a caveat. Many people think that all free licenses can be used together. False: mixing incompatible licenses is just as illegal as copying Microsoft Windows for a friend. GPL is a great license which comes with a price: not all free licenses are compatible with it. AGPL is an even more free license, which comes with a greater price: even GPLv2 is not compatible with it.

Enjoy!

Debugging MariaDB stored procedures with the SQL Error Log

A very old version of the MySQL site contained a nice header: Speed, Power, Ease of Use. I don’t think that stored programs (routines, triggers, events) were supported, at that time. Now they are, and… developing them is amazingly hard.

There are many problems, for example the language is not flexible and the execution is sloooow. But the biggest problem is that there is no debug API.

In details, one of the biggest problems is that, if you have complex procedures (accessing multiple tables, having error handling, and/or calling other procedures) in practice you have no idea of what warnings occur within your procedures.

MariaDB 10.0 makes things much easier by adding the SQL_ERROR_LOG plugin. Well, I don’t think that they had stored programs in mind when they developed it. But what’s important for us is that… it makes stored programs debug decent.

This article explains how. Its use in application debugging is not covered here, because it seems to me quite obvious and I want to focus on stored procedures.

Enabling SQL_ERROR_LOG

We will use the SQL_ERROR_LOG plugin. Its purpose is to log SQL errors into a text file. To enable it:

INSTALL SONAME 'sql_errlog';

Note that specifying the file extension is not necessary; this makes the statement platform-independent. The MySQL syntax is still supported.

When the plugin is installed, the logging is always active. To stop the logging, uninstall the plugin:

UNINSTALL PLUGIN SQL_ERROR_LOG;

Unfortunately, you might see a warning like this:

Warning (Code 1620): Plugin is busy and will be uninstalled on shutdown

I don’t know any way to avoid this. But I don’t think that SQL_ERROR_LOG is meant to be ran on a production server, I consider it as a development tool. However, if this is a problem for you, just ask the list or file a bug.

Errors are written in a file called sql_errors.log, in the data directory:

MariaDB [test]> SELECT CONCAT(@@global.datadir, @@global.sql_error_log_filename) AS sql_errlog;
+--------------------------------------+
| sql_errlog                           |
+--------------------------------------+
| /usr/local/mysql/data/sql_errors.log |
+--------------------------------------+
1 row in set (0.00 sec)


To make debug and life easier, I rotate the files very often:

SET @@global.sql_error_log_rotate = 1;

The format is the following:

2014-06-30  1:22:30 root[root] @ localhost [] ERROR 1051: Unknown table 'test.unknown_table' : DROP TABLE `unknown_table`

Limitations

  • The session id is not written to the log. The hostname and the user account are written, but they don't help us distinguishing one session from another. I suppose it is meant for being executed by application developers on their local machines.
  • In theory, only errors are logged - no warnings, no notes. But keep on reading.
  • The file format is not safely parseable. It's designed for humans.

Debug Messages

Look at the following procedure:

DROP PROCEDURE IF EXISTS do_something;
DELIMITER ||
CREATE PROCEDURE do_something(IN p TEXT)
    READS SQL DATA
BEGIN
    DECLARE CONTINUE HANDLER
        FOR SQLSTATE '45000'
    BEGIN END;
    
    IF @debug = TRUE THEN
        SET GLOBAL sql_error_log_rotate = 1;
    END IF;
    
    IF @debug = TRUE THEN
        SET @error = CONCAT('p = ', IF(p IS NULL, 'NULL', QUOTE(p)));
        SIGNAL SQLSTATE '45000' SET
            MYSQL_ERRNO = 9999,
            MESSAGE_TEXT = @error;
    END IF;
    
    -- why do something? let's do nothing!
    DO NULL;
END ||
DELIMITER ;
SET @debug = TRUE;
CALL do_something('x');

Got the trick? If the @debug variable is set to TRUE, an error is produced. The error is immediately suppressed by an empty CONTINUE HANDLER, so the procedure's execution continues and nothing happens. Well, almost nothing: the error is logged!

So:

  1. Write your CONTINUE HANDLERs.
  2. Set @debug to 1.
  3. CALL your procedure.
  4. cat the sql_errors.log.

Warning Logging

Look at this example:

DROP PROCEDURE IF EXISTS do_something;
DELIMITER ||
CREATE PROCEDURE do_something()
    READS SQL DATA
BEGIN
    DECLARE CONTINUE HANDLER
        FOR SQLWARNING
    BEGIN
        DECLARE CONTINUE HANDLER
            FOR SQLSTATE '45000'
        BEGIN END;
        
        GET DIAGNOSTICS CONDITION 1
              @errno = MYSQL_ERRNO
            , @error = MESSAGE_TEXT
            ;
        IF @debug = TRUE THEN
            SIGNAL SQLSTATE '45000' SET
                MYSQL_ERRNO = @errno,
                MESSAGE_TEXT = @error;
        END IF;
        RESIGNAL;
    END;
    
    IF @debug = TRUE THEN
        SET GLOBAL sql_error_log_rotate = 1;
    END IF;
    
    -- here goes the 'real' procedure.
    -- let's produce a warning to trigger the main HANDLER...
    DO 1/0;
END ||
DELIMITER ;

I know: this is tricky. Writing this has been hard. But don't worry: reusing this template will be quite easy.

The main HANDLER catches the warnings (in this case division by 0, if you have a good SQL_MODE). Then we get the warning properties (GET DIAGNOSTICS) and we use them to issue an error (SIGNAL). The error is suppressed by the internal empty HANDLER, but it is logged. Finally, the error is RESIGNAL'd, because we don't want the debug code to suppress warnings.

A final note. Do you want to log warnings only if they appear within a particular portion of your procedure? You can. Just enclose those lines in a BEGIN ... END construct, and define your handlers inside the construct.

Enjoy!

MariaDB/MySQL: Making ENGINE clause mandatory

I got this idea from a Valerii Kravchuk’s MySQL bug report:

http://bugs.mysql.com/bug.php?id=71978

In theory, I completely agree that MySQL and forks should not allow us to set a default storage engine which cannot be used to create a table. You can see the same with MariaDB’s SEQUENCE. The MySQL & forks philosophy seems to be: ignore your mistakes, so you can repeat them forever. Which can turn a mistype into a major data loss.

Unless you only use InnoDB and your magic powers tell you that this will never change, the ENGINE clause should be mandatory in your MySQL installation. Since there is no clean way to make it mandatory, setting a “weird” storage engine as default seems to be a decent workaround. I don’t like it, but it can prevent human mistakes.

MariaDB [test]> SET SESSION default_storage_engine = 'performance_schema';
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> CREATE TABLE x (id INT) ;
ERROR 1005 (HY000): Can't create table `test`.`x` (errno: 131 "Command not supported by database")
Error (Code 1005): Can't create table `test`.`x` (errno: 131 "Command not supported by database")
Error (Code 1683): Invalid performance_schema usage.

Enjoy!

MariaDB storage engines

This is a list of MariaDB storage engines that are not distributed with MySQL. I think that most of them will work with MySQL, but not all – at least CassandraSE doesn’t.

Engine Description Introduced
XtraDB A fully-compatible fork of InnoDB, mantained by Percona Big Bang
Aria A crash-safe MyISAM, also used for internal temptables Big Bang
TokuDB A transactional engine with innovative buffers and high compression, by TokuTek 10.0
mroonga Supports fulltext searches with Chinese, Japanese and Korean languages 10.0
SPIDER Shards tables through multiple servers 10.0
OQGRAPH Used to implement tree structures 10.0
Sequence Returns a sequence of numbers as a resultset 10.0
CONNECT Support several external data sources (data files, DBMS’s…) 10.0
CassandraSE A bridge to Apache Cassandra 10.0
SphinxSE A bridge to Sphinx 5.2
FederatedX A richer fork of Federated Big Bang

Notes:

  • XtraDB is the default and should be preferred to InnoDB, unless it has a bug which affects you (but XtraDB fixes several InnoDB bugs).
  • OQGRAPH was introduced during the Big Bang or soon after, but disabled in 5.5 and reintroduced in 10.0. See this post.
  • FederatedX is called Federated, so it’s possible that you are already using it and you don’t know.
  • CONNECT will hopefully obsolete CSV and FederatedX, but some features are still missing. For example, CONNECT cannot store NULL values in flat files.
  • With MariaDB you need to specify SQL_MODE='NO_ENGINE_SUBSTITUTION'. If you mistype the engine name, or try to use a disabled engine, you want to get an error, rather than silently switch to XtraDB.

Obsolete storage engines:

PBXT (Big Bang – 5.3) – A transactional engine which aimed to be “somewhere between InnoDB and MyISAM”. Seemed to support foreign keys better than InnoDB (I couldn’t reproduce a tirgger+FK bug and it had more options), but alas it’s dead.

Enjoy!

Descending indexes in MariaDB

Since the dawn of time, MySQL indexes have a limit: they cannot be descending. Yes, ALTER TABLE and CREATE INDEX can contain the ASC and DESC keywords. It is perfectly legal in MySQL. But this does not mean that descending indexes are created. This is a well-known MySQL feature: when it cannot do something, it just pretends to. Well… someone thinks it is a feature. I think it is a bug (a bug is an unexpected behaviour), but what can we do.

The lack of support for descending indexes is only an issue when we need to create an index in which at least one column is ascending and at least one column is descending. For example, MySQL pretends to understand the following statement, but the resulting index won’t probably help us:

CREATE INDEX my_index ON my_table (my_column ASC, your_column DESC);

An ORDER BY my_column ASC, your_column DESC won’t be able to use this index.

MariaDB has a feature which provides a good workaround for the lack of descending indexes: the PERSISTENT columns. If you don’t know them, please check the MariaDB KnowledgeBase first.

You don’t know them but you didn’t follow the link, right? But you can’t escape my telepathic powers!

How can PERSISTENT columns help us creating mixed order index? If you didn’t already guess, I’ll tell you. Suppose we have a column A and we want it to be part of an ASC+DESC index. We’ll need to create a PERSISTENT column rev_A, whose values are calculated based on the values in A. The important thing here is that the order of rev_A is reversed. Here is an example:

CREATE TABLE `exam_scores`
(
	  `id` INT SIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
	, `student_name` VARCHAR(200) NOT NULL
	, `exam_code` CHAR(1)
	, `exam_score` SMALLINT UNSIGNED NOT NULL
	, `exam_date` DATETIME NOT NULL
	
	-- auxiliary columns
	, `rev_exam_code` TINYINT UNSIGNED AS (255 - ORD(exam_code)) PERSISTENT
	, `rev_exam_score` SMALLINT SIGNED AS (0 - exam_score) PERSISTENT
	, `rev_exam_date` SMALLINT UNSIGNED AS (DATEDIFF('2050-01-01', exam_date)) PERSISTENT
	
	-- mixed indexes
	, INDEX `desc_exam_code` (`rev_exam_code`, `student_name`, `exam_code`)
	, INDEX `desc_exam_score` (`rev_exam_score`, `student_name`)
	, INDEX `desc_exam_date` (`rev_exam_date`, `student_name`, `exam_date`)
)
	ENGINE = InnoDB
	DEFAULT CHARACTER SET = 'utf8'
;
-- this is needed because some rev* columns return negative values
SET @@global.sql_mode := CONCAT(@@global.sql_mode, ',NO_UNSIGNED_SUBTRACTION');
SET @@session.sql_mode := @@global.sql_mode;
INSERT INTO `exam_scores`
		(`student_name`, `exam_code`, `exam_score`, `exam_date`)
	VALUES
		  ('Anthony Stark', 'F', 100, '1994-01-01')
		, ('Klark Kent', 'G', 60, '2000-01-01')
		, ('Corto Maltese', 'A', 50, '2007-01-01')
		, ('Tin Tin', 'Z', 900, '2005-01-01')
		, ('Philip Mortimer', 'C', 20, '2000-01-01')
		, ('Dago', 'M', 80, '2004-01-01')
	;

Yes, this table sucks. It is not relational, etc, etc. But we are talking about mixed order indexing in MariaDB, not relational theory.

Is it clear now? I hope so. Let’s see an example query. We want to list the students: recent exams first, and if two dates are identical, sort them alphabetically.

MariaDB [test]> EXPLAIN SELECT `student_name`, `exam_date`
    -> FROM `exam_scores`
    -> ORDER BY `rev_exam_date` ASC, `student_name` ASC;
+------+-------------+-------------+-------+---------------+----------------+---------+------+------+-------------+
| id   | select_type | table       | type  | possible_keys | key            | key_len | ref  | rows | Extra       |
+------+-------------+-------------+-------+---------------+----------------+---------+------+------+-------------+
|    1 | SIMPLE      | exam_scores | index | NULL          | desc_exam_date | 613     | NULL |    7 | Using index |
+------+-------------+-------------+-------+---------------+----------------+---------+------+------+-------------+
1 row in set (0.00 sec)

The desc_exam_score index shows that including the original column in the index can be unnecessary:

MariaDB [test]> EXPLAIN SELECT `student_name`, -`rev_exam_score` FROM `exam_scores` ORDER BY `rev_exam_score` ASC, `student_name` ASC;
+------+-------------+-------------+-------+---------------+-----------------+---------+------+------+-------------+
| id   | select_type | table       | type  | possible_keys | key             | key_len | ref  | rows | Extra       |
+------+-------------+-------------+-------+---------------+-----------------+---------+------+------+-------------+
|    1 | SIMPLE      | exam_scores | index | NULL          | desc_exam_score | 605     | NULL |    6 | Using index |
+------+-------------+-------------+-------+---------------+-----------------+---------+------+------+-------------+
1 row in set (0.00 sec)

-column is obviously a quick operation.

The example table shows how to create a reversed index part for numbers, dates and ASCII 1-char strings. Multi-char ASCII strings are quite easy, too. I don’t know how to create a reversed Unicode column. If you find a way, please leave a comment below.

Enjoy!