How MariaDB makes Stored Procedures usable

I already wrote how MariaDB makes the debug of stored procedures much easier via the SQL Error Log. But I recently found out that MariaDB provides a usable workaround for some big limitations of their procedural SQL.

First, SELECT is not the only SQL statement which returns a resultset. Other examples are DELETE RETURNING, CALL, SHOW, EXPLAIN and administrative commands like ANALYZE TABLE or CHECK TABLE. But these commands cannot be used in place of SELECT in the following contexts:

  • Subqueries, derived tables, JOINs, UNIONs
  • CREATE TABLE ... SELECT
  • INSERT/REPLACE INTO ... SELECT
  • CREATE VIEW ... AS SELECT
  • DECLARE CURSOR ... SELECT
  • SET @my_var := SELECT

This means that there is no way to read the output of these statements within a stored procedure or trigger.

Also, some statements cannot be executed in a stored procedures and/or in prepared statements. An example is CHECK TABLES: even if you don’t care about the resultset (which would be a serious limitation), you cannot execute it in a stored procedure or in a prepared statements.

MariaDB and MySQL support the DO statement, which is identical to SELECT except that it doesn’t return any result. However, stored procedures could return one or more resultsets too, and you cannot invoke them with DO. This means that in some contexts you may have to deal to annoying resultsets, or perhaps you cannot call those procedures (within triggers or functions).

I faced all these problems while developing STK/Unit and in other projects.

But MariaDB lets us workaround these limitations – and the funny thing is that probably its developers are not aware about this! Here is an example:

CREATE OR REPLACE TABLE show_master_status
	ENGINE = CONNECT
	TABLE_TYPE = MYSQL
	SRCDEF = 'SHOW MASTER STATUS'
	CONNECTION = '_'
;

DELIMITER ||
CREATE PROCEDURE test()
BEGIN
	DECLARE eof BOOL DEFAULT FALSE;
	DECLARE v_file, v_position TEXT;
	
	DECLARE cur CURSOR FOR
		SELECT `File`, `Position` FROM show_master_status;
	
	DECLARE CONTINUE HANDLER
		FOR NOT FOUND
	BEGIN
		SET eof := TRUE;
	END;
	
	OPEN cur;
	`lbl_loop`:
	LOOP
		FETCH cur INTO v_file, v_position;
		IF eof IS TRUE THEN
			LEAVE lbl_loop;
		END IF;
		
		SELECT CONCAT('I''ll do something with ', v_file, v_position);
	END LOOP;
	CLOSE cur;
END ||
DELIMITER ;

MariaDB [_]> CALL test(); 
+--------------------------------------------------------+ 
| CONCAT('I''ll do something with ', v_file, v_position) | 
+--------------------------------------------------------+ 
| I'll do something with binlog.00000242965              | 
+--------------------------------------------------------+ 
1 row in set (0.01 sec) 

Query OK, 0 rows affected (0.01 sec)

As you can see, with this trick I was able to use a cursor with SHOW MASTER STATUS.

How does the trick work? If you know CONNECT, probably you already guessed. CONNECT is a storage engine which allows the users to read data from several kind of data sources as if they were regular SQL tables. Supported data sources include several file formats, remote DBMS’s and more. In this case I defined a table which connects to a MariaDB/MySQL server (TABLE_TYPE=MYSQL). The server is… localhost. Since I specified a SRCDEF table option, the data source is not a table in the server, it is the resultset of the specified query (SHOW MASTER STATUS). This means that, when the table is queried, the CONNECT engine opens a connection to the local server, it executes SHOW MASTER STATUS, and it returns the results of that statement. But since this work is done by the CONNECT engine, from the MariaDB point of view we are just querying a table. So, we can use that query in all contexts where SELECT works.

Also note that we don’t have to specify the table structure. CONNECT supports table discovery, which means that it automatically knows which columns and types are needed.

Confused? I hope not. Look again at the example – everything will be clear.

Of course, having to CREATE a TABLE each time is not very comfortable, even with table discovery. So I wrote a stored procedure which does the boring work for us:

UPDATE:
I removed the code because it was buggy triggered MariaDB bugs. See this post and this GitHub repository for working code.

Enjoy!

Advertisements

MariaDB/MySQL: Making ENGINE clause mandatory

I got this idea from a Valerii Kravchuk’s MySQL bug report:

http://bugs.mysql.com/bug.php?id=71978

In theory, I completely agree that MySQL and forks should not allow us to set a default storage engine which cannot be used to create a table. You can see the same with MariaDB’s SEQUENCE. The MySQL & forks philosophy seems to be: ignore your mistakes, so you can repeat them forever. Which can turn a mistype into a major data loss.

Unless you only use InnoDB and your magic powers tell you that this will never change, the ENGINE clause should be mandatory in your MySQL installation. Since there is no clean way to make it mandatory, setting a “weird” storage engine as default seems to be a decent workaround. I don’t like it, but it can prevent human mistakes.

MariaDB [test]> SET SESSION default_storage_engine = 'performance_schema';
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> CREATE TABLE x (id INT) ;
ERROR 1005 (HY000): Can't create table `test`.`x` (errno: 131 "Command not supported by database")
Error (Code 1005): Can't create table `test`.`x` (errno: 131 "Command not supported by database")
Error (Code 1683): Invalid performance_schema usage.

Enjoy!

MariaDB storage engines

This is a list of MariaDB storage engines that are not distributed with MySQL. I think that most of them will work with MySQL, but not all – at least CassandraSE doesn’t.

Engine Description Introduced
XtraDB A fully-compatible fork of InnoDB, mantained by Percona Big Bang
Aria A crash-safe MyISAM, also used for internal temptables Big Bang
TokuDB A transactional engine with innovative buffers and high compression, by TokuTek 10.0
mroonga Supports fulltext searches with Chinese, Japanese and Korean languages 10.0
SPIDER Shards tables through multiple servers 10.0
OQGRAPH Used to implement tree structures 10.0
Sequence Returns a sequence of numbers as a resultset 10.0
CONNECT Support several external data sources (data files, DBMS’s…) 10.0
CassandraSE A bridge to Apache Cassandra 10.0
SphinxSE A bridge to Sphinx 5.2
FederatedX A richer fork of Federated Big Bang

Notes:

  • XtraDB is the default and should be preferred to InnoDB, unless it has a bug which affects you (but XtraDB fixes several InnoDB bugs).
  • OQGRAPH was introduced during the Big Bang or soon after, but disabled in 5.5 and reintroduced in 10.0. See this post.
  • FederatedX is called Federated, so it’s possible that you are already using it and you don’t know.
  • CONNECT will hopefully obsolete CSV and FederatedX, but some features are still missing. For example, CONNECT cannot store NULL values in flat files.
  • With MariaDB you need to specify SQL_MODE='NO_ENGINE_SUBSTITUTION'. If you mistype the engine name, or try to use a disabled engine, you want to get an error, rather than silently switch to XtraDB.

Obsolete storage engines:

PBXT (Big Bang – 5.3) – A transactional engine which aimed to be “somewhere between InnoDB and MyISAM”. Seemed to support foreign keys better than InnoDB (I couldn’t reproduce a tirgger+FK bug and it had more options), but alas it’s dead.

Enjoy!

MariaDB’s Sequence Storage Engine

Versione italiana

MariaDB 10.0.3 introduces a new Storage Engine: Sequence. It isn’t in MySQL or in older MariaDBs versions. Sequence is a special engine, which does not create or read any table. It only generates on the fly a sequence of integers which is returned to your query, and then dies. The sequence’s bounds and increment depend from the table name.

Very quick start

To install:

INSTALL PLUGIN sequence SONAME 'ha_sequence';

NOTE: On MariaDB (unlike Oracle MySQL) there is no need to add ‘.dll’ to the command on Windows, so the command is platform-independent. Thanks Vladislav Vaintroub for this information!

To create a sequence, you must first select a default database, using the USE command. Then issue this query:

SELECT * FROM seq_1_to_100;

To create a sequence specifying an increment different from 1:

SELECT * FROM seq_1_to_100_step_2;

A sequence can’t involve negative numbers, but can be descending:

SELECT * FROM seq_10_to_1;

Please, note that descending sequences always generate the same number as ascending sequence, in reverse order. This means that the highest number of a descending sequence sometimes is not the one you typed:

MariaDB [test]> SELECT * FROM seq_8_to_1_step_3;
+-----+
| seq |
+-----+
|   7 |
|   4 |
|   1 |
+-----+
3 rows in set (0.00 sec)

What sequence is not

Sequence Storage Engine is not an SQL SEQUENCE like the ones you can use in PostgreSQL or IBM, or FirebirdSQL’s generators. This means that a sequence exists only during the execution of a query. You cannot use a value from a sequence in a query and then use the next value in another query.

Also, sequence can’t generate negative numbers and can’t rotate when it reaches the maximum/minimum value (like they do in PostgreSQL).

Some sequence use cases

Other people don’t believe it, but developers have a lot of fantasy. It is needed to solve problems when there isn’t an obvious solution, and can’t find an algorithm designed by someone else. So, I’m sure that creative uses of the sequence Storage Engine will be found in the future. But for now, I’d like to share some ideas.

Almost all the following techniques can be used to populate tables with generated data. This can be useful, because when it can be done using some queries, you avoid network communications – which can be slow or error prone.

Find holes in a column

Sometimes a column should contain a sequence of integer values and holes shouldn’t be there. A common example is a column used to sort some elements in a web page: while holes shouldn’t be a problem for queries, they can make re-sorting operations buggy. In the following example we’ll create a table with a lot of holes, and then we’ll list the missing values using a sequence:

CREATE TABLE t1 (c TINYINT UNSIGNED) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1), (2), (3), (5), (10);
SELECT s.seq FROM seq_1_to_10 s LEFT JOIN t1 t ON s.seq = t.c WHERE t.c IS NULL;

Build a combination of numbers

SELECT s1.seq, s2.seq FROM seq_1_to_3 s1 JOIN seq_1_to_3 s2 ORDER BY 1, 2;

But you can also combinate numbers and characters (see below).

Find multiples of 3, minor than 100

SELECT seq FROM seq_3_to_100_step_3;

Find multiples of both 3 and 5

SELECT s1.seq FROM seq_5_to_100_step_5 s1 INNER JOIN seq_3_to_100_step_3 s2 ON s1.seq = s2.seq;

Optimized: find multiples of (2 or one of its powers)

SELECT seq FROM seq_1_to_100 WHERE NOT seq & 1; -- ...of 2
SELECT seq FROM seq_1_to_100 WHERE NOT seq & 3; -- ...of 4
SELECT seq FROM seq_1_to_100 WHERE NOT seq & 7; -- ...of 8
SELECT seq FROM seq_1_to_100 WHERE NOT seq & 15; -- ...of 16
...

Build a sequence of 1-char strings

The following query includes lowercase chars, uppercase chars and digits. But I’m sure that anyone can modify it to exlude one of these groups, or include more groups (for example, accented vowels).

-- Sequence of 1-char strings
SELECT CHAR(seq) AS ch
    FROM (
                -- lowercase
                (SELECT seq FROM seq_97_to_122 l)
            UNION
                -- uppercase
                (SELECT seq FROM seq_65_to_90 u)
            UNION
                -- digits
                (SELECT seq FROM seq_48_to_57 d)
        ) ch;

Build a sequence of 2-char strings

Ouch… this UNION is tricky. But someone could still use it.

SELECT CONCAT(ch1.ch1, ch2.ch2) AS ch
    FROM (
        (SELECT CHAR(seq) AS ch1
            FROM (
                        -- lowercase
                        (SELECT seq FROM seq_97_to_122 l1)
                    UNION
                        -- uppercase
                        (SELECT seq FROM seq_65_to_90 u1)
                    UNION
                        -- digits
                        (SELECT seq FROM seq_48_to_57 d1)
                ) s1
        )
    ) ch1
    CROSS JOIN (
        (SELECT CHAR(seq) AS ch2
            FROM (
                        -- lowercase
                        (SELECT seq FROM seq_97_to_122 l2)
                    UNION
                        -- uppercase
                        (SELECT seq FROM seq_65_to_90 u2)
                    UNION
                        -- digits
                        (SELECT seq FROM seq_48_to_57 d2)
                ) s2
        )
    ) ch2
    ORDER BY ch1, ch2;

Sorry for the aliases confusion (particularly ch1 and ch2). I ran out of fantasy before finishing the query.

Sequence of dates

This is interesting, in my opinion. Many booking applications, during the installation, populate a table with days or hours when a reservation could start. This example shows a simple way to fill a table with all days in January:

SELECT DATE ('2014.01.01' + INTERVAL (s.seq - 1) DAY) AS d
    FROM (SELECT seq FROM seq_1_to_30) s;

Sequence of hours, halfes of an hour, etc

-- Hours in a day
SELECT CAST('00:00:00' AS TIME) + INTERVAL (s.seq - 1) HOUR AS t
    FROM (SELECT seq FROM seq_1_to_24) s;
-- Halfes of an hour in a day
SELECT CAST('00:00:00' AS TIME) + INTERVAL (30 * s.seq) MINUTE AS t
    FROM (SELECT seq FROM seq_1_to_48) s;

Working days

The last two examples are not so useful, alone. Usually you don't need to fill a table with all days in a month/year, or all hours in a day. Instead, you only need to write working days and working hours. To find working days, you can use DAYOFWEEK() function to exclude saturdays and sundays. Of course this doesn't automatically exclude christmas or other holidays - but I think that the best solution is deleting them after filling the table.

SELECT DATE ('2014-01-01' + INTERVAL (s.seq - 1) DAY) AS d
    FROM (SELECT seq FROM seq_1_to_30) s
    -- exclude sunday (1) and saturday (7)
    WHERE DAYOFWEEK(DATE ('2014-01-01' + INTERVAL (s.seq - 1) DAY)) BETWEEN 2 AND 6;

Just a note on implementation. Why do I start dates from day 1 and then I perform a - 1, rather than directly starting from day 0? For two reasons. First, in Setting a strict SQL_MODE article, I explain that the NO_ZERO_DATE and NO_ZERO_IN_DATE flags in SQL_MODE should be set. Second: '2014-01-00' + INTERVAL 1 DAY returns NULL, and it is the correct behavior, not a bug. If you don't understand the reason (but also if you understand it), please set those flags. Permissive SQL_MODEs pretend to be your friends, but one of these days they will kill you!

Working hours

SELECT {dt '2013-01-01 01:00:00'} + INTERVAL (wd.d - 1) DAY + INTERVAL (dh.h - 1) HOUR AS wh
    FROM
        (
            -- working days in a month
            SELECT seq AS d FROM seq_1_to_30
        ) wd
    CROSS JOIN
        (
            -- daily working hours
            (SELECT seq AS h FROM seq_9_to_12)
            UNION
            (SELECT seq AS h FROM seq_14_to_17)
        ) dh
    ORDER BY 1;

Here we have one subquery returning the working days, JOINed with a subquery returning working hours. Note that working hours have a hole (workers lunch), so the second subquery is a UNION of 2 sequences.

And... yes, carthesian product is evil, but here it's exactly what we need. The CROSS keyword makes clearer that if you modify the query and make a mistake, you could create a SELECT returning a big number of rows.

To-Do and suggestions

In a future article, I will show a Stored Function which dynamicall composes a query to populate a table with DATETIME values, in a given time range, with a given granularity.

Do you think that one of these query is not well-optimized? Do you have other interesting ideas which involve the sequence Storage Engine? Please, comment this article.

Enjoy!