January | 2013 | FALSE IS NOT NULL

Note: In this post, I always refer to MariaDB. As far as I know, the techniques and the limitation I’m describing here also apply to Percona Server and MySQL, including 5.6. But since I didn’t test the examples, I just refer to MariaDB (tested on 5.5).

Terminology

A query is a SQL statement that returns some results (SELECT, SHOW, DESCRIBE, EXPLAIN, HELP).

A Stored Program is a Stored Routine, a Trigger or an Event.

A Stored Routine is a Stored Procedure or a Stored Function.

A Dynamic SQL statement is astring composed by a Stored Program, which will be executed as a Prepared Statement. Even within normal Prepared Statements you can insert some values, for example you can “pass” them an ID which will be used in the WHERE clause. But there are many limitations: you can’t compose the ORDER BY clause, or decide what columns should be SELECTed, for example. An example of Dynamic SQL is:

SET @sql = CONCAT('SELECT `', @col_name, '` AS `value` FROM `', @tab_name, '` LIMIT 1;');

The Problem

MariaDB cursors can’t be used with Dynamic SQL. Their syntax is:

DECLARE <cursor_name> CURSOR FOR <query>

But is a hard-coded SQL statement, and can not be a variable. So, if you can’t hard-code your query because it should contain some variable parts, you can’t use a cursor to read the result set.

Hey, wait a moment… can’t you? Well, actually you can. You must write some code and use some dirty tricks, but you can.

The solution (Code)

Here’s the code we will use to implement “Dynamic Cursors”. I will explain it later:

DELIMITER ||
 
 
DROP PROCEDURE IF EXISTS `dyn_lock`;
CREATE PROCEDURE `dyn_lock`(IN `query` TEXT, IN `lock_name` CHAR(64), IN `timeout` TINYINT UNSIGNED)
	NO SQL
	LANGUAGE SQL
	COMMENT 'Get a lock for dyn cursor'
BEGIN
	-- output from GET_LOCK()
	DECLARE lock_res TINYINT UNSIGNED;
 
	-- error in query
	DECLARE CONTINUE HANDLER
		FOR SQLEXCEPTION
		SIGNAL SQLSTATE VALUE '45000' SET
				MESSAGE_TEXT  = '[dyn_lock] Dynamic SQL returned an error';
 
	-- get lock or exit with err
	SET lock_res = GET_LOCK(`lock_name`, IFNULL(`timeout`, 5));
	IF (lock_res IS NULL) THEN
		SIGNAL SQLSTATE VALUE '45000' SET
				MESSAGE_TEXT  = '[dyn_lock] Could not acquire lock: Unknown error';
	ELSEIF (lock_res = 0) THEN
		SIGNAL SQLSTATE VALUE '45000' SET
				MESSAGE_TEXT  = '[dyn_lock] Could not acquire lock: Timeout expired';
	END IF;
 
	-- create dynamic view
	SET @dyn_sql = CONCAT('CREATE OR REPLACE VIEW `', lock_name, '` AS ', query, ';');
	PREPARE stmt_dyn_view FROM @dyn_sql;
	EXECUTE stmt_dyn_view;
	DEALLOCATE PREPARE stmt_dyn_view;
END;
 
 
DROP PROCEDURE IF EXISTS `dyn_cursor`;
CREATE PROCEDURE `dyn_cursor`(IN `query` TEXT, IN `timeout` TINYINT UNSIGNED, OUT x TEXT)
	READS SQL DATA
	LANGUAGE SQL
	COMMENT 'Simulate dynamic cursor'
BEGIN
	-- we assume there is only 1 column to be read;
	-- TEXT is good for any type
	DECLARE val TEXT;
	-- no more rows flag
	DECLARE eof BOOL;
 
	-- select data from a view. only the view is really dynamic
	DECLARE cur_dyn CURSOR FOR
		SELECT * FROM dyn_view;
 
	-- set eof flag
	DECLARE CONTINUE HANDLER
		FOR NOT FOUND
		SET eof = TRUE;
 
	CALL `dyn_lock`(`query`, 'dyn_view', `timeout`);
 
	-- open cursor on view
	OPEN cur_dyn;
 
	-- now that we have the cursor, we can release lock
	DO RELEASE_LOCK('dyn_view');
 	
	lp_dyn: LOOP
		IF eof = TRUE THEN
			LEAVE lp_dyn;
		END IF;
		
		FETCH cur_dyn INTO val;
	END LOOP;
 
	CLOSE cur_dyn;
 
	SET x = val;
END;
 
 
||
DELIMITER ;

Just few notes before explaining this technique:
* The lp_dyn loop, in the dyn_cursor() PROCEDURE, is the part where we do something with the cursor we created. I could have leave out that part because it’s not really part of the general solution, but I needed some code which demonstrates that my solution works. If you leave out that part, then you don’t need the x OUT parameter and the val local variable.
* The user should call dyn_cursor(), not dyn_lock(). The latter is called by the former. This will be explained later.
* dyn_cursor() properly works if you pass a query which only produces 1 column. The last value returned by the query is assigned to the x OUT parameter. SELECT the variable you pass, to check that your query works and the result set has been correctly read by the cursor.
* To test the procedure, you can CALL something like these (call them from different connections to test concurrency):

CALL `dyn_cursor`('SELECT `ENGINE` FROM `information_schema`.`ENGINES`', NULL, @var1);
CALL `dyn_cursor`('SELECT 1 AS `uno` FROM `information_schema`.`TABLES`', NULL, @var2);
CALL `dyn_cursor`('SELECT -;', NULL, @var2);

The solution (Theory)

The technique is quite simple. Here’s the work flow.

The dynamic query is received through the query IN parameter. This happens in my example – please, note that in real-world examples, you could compose that string in the same PROCEDURE. Or it could be a FUNCTION, instead of a PROCEDURE.

We need to acquire a lock, so we call dyn_lock(). It is not the table-level lock you would acquire with a LOCK TABLE statement, but another type of lock. It’s more like a logical named lock entirely handled by four SQL functions (GET_LOCK(), RELEASE_LOCK(), IS_USED_LOCK(), IS_FREE_LOCK()). This kind of locks is interesting, because you may want to acquire locks that are not bound to a table, or you are using techniques that are affected by table locks in a strange way (transactions, triggers, etc). I will not fully explain the mechanism in this post, but here’s what we do. We acquire a lock named 'dyn_view' using GET_LOCK(). If that lock has been acquired by another connection, we wait for the number of seconds specified as `timeout` parameter. If `timeout` is NULL, the default value we decide is 5 (please, use the value you think is most appropriate in your situation). We could define two default values, one in dyn_cursor() and one in dyn_lock(). Later, the reason for this will be clear.

If the lock can not be acquired, dyn_lock() throws an error. The SQLSTATE '45000' is the one suggested by the MySQL documentation for user errors. The message is not complete, because this is just an example. But it should contain the name of the Stored Routine that thrown it, if you want make debug easier.

Then, dyn_lock() creates a view with the specified name. And then the execution exits from dyn_lock(). Here we open the cursor (alas, this can’t be done within dyn_lock() because cursors exist at a Routine level), and we release the lock using RELEASE_LOCK().

Let’s talk about the view. Its AS clause is the statement passed through `query` parameter. So, doing a SELECT * on that view is like directly running `query`. The performances are the same, or almost the same. The advantage is that SELECT * FROM `dyn_view` can be hard-coded in a cursor definition (we did it in dyn_cursor()), but `query` is a string that can be generated dynamically.

Views can not be temporary in MariaDB/MySQL (they can in Postgres). This means that when you create/replace a view, other connections could use it. In our case, we must prevent a concurrency like this:
1) A creates `dyn_view`;
2) Just 0.0000001 seconds after, B replaces `dyn_view`;
3) A executes the query on `dyn_view`, knowing not that it has been replaced.
To avoid this, we use the lock I described.

We want to release the lock as soon as possible, to increase performances. We can do this just after opening the cursor. It will work on a query, not on the view itself – so, we don’t need it anymore, and can allow other connections to replace it. In fact, the view is always created (if it doesn’t still exists) or replaced. It is never drop, to speed up things.

It is important that the view and the lock have the same name and are created within dyn_lock(). This way, dyn_lock() contains (part of) the generalized code. Not only dyn_cursor(), but also other Stored Routines, can call dyn_cursor() to get a lock and a view with a given name. If several Routines call dyn_cursor() passing the the same name, they can’t be executed at the same time. For the sake of brevity, I will not show here this case here, but it can be useful. If different Routines call dyn_cursor() with different names, they can be executed at the same time. But still two connections can not execute the same Routine at the same time. However, it is important that Routines release the lock name they acquired as soon as possible (again, see RELEASE_LOCK()).

To Do (or not To Do)

GET_LOCKS() accepts a timeout in seconds, not microseconds. For short queries, you may want to specify a value like 0.5. Or 1.5. If this limitation is a problem for you, you may consider to re-implement dyn_lock() and write an unlocking PROCEDURE. You could acquire locks by inserting your CONNECTION_ID() into a MEMORY (but not temporary!) table, and release the lock by deleting that ID. To check if a lock has already been acquired by someone else, you could use a SELECT ... FOR UPDATE. But there is a complication: a query could be KILLed before it releases its lock. So, you also need to write in the table a TIMESTAMP that indicates when the lock has been acquired. After a given timeout (which can be constant or variable), the lock expires.

If you re-implement the locking system, and you have a high concurrency, you may also consider to use more than one view. The SELECTs on the view must be hard-coded, so the number of allowed views and their names must be hard-coded too. But you could allow 3 view called dyn_view1, dyn_view2, dyn_view3, and create 3 locks with the same names in the MEMORY table. But if you use the GET_LOCK() and RELEASE_LOCK() functions, you need to use only 1 lock, and thus only 1 view.

Please, note that re-implementing the locking system will improve concurrency and timeout granularity, but will add complexity, and thus should reduce the performances in an environment where concurrency is almost irrilevant.

For the sake of brevity, errors generated by the Dynamic SQL are not handled here. But you should handle them.

Limitations and Feature Requests for MariaDB

* WARNING: because of a bug, in MySQL you must avoid to pass negative numbers to GET_LOCK() as timeout, because it is an endless timeout. In MariaDB, this only happens if you pass very low negative numbers (MDEV-4017).
* First of all: if MariaDB adds support for cursors on dynamic SQL statements, all this code will no longer be necessary.
* LIMITATION: This technique can not be used recursively – that is, the query you pass to dyn_cursor() can not contain a call to dyn_cursor(). This is because dyn_lock() uses a Prepared Statement. MariaDB/MySQL’s Prepared Statements exist at a connection-level, and are not recursive. I hope that this in the future we will have a way to implement this type of recursivity.
* #MDEV-4028 GET_LOCK() should accept a timeout argument with granularity in microseconds.
* Dynamic SQL could generate errors or warnings. In that case, it would be useful to get an error message specifying the name of the Prepared Statement and the Stored Routine that thrown it.
* If a Stored Routine could return a cursor or a result set, we could move the OPEN and DO RELEASE_LOCK() lines into dyn_lock(). If we could do this, the code would be easier to maintain.

I use a lot of Stored Programs, and I am working (when I have the time to do it) on a couple of libraries I would like to release publicly. But there are strong limitations on what you can do with Routines, at the moment. I will write a post about what features I think that could make a developer’s life easier. Maybe some of them seem useless, until one starts to use Routines in real life – then, you understand that they could help you writing better (mantainable, general, faster) code. And maybe some of them are not so difficult to implement… at least, I hope so.

See also:

MariaDB/MySQL Cursors: a brief Tutorial

Enjoy!

FALSE IS NOT NULL

Yet Another Blog About Programming (YABAP)

Monthly Archives: January 2013

What if root can’t CREATE or DROP a user?

MariaDB/MySQL: Cursors for Dynamic SQL