MariaDB’s Sequence Storage Engine

Versione italiana

MariaDB 10.0.3 introduces a new Storage Engine: Sequence. It isn’t in MySQL or in older MariaDBs versions. Sequence is a special engine, which does not create or read any table. It only generates on the fly a sequence of integers which is returned to your query, and then dies. The sequence’s bounds and increment depend from the table name.

Very quick start

To install:

INSTALL PLUGIN sequence SONAME 'ha_sequence';

NOTE: On MariaDB (unlike Oracle MySQL) there is no need to add ‘.dll’ to the command on Windows, so the command is platform-independent. Thanks Vladislav Vaintroub for this information!

To create a sequence, you must first select a default database, using the USE command. Then issue this query:

SELECT * FROM seq_1_to_100;

To create a sequence specifying an increment different from 1:

SELECT * FROM seq_1_to_100_step_2;

A sequence can’t involve negative numbers, but can be descending:

SELECT * FROM seq_10_to_1;

Please, note that descending sequences always generate the same number as ascending sequence, in reverse order. This means that the highest number of a descending sequence sometimes is not the one you typed:

MariaDB [test]> SELECT * FROM seq_8_to_1_step_3;
+-----+
| seq |
+-----+
|   7 |
|   4 |
|   1 |
+-----+
3 rows in set (0.00 sec)

What sequence is not

Sequence Storage Engine is not an SQL SEQUENCE like the ones you can use in PostgreSQL or IBM, or FirebirdSQL’s generators. This means that a sequence exists only during the execution of a query. You cannot use a value from a sequence in a query and then use the next value in another query.

Also, sequence can’t generate negative numbers and can’t rotate when it reaches the maximum/minimum value (like they do in PostgreSQL).

Some sequence use cases

Other people don’t believe it, but developers have a lot of fantasy. It is needed to solve problems when there isn’t an obvious solution, and can’t find an algorithm designed by someone else. So, I’m sure that creative uses of the sequence Storage Engine will be found in the future. But for now, I’d like to share some ideas.

Almost all the following techniques can be used to populate tables with generated data. This can be useful, because when it can be done using some queries, you avoid network communications – which can be slow or error prone.

Find holes in a column

Sometimes a column should contain a sequence of integer values and holes shouldn’t be there. A common example is a column used to sort some elements in a web page: while holes shouldn’t be a problem for queries, they can make re-sorting operations buggy. In the following example we’ll create a table with a lot of holes, and then we’ll list the missing values using a sequence:

CREATE TABLE t1 (c TINYINT UNSIGNED) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1), (2), (3), (5), (10);
SELECT s.seq FROM seq_1_to_10 s LEFT JOIN t1 t ON s.seq = t.c WHERE t.c IS NULL;

Build a combination of numbers

SELECT s1.seq, s2.seq FROM seq_1_to_3 s1 JOIN seq_1_to_3 s2 ORDER BY 1, 2;

But you can also combinate numbers and characters (see below).

Find multiples of 3, minor than 100

SELECT seq FROM seq_3_to_100_step_3;

Find multiples of both 3 and 5

SELECT s1.seq FROM seq_5_to_100_step_5 s1 INNER JOIN seq_3_to_100_step_3 s2 ON s1.seq = s2.seq;

Optimized: find multiples of (2 or one of its powers)

SELECT seq FROM seq_1_to_100 WHERE NOT seq & 1; -- ...of 2
SELECT seq FROM seq_1_to_100 WHERE NOT seq & 3; -- ...of 4
SELECT seq FROM seq_1_to_100 WHERE NOT seq & 7; -- ...of 8
SELECT seq FROM seq_1_to_100 WHERE NOT seq & 15; -- ...of 16
...

Build a sequence of 1-char strings

The following query includes lowercase chars, uppercase chars and digits. But I’m sure that anyone can modify it to exlude one of these groups, or include more groups (for example, accented vowels).

-- Sequence of 1-char strings
SELECT CHAR(seq) AS ch
    FROM (
                -- lowercase
                (SELECT seq FROM seq_97_to_122 l)
            UNION
                -- uppercase
                (SELECT seq FROM seq_65_to_90 u)
            UNION
                -- digits
                (SELECT seq FROM seq_48_to_57 d)
        ) ch;

Build a sequence of 2-char strings

Ouch… this UNION is tricky. But someone could still use it.

SELECT CONCAT(ch1.ch1, ch2.ch2) AS ch
    FROM (
        (SELECT CHAR(seq) AS ch1
            FROM (
                        -- lowercase
                        (SELECT seq FROM seq_97_to_122 l1)
                    UNION
                        -- uppercase
                        (SELECT seq FROM seq_65_to_90 u1)
                    UNION
                        -- digits
                        (SELECT seq FROM seq_48_to_57 d1)
                ) s1
        )
    ) ch1
    CROSS JOIN (
        (SELECT CHAR(seq) AS ch2
            FROM (
                        -- lowercase
                        (SELECT seq FROM seq_97_to_122 l2)
                    UNION
                        -- uppercase
                        (SELECT seq FROM seq_65_to_90 u2)
                    UNION
                        -- digits
                        (SELECT seq FROM seq_48_to_57 d2)
                ) s2
        )
    ) ch2
    ORDER BY ch1, ch2;

Sorry for the aliases confusion (particularly ch1 and ch2). I ran out of fantasy before finishing the query.

Sequence of dates

This is interesting, in my opinion. Many booking applications, during the installation, populate a table with days or hours when a reservation could start. This example shows a simple way to fill a table with all days in January:

SELECT DATE ('2014.01.01' + INTERVAL (s.seq - 1) DAY) AS d
    FROM (SELECT seq FROM seq_1_to_30) s;

Sequence of hours, halfes of an hour, etc

-- Hours in a day
SELECT CAST('00:00:00' AS TIME) + INTERVAL (s.seq - 1) HOUR AS t
    FROM (SELECT seq FROM seq_1_to_24) s;
-- Halfes of an hour in a day
SELECT CAST('00:00:00' AS TIME) + INTERVAL (30 * s.seq) MINUTE AS t
    FROM (SELECT seq FROM seq_1_to_48) s;

Working days

The last two examples are not so useful, alone. Usually you don't need to fill a table with all days in a month/year, or all hours in a day. Instead, you only need to write working days and working hours. To find working days, you can use DAYOFWEEK() function to exclude saturdays and sundays. Of course this doesn't automatically exclude christmas or other holidays - but I think that the best solution is deleting them after filling the table.

SELECT DATE ('2014-01-01' + INTERVAL (s.seq - 1) DAY) AS d
    FROM (SELECT seq FROM seq_1_to_30) s
    -- exclude sunday (1) and saturday (7)
    WHERE DAYOFWEEK(DATE ('2014-01-01' + INTERVAL (s.seq - 1) DAY)) BETWEEN 2 AND 6;

Just a note on implementation. Why do I start dates from day 1 and then I perform a - 1, rather than directly starting from day 0? For two reasons. First, in Setting a strict SQL_MODE article, I explain that the NO_ZERO_DATE and NO_ZERO_IN_DATE flags in SQL_MODE should be set. Second: '2014-01-00' + INTERVAL 1 DAY returns NULL, and it is the correct behavior, not a bug. If you don't understand the reason (but also if you understand it), please set those flags. Permissive SQL_MODEs pretend to be your friends, but one of these days they will kill you!

Working hours

SELECT {dt '2013-01-01 01:00:00'} + INTERVAL (wd.d - 1) DAY + INTERVAL (dh.h - 1) HOUR AS wh
    FROM
        (
            -- working days in a month
            SELECT seq AS d FROM seq_1_to_30
        ) wd
    CROSS JOIN
        (
            -- daily working hours
            (SELECT seq AS h FROM seq_9_to_12)
            UNION
            (SELECT seq AS h FROM seq_14_to_17)
        ) dh
    ORDER BY 1;

Here we have one subquery returning the working days, JOINed with a subquery returning working hours. Note that working hours have a hole (workers lunch), so the second subquery is a UNION of 2 sequences.

And... yes, carthesian product is evil, but here it's exactly what we need. The CROSS keyword makes clearer that if you modify the query and make a mistake, you could create a SELECT returning a big number of rows.

To-Do and suggestions

In a future article, I will show a Stored Function which dynamicall composes a query to populate a table with DATETIME values, in a given time range, with a given granularity.

Do you think that one of these query is not well-optimized? Do you have other interesting ideas which involve the sequence Storage Engine? Please, comment this article.

Enjoy!

Advertisement

8 thoughts on “MariaDB’s Sequence Storage Engine

  1. Hello! If you want to use the final count value as a variable parameter, use something like this:

    … FROM (SELECT seq FROM seq_0_to_999999 WHERE seq < 30752) s

    It works well fast, like sets of static strings. Yes, the design is so-so, but usable.

  2. Pingback: FIlling in missing rows in aggregate reports using cardinal numbers

  3. This seems like a really ugly workaround for the fact that MySQL/MariaDB doesn’t support functions that can return a result set.

  4. This is incredibly useful. I’ve always kept a table NUMBERS that I initialized with 0 to 65535, to serve this purpose.
    I assume SEQUENCE is more efficient, although I wonder how it interacts with the query optimizer?

    • Yes, it’s faster because you don’t need to access disk. As far as I understand, it doesn’t need to interact with the optimizer; do you have any example? If you have, I can check for you.

  5. Pingback: MariaDB 10.0 Beta launched – an important milestone « The MariaDB Blog

Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s