Setting a strict SQL_MODE

Posted on 2013/03/10 by Federico Razzoli

Developing Stored Routines, Triggers or Events for MariaDB and MySQL is not easy, because the language is not flexible and has a lot of limitations. Plus, the server tries to make things easier by hiding out errors and allowing bad practice. In my opinion, when you have to write at least 3 lines, bad practices make things esponentially harder. When the server complains you did something wrong, the problem is not in the server: it is in your code! And it remains even if you tell the server to shut down. The only solution is finding the problem and modifying your code. So, I suggest you make the server stricter. To do this, you have to modify the SQL_MODE variable.

The SQL_MODE can be used to change the SQL syntax, and make it more compatible with another DBMS. I don’t like to use syntaxes which are not the de-facto standard, but I think it is harmless.

But some flags should always be set, because they prevent the server to be too much flexible.

ERROR_FOR_DIVISION_BY_ZERO
Numbers can not be divided by zero, because the result is undefined. When you try to divide a number by zero, languages usually raise an error. And so does MariaDB, if this flag is set. Otherwise, it returns NULL. As NULL is logically correct when you want to represent an undefined value, this may be good for you. But usually it isn’t – if you are dividing a number by zero by mistake, you want to know and correct the bug!

NO_AUTO_CREATE_USER
When you GRANT rights to a user which doesn’t exist, it is created automatically, unless this flag is set. It’s not a good thing for security that you can create a user by mistake.

NO_AUTO_VALUE_ON_ZERO
This flag allowes you to insert a 0 value into an AUTO_INCREMENT field. Since 0 is a legal value, this flag should be set. But I don’t care too much about it, because changing an AUTO_INCREMENT field is a bad practice too.

NO_ENGINE_SUBSTITUTION
When you create a table, it is important to choose the right Storage Engine. If this flag is not set, and you specify a Storage Engine which doen’t exist, the server silently uses the default.

NO_ZERO_DATE
If this flag is not set, MariaDB accepts a special date value: ‘0000-00-00’. If you want to use it, you don’t need to set this flag; but it’s quite a rare situation. Otherwise it’s only a source of problems, because invalid dates are silently converted to a zero-date. Which is almost never what you want.

NO_ZERO_IN_DATE
This flag prevents non-zero dates which have zero-parts, like ‘0000-05-20’. It is a bit harder to do this by mistake, but it can happen if you compose a date string dinamically.

ONLY_FULL_GROUP_BY
When you specify a non-aggregate and non-grouped column in the SELECT clause, the result is undefined. For this reason an error should be raised, but this only happens if this flag is set.

STRICT_ALL_TABLES
This flag is very important, because if it is not set, no error is raised when you try to assign an out-of-range value (or a string which is too long). It works not only for tables, but also for variables and functions. For example, if a TINYINT SIGNED function tries to return 200, no error is raised (just a warning), and the return value is 127.

STRICT_TRANS_TABLES
See STRICT_ALL_TABLES, but this flag only works for transactional tables.

So, here’s the SQL_MODE I decided to use when I create a Stored Routine for the STK proects:

SET @@session.SQL_MODE = 'ERROR_FOR_DIVISION_BY_ZERO,NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY,STRICT_ALL_TABLES,STRICT_TRANS_TABLES';

Here you can find a complete list of all MySQL SQL_MODE flags.

Setting the InnoDB’s Strict Mode is also a good idea, but it’s out of the purpose of this post.

Enjoy!

Validating data using a TRIGGER

Posted on 2012/11/06 by Federico Razzoli

Versione italiana

I wrote a post about how to emulate the CHECK clause in MariaDB and MySQL using a VIEW. Here is a different approach: this time, we will use TRIGGERs to reach the same goal. And then we will discuss which approach is better.

Let’s CREATE what we need

Let’s CREATE a simple table:

-- make sure that `person` can be created
DROP VIEW   IF EXISTS `person`;
DROP TABLE  IF EXISTS `person`;

-- table whose data must be validated
CREATE TABLE `person`
(
	`id`  mediumint UNSIGNED NOT NULL AUTO_INCREMENT,
	`name`  char(50) NOT NULL,
	`email`  char(80) NOT NULL,
	PRIMARY KEY (`id`)
)
	ENGINE = Aria;

`person` contains a Primary Key, a field called `name`, and `email`. We will just care about the `email`. We want to validate it using a LIKE pattern. But how?

Let’s CREATE the TRIGGER

TRIGGERs can be set on the base DML operations (INSERT, UPDATE, DELETE). If we want to be sure that data are valid, we should use both INSERT and UPDATE TRIGGERs, but for brevity here I will just show the INSERT TRIGGER. TRIGGERs can be executed BEFORE or AFTER the operation they are associated to; of course, if we want to prevent invalid data to be written, we need the BEFORE option.

-- validate email
CREATE TRIGGER `person_validate_insert`
	BEFORE INSERT
	ON `person`
	FOR EACH ROW
BEGIN
	IF NEW.`email` NOT LIKE '%_@%_.__%' THEN
		SIGNAL SQLSTATE VALUE '45000'
			SET MESSAGE_TEXT = '[table:person] - `email` column is not valid';
	END IF;
END;

TRIGGERs can’t have a comment, so I always try to use descriptive names. The first word is person, which is the name of the table; then, I explain what the TRIGGER does: ‘validate insert’.

The TRIGGER checks if the new `email` is valid. If it is, execution continues and the INSERT can be performed. But if it is not, an error is thrown. This stops the execution, and the INSERT will not be performed.

Just a note about the error. The SQLSTATE is ‘45000’, as suggested by MySQL’s documentation – user-defined errors should set SQLSTATE to ‘45000’. The message starts with an info wrapped between the [square brackets]. When I throw an error from a Stored Routine, I write the name of the routine; when I throw it from a TRIGGER, I write the table name. Then, there is a descriptive error message. “Descriptive” means that the user is told what he was trying to do (INSERT) and why the operation aborted (`email` column is not valid).

Let’s test!

I assume that everything I don’t test (if it can be tested), simply does not work. So, let’s INSERT a valid row:

MariaDB [test]> INSERT INTO `person`
    ->          (`name`, `email`)
    ->  VALUES
    ->          ('Emiliano Zapata', 'zapata@revolucion.mx');
Query OK, 1 row affected (0.59 sec)

Great! Now, let’s see what happens when we try to INSERT invalid data:

MariaDB [test]> INSERT INTO `person`
    ->          (`name`, `email`)
    ->  VALUES
    ->          ('John Doe', 'misterdoe@nowhere');
ERROR 1644 (45000): [table:person] - `email` column is not valid

Great! That’s exactly the error we expected to see.

What is the best technique?

Now that I discussed two different techniques (read how to use a VIEW), I am supposed to explain which one is better. But I’ll just write some notes, that should clarify the pro’s and the con’s of each method:

* If you are using MariaDB 5.3 or older, or MySQL 5.4 or older, the SIGNAL statement is not avaible. You can use a SELECT on a table which does not exists, which is a bit less clean (you know why you see that error, but it can be confusing for other developers).
* Hosting companies often do not allow you to use TRIGGERs – in that case, you can only use the VIEW technique.
* The INSERT/UPDATE‘s performance should not be notably affected by this choice. DELETE and SELECT performance may be a little bit slower for the VIEW method.
* The TRIGGER method needs 2 different triggers (on insert, on update; maybe on delete, if you want to check data on other tables to ensure database’s consinstency). The VIEW method needs a view and renamed table. The second option is probably more convenient… but this is TRUE for me, and may be FALSE for you. (a boolean way to say that it’s subjective)
* This also means that the TRIGGER method requires duplicated code (same statements in INSERT and in UPDATE TRIGGERs). You can avoid this issue by using a Stored Routine, but it is slower.

Since there is not an important difference in performances or ease of mantainence of these methods, the choice is up to you.

Enjoy!

Emulating the CHECK clause in MariaDB

Posted on 2012/10/10 by Federico Razzoli

Versione italiana

MariaDB (like MySQL and its forks) does not have a CHECK clause in the CREATE TABLE statement. But you can emulate it using VIEWs.

Data Definition

First, let’s create a table:

CREATE TABLE `t_person`
(
	`id`       INTEGER UNSIGNED NOT NULL AUTO_INCREMENT ,
	`name`     CHAR(30) NOT NULL ,
	`surname`  CHAR(30) NOT NULL ,
	`email`    CHAR(50) NOT NULL ,
	PRIMARY KEY (`id`)
);

The table is ready to store our data, but we need to check that data is valid. When new row is inserted, we want to check that `surname` contains at least 2 characters and `email` contains ‘@’ and ‘.’ characters.

So, let’s create a view that would show only data that follow these simple rules:

CREATE VIEW `person`
	AS SELECT * FROM `t_person`
		WHERE LENGTH(`surname`) > 1
			AND `email` LIKE '%_@_%._%'
	WITH CHECK OPTION;

The view is INSERTable, because it is based on only one basetable and has no grouping functions/clauses. The WITH CHECK OPTION clause allow new rows to be inserted only if they satisfy the view’s WHERE condition.

Data Manipulation

Now, you can insert data through the view. Let’s start with an invalid row:

INSERT
	INTO `person`
		(`name`, `surname`, `email`)
	VALUES
		('Mario', 'Rossi', 'mario.rossi');

`email` does not contain a ‘@’, so the row is rejected with a good error message:
[Err] 1369 - CHECK OPTION failed 'test.person'

So, let’s try to insert a valid row:

INSERT
	INTO `person`
		(`name`, `surname`, `email`)
	VALUES
		('Mario', 'Rossi', 'mario.rossi@rossionline.it');

The row is successfully inserted!

You can SELECT your data from `t_person` or from `person`, as you prefer. If you use `person`, than you (or other developers) can pretend that the view is a table and `t_person` does not exists – it’s transparent. However, SELECTing rows from `person` adds a useless WHERE that is always true but affects the performances.

Enjoy!

FALSE IS NOT NULL

Yet Another Blog About Programming (YABAP)

Tag Archives: data_integrity

Setting a strict SQL_MODE

Validating data using a TRIGGER

Let’s CREATE what we need

Let’s CREATE the TRIGGER

Let’s test!

What is the best technique?

Emulating the CHECK clause in MariaDB

Data Definition

Data Manipulation

See also: