Just a Theory

By David E. Wheeler

Posts about Database Triggers

Issues with INSERT Execution Ordering

My latest experiments in my never-ending quest to move as much object/relational mapping into the database as possible has yielded more hacks to work around database features. This time, the problem is that I have two classes, Simple and Extend, where I want the latter to extend the former (hence its name). This is a little different from inheritance, in that, internally, an Extend just references a single Simple object, but externally, it just has a single interface, where attributes of a Simple object are just attributes of an Extend object. The benefit here is that I can have multiple Extend objects that reference the same Simple object—something you can’t do with simple inheritance

Now, how I wanted to implement this in the database is where the extend view has all of the columns from the _simple table and the _extend table. That’s pretty simple. It looks like this:

CREATE SEQUENCE seq_kinetic;

CREATE TABLE _simple (
    id INTEGER NOT NULL DEFAULT NEXTVAL('seq_kinetic'),
    uuid UUID NOT NULL DEFAULT UUID_V4(),
    state INTEGER NOT NULL DEFAULT 1,
    name TEXT NOT NULL,
    description TEXT
);

CREATE TABLE _extend (
    id INTEGER NOT NULL DEFAULT NEXTVAL('seq_kinetic'),
    uuid UUID NOT NULL DEFAULT UUID_V4(),
    state INTEGER NOT NULL DEFAULT 1,
    simple_id INTEGER NOT NULL
);

CREATE VIEW extend AS
  SELECT _extend.id AS id, _extend.uuid AS uuid, _extend.state AS state,
         _extend.simple_id AS simple__id, simple.uuid AS simple__uuid,
         simple.state AS simple__state, simple.name AS name,
         simple.description AS description
  FROM   _extend, simple
  WHERE  _extend.simple_id = simple.id;

Pretty simple, right? Well, I like to put RULEs on VIEWs like this, so that I can just use the VIEW for INSERTs, UPDATEs, AND DELETESs. For now I’m just going to talk about the INSERT RULEs for this this view, as they are where the trouble came in.

Why the plural, RULEs? Well, what I wanted was to have two behaviors on insert, depending on the value of the simple__id column. If it’s NOT NULL, it should assume that it references an existing record in the _simple table, UPDATE it, and then INSERT into the _extend table. If it’s NULL, however, then it should INSERT into both the _simple table and the _extend table. What I came up with looked like this:

CREATE RULE insert_extend AS
ON INSERT TO extend WHERE NEW.simple__id IS NULL DO INSTEAD (
  INSERT INTO _simple (id, uuid, state, name, description)
  VALUES (NEXTVAL('seq_kinetic'), UUID_V4(), NEW.simple__state, NEW.name,
          NEW.description);

  INSERT INTO _extend (id, uuid, state, simple_id)
  VALUES (NEXTVAL('seq_kinetic'), COALESCE(NEW.uuid, UUID_V4()), NEW.state,
          CURRVAL('seq_kinetic'));
);

CREATE RULE extend_extend AS
ON INSERT TO extend WHERE NEW.simple__id IS NOT NULL DO INSTEAD (
  UPDATE _simple
  SET    state = COALESCE(NEW.simple__state, state),
         name  = COALESCE(NEW.name, name),
         description = COALESCE(NEW.description, description)
  WHERE  id = NEW.simple__id;

  INSERT INTO _extend (id, uuid, state, simple_id)
  VALUES (NEXTVAL('seq_kinetic'), COALESCE(NEW.uuid, UUID_V4()),
          NEW.state, NEW.simple__id);
);

CREATE RULE insert_extend_dummy AS
ON INSERT TO extend DO INSTEAD NOTHING;

That third RULE is required, as a VIEW must have an unconditional DO INSTEAD RULE. The second RULE also works, for those situations where I want to to “extend” a Simple object into an Extend object. It’s that first rule that’s causing problems, when no Simple object yet exists and I need to create it. When I try to INSERT with simple__id set to NULL, I get an error. Can you guess what it is? Let me not keep you on the edge or your seat:

kinetic=# INSERT INTO EXTEND (simple__id, name) VALUES (NULL, 'Four');
ERROR:  insert or update on table "_extend" violates foreign key constraint "fk_simple_id"
DETAIL:  Key (simple_id)=(22) is not present in table "_simple".

Why’s that? Well, it turns out that NEXTVAL('seq_kinetic') is executed twice, once for the id in the _simple table, and once for the id in the _extend table. But by the time CURRVAL('seq_kinetic') is called to reference the value inserted into the _simple table, it the value has already been fetched from the sequence for insertion into the _extend table. So of course, it fails, because the current value in the sequence is not in the _simple table at all.

At first, I thought that this might be an order of execution problem with the INSERT statement, so I tried this:

CREATE RULE insert_extend AS
ON INSERT TO extend WHERE NEW.simple__id IS NULL DO INSTEAD (
  INSERT INTO _simple (id, uuid, state, name, description)
  VALUES (NEXTVAL('seq_kinetic'), UUID_V4(), NEW.simple__state, NEW.name,
          NEW.description);

  INSERT INTO _extend (simple_id, id, uuid, state)
  VALUES (CURRVAL('seq_kinetic'), NEXTVAL('seq_kinetic'),
          COALESCE(NEW.uuid, UUID_V4()), NEW.state);
);

Unfortunately, that yielded the same error. So if the order of the columns in the INSERT statement didn’t define the execution order, what did? Well, a little research helped me to figure it out: It’s the order of the columns in the table, as this example demonstrates:

test=# CREATE SEQUENCE s;
CREATE SEQUENCE
test=# CREATE TABLE a (a0 int, a1 int, a2 int, a3 int);
CREATE TABLE
test=# INSERT INTO a (a3, a2, a0, a1) VALUES (NEXTVAL('s'), NEXTVAL('s'),
test=# NEXTVAL('s'), NEXTVAL('s'));
INSERT 0 1
test=# SELECT * FROM a;
 a0 | a1 | a2 | a3 
----+----+----+----
  1 |  2 |  3 |  4
(1 row)

Even though the values from the sequence were inserted into the columns by the INSERT statement in a more or less random order, they ended up being inserted into the table in the order in which they were declared in the CREATE TABLE statement.

Damn SQL!

So what’s the solution to this? Well, I came up with three. The first, and perhaps simplest, is to use two sequences instead of one:

CREATE RULE insert_extend AS
ON INSERT TO extend WHERE NEW.simple__id IS NULL DO INSTEAD (
  INSERT INTO _simple (id, uuid, state, name, description)
  VALUES (NEXTVAL('seq_kinetic'), UUID_V4(), NEW.simple__state, NEW.name,
          NEW.description);

  INSERT INTO _extend (id, uuid, state, simple_id)
  VALUES (NEXTVAL('seq_kinetic_alt'), COALESCE(NEW.uuid, UUID_V4()), NEW.state,
          CURRVAL('seq_kinetic'));
);

This works very well, and if you have separate sequences for each table, this is what you would do. But I want to use just one sequence for every primary key in the database, so as to prevent any possibility of duplicates. I could use two mutually exclusive sequences, one for odd numbers and the other for even numbers:

CREATE SEQUENCE seq_kinetic_odd INCREMENT BY 2;
CREATE SEQUENCE seq_kinetic_odd INCREMENT BY 2 START WITH 2;

But then I have to keep track of which sequence I’m using where. If I just use the “even” sequence for this special case (which may be rare), then I’m essentially throwing out half the numbers in the sequence. And I like things to be somewhat orderly, and the skipping of even or odd values would annoy me when I had to work with the database. Yeah, I’m a freak.

The solution I’ve currently worked out is to create a PL/pgSQL function that can keep track of the sequence numbers ahead of time, and just call it from the RULE:

CREATE FUNCTION insert_extend(NEWROW extend) RETURNS VOID AS $$
    DECLARE
        _first_id  integer := NEXTVAL(''seq_kinetic'');
        _second_id integer := NEXTVAL(''seq_kinetic'');
    BEGIN
    INSERT INTO _simple (id, uuid, state, name, description)
    VALUES (_first_id, UUID_V4(), NEWROW.simple__state, NEWROW.name,
            NEWROW.description);

    INSERT INTO _extend (id, uuid, state, simple_id)
    VALUES (_second_id, COALESCE(NEWROW.uuid, UUID_V4()), NEWROW.state, _first_id);
    END;
$$ LANGUAGE plpgsql VOLATILE;
    
CREATE RULE insert_extend AS
ON INSERT TO extend WHERE NEW.simple__id IS NULL DO INSTEAD (
    SELECT insert_extend(NEW);
);

This approach works pretty nicely, and doesn’t add much more code than my original solution with the ordering problem. I think I’ll keep it.

One other solution is to use a TRIGGER instead of a rule, but in truth, it would amount to nearly the same thing:

CREATE FUNCTION insert_extend() RETURNS trigger AS $$
    DECLARE
        _first_id  integer := NEXTVAL(''seq_kinetic'');
        _second_id integer := NEXTVAL(''seq_kinetic'');
    BEGIN
    INSERT INTO _simple (id, uuid, state, name, description)
    VALUES (_first_id, UUID_V4(), NEW.simple__state, NEW.name, NEW.description);

    INSERT INTO _extend (id, uuid, state, simple_id)
    VALUES (_second_id, COALESCE(NEW.uuid, UUID_V4()), NEW.state, _first_id);
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_extend BEFORE UPDATE ON extend
FOR EACH ROW EXECUTE PROCEDURE insert_extend();

Um, but looking at it now (I just now typed it up, I haven’t tested it), I don’t think it’d work, because you can’t put a condition on a rule. On the other hand, I could use it to combine the three rules I have (two conditional and mutually exclusive, one that does nothing) into a single trigger:

CREATE FUNCTION insert_extend() RETURNS trigger AS $$
    DECLARE
        _first_id  integer;
        _second_id integer;
    BEGIN
    IF NEW.simple__id IS NULL THEN
        _first_id  := NEXTVAL(''seq_kinetic'');
        _second_id := NEXTVAL(''seq_kinetic'');

        INSERT INTO _simple (id, uuid, state, name, description)
        VALUES (_first_id, UUID_V4(), NEW.simple__state, NEW.name, NEW.description);

        INSERT INTO _extend (id, uuid, state, simple_id)
        VALUES (_second_id, COALESCE(NEW.uuid, UUID_V4()), NEW.state, _first_id);
    ELSE
        UPDATE _simple
        SET    state = COALESCE(NEW.simple__state, state),
                name  = COALESCE(NEW.name, name),
                description = COALESCE(NEW.description, description)
        WHERE  id = NEW.simple__id;

        INSERT INTO _extend (id, uuid, state, simple_id)
        VALUES (NEXTVAL('seq_kinetic'), COALESCE(NEW.uuid, UUID_V4()),
                NEW.state, NEW.simple__id);
    END IF;
    END;
$$ LANGUAGE plpgsql;

Hrm. That just might be the best way to go, period. Thoughts? Have I missed some other obvious solution?

Looking for the comments? Try the old layout.

What Advanced SQL Book Should I Buy?

So, what advanced SQL book should I buy? I’ve learned a lot about SQL over the last year or so, but I’m sure that Josh Berkus is tired of being my own personal advanced SQL reference. So I’d like to really learn more about triggers, stored procedures, rules, views, and whatnot, what they’re best used for and when to use them. And other typical database features that I’m not familiar with, of course.

What I don’t need is an introduction to SQL. There are a million of those, and they all have much the same stuff. I want to really get into advanced concepts.

So what’s the best choice? Leave me a comment with your opinion. Thanks!

Looking for the comments? Try the old layout.

How to Ensure Unique Values Relative to a Column in Another Table

I recently came across a need to ensure that the value of a column in one table is unique relative to the value of a column in another table. This came about through my use of views to represent object-oriented inheritance relationships in PostgreSQL. For example, say I have a parent class, “Person”, and its subclass, “User.” The person table is quite straight-forward:

CREATE TABLE person (
    id INTEGER NOT NULL PRIMARY KEY SERIAL,
    first_name TEXT,
    last_name  TEXT,
    state      INTEGER
);

I use a combination of a table and a view to represent the User class while keeping all of the data nicely denormalized:

CREATE TABLE person_usr (
    id INTEGER NOT NULL PRIMARY KEY REFERENCES person (id),
    username TEXT,
    password TEXT
);

CREATE VIEW usr AS
SELECT p.id AS id, p.first_name AS first_name, p.last_name AS last_name,
       p.state AS state, u.username AS username, u.password AS password
  FROM   person p, usr u
 WHERE  p.id = u.id;

So to maintain things, I write rules on the usr view to execute INSERT, UPDATE, and DELETE queries against the person and person_usr tables as appropriate.

Now, say that I have a business requirement to allow there to be duplicate usernames for users provided that only one is not “deleted.” Whether a user object is active, inactive, or deleted is determined by the value in its state attribute, which is stored in the person table. The value of the state attribute can be “1” for active, “0” for inactive, and “-1” for deleted. So how can I ensure, in the database, that this rule is followed?

Well, if the state and username columns were in a single table, this is very easy in PostgreSQL: just create a partial unique index:

CREATE UNIQUE INDEX udx_usr_unique
    ON usr(username)
 WHERE state > -1;

This does the trick beautifully, and is nice and compact. However, my OO design has the User class inheriting from Person, so I have the username column in one table and the state column in another. At first, I thought that I could still use a partial unique index, something like this:

CREATE UNIQUE INDEX udx_usr_unique
    ON usr(username)
 WHERE id IN (SELECT id FROM person WHERE state > -1);

Unfortunately, as of PostgreSQL 8.1, the PostgreSQL documentation states:

The expression used in the WHERE clause may refer only to columns of the underlying table, but it can use all columns, not just the ones being indexed. Presently, subqueries and aggregate expressions are also forbidden in WHERE. The same restrictions apply to index fields that are expressions.

D’oh!

So I had to figure out another method. CHECK constraints cannot reference another table, either. So I was left with triggers. It’s ugly and verbose, but it appears to do the trick. Here is the recipe:

CREATE FUNCTION cki_usr_username_unique() RETURNS trigger AS $$
    BEGIN
    /* Lock the relevant records in the parent and child tables. */
    PERFORM true
    FROM    person_usr, person
    WHERE   person_usr.id = person.id AND username = NEW.username FOR UPDATE;
    IF (SELECT true
        FROM   usr
        WHERE  id <> NEW.id AND username = NEW.username AND usr.state > -1
        LIMIT 1
    ) THEN
        RAISE EXCEPTION ''duplicate key violates unique constraint "ck_person_usr_username_unique"'';
    END IF;
    RETURN NEW;
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cki_usr_username_unique BEFORE INSERT ON person_usr
    FOR EACH ROW EXECUTE PROCEDURE cki_usr_username_unique();

CREATE FUNCTION cku_usr_username_unique() RETURNS trigger AS $$
    BEGIN
    IF (NEW.username <> OLD.username) THEN
        /* Lock the relevant records in the parent and child tables. */
        PERFORM true
        FROM    person_usr, person
        WHERE   person_usr.id = person.id AND username = NEW.username FOR UPDATE;
        IF (SELECT true
            FROM   usr
            WHERE  id <> NEW.id AND username = NEW.username AND usr.state > -1
            LIMIT 1
        ) THEN
            RAISE EXCEPTION ''duplicate key violates unique constraint "ck_person_usr_username_unique"'';
        END IF;
    END IF;
    RETURN NEW;
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cku_usr_username_unique BEFORE UPDATE ON person_usr
    FOR EACH ROW EXECUTE PROCEDURE cku_usr_username_unique();

CREATE FUNCTION ckp_usr_username_unique() RETURNS trigger AS $$
    BEGIN
    IF (NEW.state > -1 AND OLD.state < 0
        AND (SELECT true FROM person_usr WHERE id = NEW.id)
        ) THEN
        /* Lock the relevant records in the parent and child tables. */
        PERFORM true
        FROM    person_usr, person
        WHERE   person_usr.id = person.id
                AND username = (SELECT username FROM person_usr WHERE id = NEW.id)
        FOR UPDATE;

        IF (SELECT COUNT(username)
            FROM   person_usr
            WHERE username = (SELECT username FROM person_usr WHERE id = NEW.id)
        ) > 1 THEN
            RAISE EXCEPTION ''duplicate key violates unique constraint "ck_person_usr_username_unique"'';
        END IF;
    END IF;
    RETURN NEW;
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ckp_usr_username_unique BEFORE UPDATE ON person
    FOR EACH ROW EXECUTE PROCEDURE ckp_usr_username_unique();

Why am I locking rows? To prevent some other transaction from changing another row to create username conflicts. For example, while I might be changing a username to “foo” for one record, an existing record with that username but its state set to -1 might be getting activated in a separate transaction. So gotta try to prevent that. Josh Berkus pointed out that issue in an earlier iteration of the triggers.

Anyway, am I on crack here? Isn’t there a simpler way to do this sort of thing? And if not, have I really got the race conditions all eliminated with the row locks?

Update: In further testing, I discovered that the SELECT ... FOR UPDATE was failing on views with the error “ERROR: no relation entry for relid 7”. I have no idea what that means, but I found that it didn’t happen when I selected against the tables directly. So I’ve updated the functions above to reflect that change. I’ve also fixed a few pastos as pointed out in the comments.

Looking for the comments? Try the old layout.

Enforce Foreign Key Integrity in SQLite with Triggers

After some some Googling and experimentation, I’ve figured out how to enforce foreign key constraints in SQLite. I got most of the code from Cody Pisto’s sqlite_fk utility. I couldn’t get it to work, but the essential code for the triggers was in its fk.c file, so I just borrowed from that (public domain) code to figure it out.

Since I couldn’t find documentation for this elsewhere on the Net (though I’m sure it exists somewhere), I decided to just put an example here. Interested? Read on!

Say you have these two table declarations:

create table foo (
    id INTEGER NOT NULL PRIMARY KEY
);

CREATE TABLE bar (
    id INTEGER NOT NULL PRIMARY KEY,
    foo_id INTEGER NOT NULL
            CONSTRAINT fk_foo_id REFERENCES a(id) ON DELETE CASCADE
);

Table bar has a foreign key reference to the primary key column in the foo table. Although SQLite supports this syntax (as well as named foreign key constraints), it ignores them. So if you want the references enforced, you need to create triggers to do the job. Triggers were added to SQLite version 2.5, so most users can take advantage of this feature. Each constraint must have three triggers: one for INSERTs, one for UPDATESs, and one for DELETESs. The INSERT trigger looks like this:

CREATE TRIGGER fki_bar_foo_id
BEFORE INSERT ON bar
FOR EACH ROW BEGIN 
    SELECT CASE
        WHEN ((SELECT id FROM foo WHERE id = NEW.foo_id) IS NULL)
        THEN RAISE(ABORT, 'insert on table "bar" violates foreign key '
                || 'constraint "fk_foo_id"')
    END;
END;

(You can put the RAISE error string all on one line; I’ve concatenated two lines to keep line lengths reasonable here.) If your foreign key column is not NOT NULL, the trigger’s SELECT CASE clause needs to an extra case:

CREATE TRIGGER fki_bar_foo_id
BEFORE INSERT ON bar
FOR EACH ROW BEGIN 
    SELECT CASE
        WHEN ((new.foo_id IS NOT NULL)
            AND ((SELECT id FROM foo WHERE id = new.foo_id) IS NULL))
        THEN RAISE(ABORT, 'insert on table "bar" violates foreign key '
                || 'constraint "fk_foo_id"')
    END;
END;

The UPDATE statements are almost identical; if your foreign key column is NOT NULL, then do this:

CREATE TRIGGER fku_bar_foo_id
BEFORE UPDATE ON bar
FOR EACH ROW BEGIN 
    SELECT CASE
        WHEN ((SELECT id FROM foo WHERE id = new.foo_id) IS NULL))
        THEN RAISE(ABORT, 'update on table "bar" violates foreign key '
                || 'constraint "fk_foo_id"')
    END;
END;

And if NULLs are allowed, do this:

CREATE TRIGGER fku_bar_foo_id
BEFORE UPDATE ON bar
FOR EACH ROW BEGIN 
    SELECT CASE
        WHEN ((new.foo_id IS NOT NULL)
            AND ((SELECT id FROM foo WHERE id = new.foo_id) IS NULL))
        THEN RAISE(ABORT, 'update on table "bar" violates foreign key '
                || 'constraint "fk_foo_id"')
    END;
END;

The DELETE trigger is, of course, the reverse of the INSERT and UPDATE triggers, in that it applies to the primary key table, rather than the foreign key table. To whit, in our example, it watches for DELETEs on the foo table:

CREATE TRIGGER fkd_bar_foo_id
BEFORE DELETE ON foo
FOR EACH ROW BEGIN 
    SELECT CASE
    WHEN ((SELECT foo_id FROM bar WHERE foo_id = OLD.id) IS NOT NULL)
    THEN RAISE(ABORT, 'delete on table "foo" violates foreign key '
                || ' constraint "fk_foo_id"')
    END;
END;

This trigger will prevent DELETEs on the foo table when there are existing foreign key references in the bar table. This is generally the default behavior in databases with referential integrity enforcement, sometimes specified explicitly as ON DELETE RESTRICT. But sometimes you want the deletes in the primary key table to “cascade” to the foreign key tables. Such is what our example declaration above specifies, and this is the trigger to to the job:

CREATE TRIGGER fkd_bar_foo_id
BEFORE DELETE ON foo
FOR EACH ROW BEGIN 
    DELETE from bar WHERE foo_id = OLD.id;
END;

Pretty simple, eh? The trigger support in SQLite is great for building your own referential integrity checks. Hopefully, these examples will get you started down the path of creating your own.

Looking for the comments? Try the old layout.

More about…