Just a Theory

Trans rights are human rights

Neither NULL nor NOT NULL: An SQL WTF

While working on result set testing functions, I ran into a bit of weirdness when comparing rows between two cursors. I had code that looked more or less like this:

FETCH have INTO rec_have;
FETCH want INTO rec_want;
WHILE rec_have IS NOT NULL OR rec_want IS NOT NULL LOOP
    IF rec_have IS DISTINCT FROM rec_want THEN
        RETURN FALSE;
    END IF;
    FETCH have INTO rec_have;
    FETCH want INTO rec_want;
END LOOP;
RETURN TRUE;

Basically, the idea is to return true if the two cursors return equivalent rows in the same order. However, things started to get weird when any of the rows included a NULL: it seemed that the loop exited as soon as a NULL was encountered, even if there were also non-NULL values in the row. I poked around a bit and discovered, to my astonishment, that such a record is neither NULL nor NOT NULL:

try=# select ROW(1, NULL) IS NULL;
 ?column? 
----------
 f
(1 row)

try=# select ROW(1, NULL) IS NOT NULL;
 ?column? 
----------
 f
(1 row)

I had thought that a value, even a composite value, had to be either NULL or NOT NULL, so I thought it was a bug. I mean, this isn’t possible, is it? I dutifully asked on the pgsql-hackers list and was informed, to further astonishment, that this is, in fact, mandated by the SQL standard. WTF? As Jeff says, “The standard is what it is. If it says that some NULLs are red and some NULLs are blue, then we’d probably support it.”

Through the discussion, I learned that a record is considered NULL only if all of its values are NULL, and it’s considered NOT NULL only if none of it s values are NULL:

try=# select ROW(NULL, NULL) IS NULL;
 ?column? 
----------
 t
(1 row)

try=# select ROW(1, 1) IS NOT NULL;
 ?column? 
----------
 t

The upshot is that composite values with at least one NULL and at least one NOT NULL value are ambiguous. It’s insane, but there you have it.

Jeff thought that you could cheat the standard by moving the NOT in front of the value before checking its NULLness. I changed my code to reflect this, and things got better:

FETCH have INTO rec_have;
FETCH want INTO rec_want;
WHILE NOT rec_have IS NULL OR NOT rec_want IS NULL LOOP
    IF rec_have IS DISTINCT FROM rec_want THEN
        RETURN FALSE;
    END IF;
    FETCH have INTO rec_have;
    FETCH want INTO rec_want;
END LOOP;
RETURN TRUE;

Kind of confusing to read, but at least it’s not too ugly. In truth, however, it’s still inconsistent: it just makes it so that such records are both NULL and NOT NULL:

try=# select NOT ROW(1, NULL) IS NULL;
 ?column? 
----------
 t
(1 row)

try=# select NOT ROW(1, NULL) IS NOT NULL;
 ?column? 
----------
 t
(1 row)

But it was good enough for me. For a while. But then I started testing the pathological scenario where a row contains only NULLs. I call it pathological because no one should ever have rows with only NULLs. But the truth is that the SQL standard allows it (despite the objections of relational theorists) and, I’ve little doubt, application developers get such rows all the time.

The problem with such rows is that they are inherently NULL, but, according to the documentation for the use of cursors in PL/pgSQL, when fetching rows from a cursor, “if there is no next row, the target is set to NULL(s).” The upshot is that, because I’m using a WHILE loop to fetch rows from a cursor, and rows with only NULLs are themselves considered NULL, there is no way to tell the difference between a row that contains NULLs and the end of a cursor.

To demonstrate, I sent an example of two functions that process a cursor, one using a plain PL/pgSQL FOR rec IN stuff LOOP, which internally detects the difference between rows full of NULLs and the end of the cursor, and one using the WHILE NOT rec IS NULL LOOP syntax required by the pgTAP testing functions. The output looked like this:

    dob     |     ssn
------------+-------------
 1965-12-31 |
            |
            | 932-45-3456
 1963-03-23 | 123-45-6789
(4 rows)

    dob     | ssn
------------+-----
 1965-12-31 |
(1 row)

The two functions are processing the same query in cursors, but while the FOR loop properly returned all four rows, the WHILE loop stopped when it hit a row with only NULLs. I found this annoying, to say the least. Fortunately, other folks were paying better attention to the docs, pointing out that the special PL/pgSQL FOUND variable does just the trick, being set to TRUE when a row is fetched, even if the row is all NULLs, and false then there are no more rows in the cursor. In fact, had I read two more sentences in the relevant documentation, I would have noticed that it says, “As with SELECT INTO, the special variable FOUND can be checked to see whether a row was obtained or not.” D’oh!

So now my function looks more or less like this:

FETCH have INTO rec_have;
have_found := FOUND;
FETCH want INTO rec_want;
want_found := FOUND;
WHILE have_found OR want_found LOOP
    IF rec_have IS DISTINCT FROM rec_want THEN
        RETURN FALSE;
    END IF;
    FETCH have INTO rec_have;
    have_found := FOUND;
    FETCH want INTO rec_want;
    want_found := FOUND;
END LOOP;
RETURN TRUE;

Yeah, pretty verbose and full of a lot of explicit processing that I can just take for granted in more sane languages, but it does the trick. Don’tcha just love SQL?

That issue behind me, I’ll do a bit more hacking on it this week, and hopefully I’ll get a release of pgTAP out with the new result set testing support before I leave for vacation early next week.

Looking for the comments? Try the old layout.