Just a Theory

Trans rights are human rights

Posts about postgres

JSON Path Operator Confusion

The CipherDoc service offers a robust secondary key lookup API and search interface powered by JSON/SQL Path queries run against a GIN-indexed JSONB column. SQL/JSON Path, introduced in SQL:2016 and added to Postgres in version 12 in 2019, nicely enables an end-to-end JSON workflow and entity lifecycle. It’s a powerful enabler and fundamental technology underpinning CipherDoc. I’m so happy to have found it.

Confusion

However, the distinction between the SQL/JSON Path operators @@ and @? confused me. Even as I found that the @? operator worked for my needs and @@ did not, I tucked the problem into my mental backlog for later study.

The question arose again on a recent work project, and I can take a hint. It’s time to figure this thing out. Let’s see where it goes.

The docs say:

jsonb @? jsonpath → boolean
Does JSON path return any item for the specified JSON value?

'{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)' → t


jsonb @@ jsonpath → boolean
Returns the result of a JSON path predicate check for the specified JSON value. Only the first item of the result is taken into account. If the result is not Boolean, then NULL is returned.

'{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2' → t

These read quite similarly to me: Both return true if the path query returns an item. So what’s the difference? When should I use @@ and when @?? I went so far as to ask Stack Overflow about it. The one answer directed my attention back to the jsonb_path_query() function, which returns the results from a path query.

So let’s explore how various SQL/JSON Path queries work and what values various expressions return.

Queries

The docs for jsonb_path_query say:1

jsonb_path_query ( target jsonb, path jsonpath [, vars jsonb [, silent boolean ]] ) → setof jsonb
Returns all JSON items returned by the JSON path for the specified JSON value. If the vars argument is specified, it must be a JSON object, and its fields provide named values to be substituted into the jsonpath expression. If the silent argument is specified and is true, the function suppresses the same errors as the @? and @@ operators do.
select * from jsonb_path_query(
    '{"a":[1,2,3,4,5]}',
    '$.a[*] ? (@ >= $min && @ <= $max)',
    '{"min":2, "max":4}'
) →
 jsonb_path_query
------------------
 2
 3
 4

The first thing to note is that a SQL/JSON Path query may return more than one value. This feature matters for the @@ and @? operators, which return a single boolean value based on the values returned by a path query. And path queries can return a huge variety of values. Let’s explore some examples, derived from the sample JSON value and path query from the docs.2

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$ ?(@.a[*] > 2)');
    jsonb_path_query    
------------------------
 {"a": [1, 2, 3, 4, 5]}
(1 row)

This query returns the entire JSON value, because that’s what $ selects at the start of the path expression. The ?() filter returns true because its predicate expression finds at least one value in the $.a array greater than 2. Here’s what happens when the filter returns false:

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$ ?(@.a[*] > 5)');
 jsonb_path_query 
------------------
(0 rows)

None of the values in the $.a array are greater than five, so the query returns no value.

To select just the array, append it to the path expression after the ?() filter:

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$ ?(@.a[*] > 2).a');
 jsonb_path_query 
------------------
 [1, 2, 3, 4, 5]
(1 row)

Path Modes

One might think you could select $.a at the start of the path query to get the full array if the filter returns true, but look what happens:

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a ?(@[*] > 2)');
 jsonb_path_query 
------------------
 3
 4
 5
(3 rows)

That’s not the array, but the individual array values that each match the predicate. Turns out this is a quirk of the Postgres implementation of path modes. From what I can glean, the SQL:2016 standard dictates something like these SQL Server descriptions:

  • In lax mode, the function returns empty values if the path expression contains an error. For example, if you request the value $.name, and the JSON text doesn’t contain a name key, the function returns null, but does not raise an error.
  • In strict mode, the function raises an error if the path expression contains an error.

But the Postgres lax mode does more than suppress errors. From the docs (emphasis added):

The lax mode facilitates matching of a JSON document structure and path expression if the JSON data does not conform to the expected schema. If an operand does not match the requirements of a particular operation, it can be automatically wrapped as an SQL/JSON array or unwrapped by converting its elements into an SQL/JSON sequence before performing this operation. Besides, comparison operators automatically unwrap their operands in the lax mode, so you can compare SQL/JSON arrays out-of-the-box.

There are a few more details, but this is the crux of it: In lax mode, which is the default, Postgres always unwraps an array. Hence the unexpected list of results.3 This could be particularly confusing when querying multiple rows:

select jsonb_path_query(v, '$.a ?(@[*] > 2)')
        from (values ('{"a":[1,2,3,4,5]}'::jsonb), ('{"a":[3,5,8]}')) x(v);
 jsonb_path_query 
------------------
 3
 4
 5
 3
 5
 8
(6 rows)

Switching to strict mode by prepending strict to the JSON Path query restores the expected behavior:

select jsonb_path_query(v, 'strict $.a ?(@[*] > 2)')
        from (values ('{"a":[1,2,3,4,5]}'::jsonb), ('{"a":[3,5,8]}')) x(v);
 jsonb_path_query 
------------------
 [1, 2, 3, 4, 5]
 [3, 5, 8]
(2 rows)

Important gotcha to watch for, and a good reason to test path queries thoroughly to ensure you get the results you expect. Lax mode nicely prevents errors when a query references a path that doesn’t exist, as this simple example demonstrates:

select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.b');
ERROR:  JSON object does not contain key "b"

select jsonb_path_query('{"a":[1,2,3,4,5]}', 'lax $.b');
 jsonb_path_query 
------------------
(0 rows)

In general, I suggest always using strict mode when executing queries. Better still, pair strict mode with our friends the @@ and @? operators, which suppress some errors even in strict mode:

The jsonpath operators @? and @@ suppress the following errors: missing object field or array element, unexpected JSON item type, datetime and numeric errors. The jsonpath-related functions described below can also be told to suppress these types of errors. This behavior might be helpful when searching JSON document collections of varying structure.

Have a look:

select '{"a":[1,2,3,4,5]}' @? 'strict $.a';
 ?column? 
----------
 t
(1 row)

select '{"a":[1,2,3,4,5]}' @? 'strict $.b';
 ?column? 
----------
 <null>
(1 row)

No error for the unknown JSON key b in that second query! As for the error suppression in the jsonpath-related functions, that’s what the silent argument does. Compare:

select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.b');
ERROR:  JSON object does not contain key "b"

select jsonb_path_query('{"a":[1,2,3,4,5]}', 'strict $.b', '{}', true);
 jsonb_path_query 
------------------
(0 rows)

Boolean Predicates

The Postgres SQL/JSON Path Language docs briefly mention a pretty significant deviation from the SQL standard:

A path expression can be a Boolean predicate, although the SQL/JSON standard allows predicates only in filters. This is necessary for implementation of the @@ operator. For example, the following jsonpath expression is valid in PostgreSQL:

$.track.segments[*].HR < 70

This pithy statement has pretty significant implications for the return value of a path query. The SQL standard allows predicate expressions, which are akin to an SQL WHERE expression, only in ?() filters, as seen previously:

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$ ?(@.a[*] > 2)');
    jsonb_path_query    
------------------------
 {"a": [1, 2, 3, 4, 5]}
(1 row)

This can be read as “return the path $ if @.a[*] > 2 is true.” But have a look at a predicate-only path query:

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*] > 2');
 jsonb_path_query 
------------------
 true
(1 row)

This path query can be read as “Return the result of the predicate $.a[*] > 2”, which in this case is true. This is quite the divergence from the standard, which returns contents from the JSON queried, while a predicate query returns the result of the predicate expression itself. It’s almost like they’re two different things!

Don’t confuse the predicate path query return value with selecting a boolean value from the JSON. Consider this example:

select jsonb_path_query('{"a":[true,false]}', '$.a ?(@[*] == true)');
 jsonb_path_query 
------------------
 true
(1 row)

Looks the same as the predicate-only query, right? But it’s not, as shown by adding another true value to the $.a array:

select jsonb_path_query('{"a":[true,false,true]}', '$.a ?(@[*] == true)');
 jsonb_path_query 
------------------
 true
 true
(2 rows)

This path query returns the trues it finds in the $.a array. The fact that it returns values from the JSON rather than the filter predicate becomes more apparent in strict mode, which returns all of $.a if one or more elements of the array have the value true:

select jsonb_path_query('{"a":[true,false,true]}', 'strict $.a ?(@[*] == true)');
  jsonb_path_query   
---------------------
 [true, false, true]
(1 row)

This brief aside, and its mention of the @@ operator, turns out to be key to understanding the difference between @? and @@. Because it’s not just that this feature is “necessary for implementation of the @@ operator”. No, I would argue that it’s the only kind of expression usable with the @@ operator.

Match vs. Exists

Let’s get back to the @@ operator. We can use a boolean predicate JSON Path like so:

select '{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2';
 ?column? 
----------
 t
(1 row)

It returns true because the predicate JSON path query $.a[*] > 2 returns true. And when it returns false?

select '{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 6';
 ?column? 
----------
 f
(1 row)

So far so good. What happens when we try to use a filter expression that returns a true value selected from the JSONB?

select '{"a":[true,false]}'::jsonb @@ '$.a ?(@[*] == true)';
 ?column? 
----------
 t
(1 row)

Looks right, doesn’t it? But recall that this query returns all of the true values from $.a, while @@ wants only a single boolean. What happens when we add another?

select '{"a":[true,false,true]}'::jsonb @@ 'strict $.a ?(@[*] == true)';
 ?column? 
----------
 <null>
(1 row)

Now it returns NULL, even though it’s clearly true that @[*] == true matches. This is because it returns all of the values it matches, as jsonb_path_query() demonstrates:

select jsonb_path_query('{"a":[true,false,true]}'::jsonb, '$.a ?(@[*] == true)');
 jsonb_path_query 
------------------
 true
 true
(2 rows)

This clearly violates the @@ documentation claim that “Only the first item of the result is taken into account”. If that were true, it would see the first value is true and return true. But it doesn’t. Turns out, the corresponding jsonb_path_match() function shows why:

select jsonb_path_match('{"a":[true,false,true]}'::jsonb, '$.a ?(@[*] == true)');
ERROR:  single boolean result is expected

Conclusion: The documentation is inaccurate. Only a single boolean is expected by @@. Anything else is an error.

Furthermore, it’s dangerous, at best, to use an SQL standard JSON Path expression with @@. If you need to use it with a filter expression, you can turn it into a boolean predicate by wrapping it in exists():

select jsonb_path_match('{"a":[true,false,true]}'::jsonb, 'exists($.a ?(@[*] == true))');
 jsonb_path_match 
------------------
 t
(1 row)

But there’s no reason to do so, because that’s effectively what the @? operator (and the corresponding, cleverly-named jsonb_path_exists() function) does: it returns true if the SQL standard JSON Path expression returns any results:

select '{"a":[true,false,true]}'::jsonb @? '$.a ?(@[*] == true)';
 ?column? 
----------
 t
(1 row)

Here’s the key thing about @?: you don’t want to use a boolean predicate path query with it, either. Consider this predicate-only query:

select jsonb_path_query('{"a":[1,2,3,4,5]}'::jsonb, '$.a[*] > 6');
 jsonb_path_query 
------------------
 false
(1 row)

But see what happens when we use it with @?:

select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] > 6';
 ?column? 
----------
 t
(1 row)

It returns true even though the query itself returns false! Why? Because false is a value that exists and is returned by the query. Even a query that returns null is considered to exist, as it will when a strict query encounters an error:

select jsonb_path_query('{"a":[1,2,3,4,5]}'::jsonb, 'strict $[*] > 6');
 jsonb_path_query 
------------------
 null
(1 row)

select '{"a":[1,2,3,4,5]}'::jsonb @? 'strict $[*] > 6';
 ?column? 
----------
 t
(1 row)

The key thing to know about the @? operator is that it returns true if anything is returned by the path query, and returns false only if nothing is selected at all.
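For example, the only way to coax a false out of @? is a path query that selects nothing at all, like the lax query for the missing b key shown earlier:

select '{"a":[1,2,3,4,5]}'::jsonb @? 'lax $.b';
 ?column? 
----------
 f
(1 row)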

The Difference

In summary, the difference between the @? and @@ JSONB operators is this:

  • @? (and jsonb_path_exists()) returns true if the path query returns any values — even false or null — and false if it returns no values. This operator should be used only with SQL-standard JSON path queries that select data from the JSONB. Do not use predicate-only JSON path expressions with @?.
  • @@ (and jsonb_path_match()) returns true if the path query returns the single boolean value true and false otherwise. This operator should be used only with Postgres-specific boolean predicate JSON path queries, which return the result of the predicate expression itself. Do not use SQL-standard JSON path expressions with @@; see the paired recap just below.
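To recap with a correct pairing, using the two queries quoted from the docs at the top of this post:

select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)';  -- t: standard path, selects values
select '{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2';        -- t: predicate path, returns a boolean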

This difference of course assumes awareness of this distinction between predicate path queries and SQL standard path queries. To that end, I submitted a patch that expounds the difference between these types of JSON Path queries, and plan to submit another linking these differences in the docs for @@ and @?.

Oh, and probably another to explain the difference in return values between strict and lax queries due to array unwrapping.

Thanks

Many thanks to Erik Wienhold for patiently answering my pgsql-hackers questions and linking me to a detailed pgsql-general thread in which the oddities of @@ were previously discussed in detail.


  1. Well almost. The docs for jsonb_path_query actually say, about the last two arguments, “The optional vars and silent arguments act the same as for jsonb_path_exists.” I replaced that sentence with the relevant sentences from the jsonb_path_exists docs, about which more later. ↩︎

  2. Though omitting the vars argument, as variable interpolation just gets in the way of understanding basic query result behavior. ↩︎

  3. In fairness, the Oracle docs also discuss “implicit array wrapping and unwrapping”, but I don’t have a recent Oracle server to experiment with at the moment. ↩︎

CipherDoc: A Searchable, Encrypted JSON Document Service on Postgres

Over the last year, I designed and implemented a simple web service, code-named “CipherDoc”, that provides a CRUD API for creating, updating, searching, and deleting JSON documents. The app enforces document structure via JSON schema, while JSON/SQL Path powers the search API by querying a hashed subset of the schema stored in a GIN-indexed JSONB column in Postgres.
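Here’s a minimal sketch of that pattern; the table and column names are hypothetical stand-ins, not the service’s actual schema:

CREATE TABLE documents (
    id     UUID PRIMARY KEY,
    search JSONB NOT NULL  -- hashed subset of the document used for search
);

CREATE INDEX idx_documents_search ON documents USING GIN (search);

-- A secondary-key lookup via an SQL-standard JSON path query, which can use the GIN index:
SELECT id FROM documents WHERE search @? '$.author ? (@ == "abc123")';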

In May I gave a public presentation on the design and implementation of the service at PGCon: CipherDoc: A Searchable, Encrypted JSON Document Service on Postgres. Links:

I enjoyed designing this service. The ability to dynamically change the JSON schema at runtime without database changes enables more agile development cycles for busy teams. Its data privacy features required a level of intellectual challenge and raw problem-solving (a.k.a., engineering) that challenged and invigorated me.

Two minor updates since May:

  1. I re-implemented the JSON/SQL Path parser using the original Postgres path grammar and goyacc, replacing the hand-written parser roundly castigated in the presentation.
  2. The service has yet to be open-sourced, but I remain optimistic, and continue to work with leadership at The Times towards an open-source policy to enable its release.

Automate Postgres Extension Releases on GitHub and PGXN

Back in June, I wrote about testing Postgres extensions on multiple versions of Postgres using GitHub Actions. The pattern relies on the Docker image pgxn/pgxn-tools, which contains scripts to build and run any version of PostgreSQL, install additional dependencies, build, test, bundle, and release an extension. I’ve since updated it to support testing on the latest development release of Postgres, meaning one can test on any major version from 8.4 to (currently) 14. I’ve also created GitHub workflows for all of my PGXN extensions (except for pgTAP, which is complicated). I’m quite happy with it.

But I was never quite satisfied with the release process. Quite a number of Postgres extensions also release on GitHub; indeed, Paul Ramsey told me straight up that he did not want to manually upload extensions like pgsql-http and PostGIS to PGXN, but for PGXN to automatically pull them in when they were published on GitHub. It’s pretty cool that newer packaging systems like pkg.go.dev auto-index any packages on GitHub. Adding such a feature to PGXN would be an interesting exercise.

But since I’m low on TUITs for such a significant undertaking, I decided instead to work out how to automatically publish a release on GitHub and PGXN via GitHub Actions. After experimenting for a few months, I’ve worked out a straightforward method that should meet the needs of most projects. I’ve proven the pattern via the pair extension’s release.yml, which successfully published the v0.1.7 release today on both GitHub and PGXN. With that success, I updated the pgxn/pgxn-tools documentation with a starter example. It looks like this:

 1  name: Release
 2  on:
 3    push:
 4      tags:
 5        - 'v*' # Push events matching v1.0, v20.15.10, etc.
 6  jobs:
 7    release:
 8      name: Release on GitHub and PGXN
 9      runs-on: ubuntu-latest
10      container: pgxn/pgxn-tools
11      env:
12        # Required to create GitHub release and upload the bundle.
13        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
14      steps:
15      - name: Check out the repo
16        uses: actions/checkout@v3
17      - name: Bundle the Release
18        id: bundle
19        run: pgxn-bundle
20      - name: Release on PGXN
21        env:
22          # Required to release on PGXN.
23          PGXN_USERNAME: ${{ secrets.PGXN_USERNAME }}
24          PGXN_PASSWORD: ${{ secrets.PGXN_PASSWORD }}
25        run: pgxn-release
26      - name: Create GitHub Release
27        id: release
28        uses: actions/create-release@v1
29        with:
30          tag_name: ${{ github.ref }}
31          release_name: Release ${{ github.ref }}
32          body: |
33            Changes in this Release
34            - First Change
35            - Second Change
36      - name: Upload Release Asset
37        uses: actions/upload-release-asset@v1
38        with:
39          # Reference the upload URL and bundle name from previous steps.
40          upload_url: ${{ steps.release.outputs.upload_url }}
41          asset_path: ./${{ steps.bundle.outputs.bundle }}
42          asset_name: ${{ steps.bundle.outputs.bundle }}
43          asset_content_type: application/zip

Here’s how it works:

  • Lines 4-5 trigger the workflow only when a tag starting with the letter v is pushed to the repository. This follows the common convention of tagging releases with version numbers, such as v0.1.7 or v4.6.0-dev. This assumes that the tag represents the commit for the release.

  • Line 10 specifies that the job run in the pgxn/pgxn-tools container, where we have our tools for building and releasing extensions.

  • Line 13 passes the GITHUB_TOKEN variable into the container. This is the GitHub personal access token that’s automatically set for every build. It lets us call the GitHub API via actions later in the workflow.

  • Step “Bundle the Release”, on Lines 17-19, validates the extension META.json file and creates the release zip file. It does so by simply reading the distribution name and version from the META.json file and archiving the Git repo into a zip file. If your process for creating a release file is more complicated, you can do it yourself here; just be sure to include an id for the step, and emit a line of text so that later actions know what file to release. The output should be appended to the $GITHUB_OUTPUT file like this, with $filename representing the name of the release file, usually $extension-$version.zip:

    echo bundle=$filename >> $GITHUB_OUTPUT
    
  • Step “Release on PGXN”, on lines 20-25, releases the extension on PGXN. We take this step first because it’s the strictest, and therefore the most likely to fail. If it fails, we don’t end up with an orphan GitHub release to clean up once we’ve fixed things for PGXN.

  • With the success of a PGXN release, step “Create GitHub Release”, on lines 26-35, uses the GitHub create-release action to create a release corresponding to the tag. Note the inclusion of id: release, which will be referenced below. You’ll want to customize the body of the release; for the pair extension, I added a simple make target to generate a file, then pass it via the body_path config:

    - name: Generate Release Changes
      run: make latest-changes.md
    - name: Create GitHub Release
      id: release
      uses: actions/create-release@v1
      with:
        tag_name: ${{ github.ref }}
        release_name: Release ${{ github.ref }}
        body_path: latest-changes.md
    
  • Step “Upload Release Asset”, on lines 36-43, adds the release file to the GitHub release, using output of the release step to specify the URL to upload to, and the output of the bundle step to know what file to upload.

Lotta steps, but works nicely. I only wish I could require that the testing workflow finish before doing a release, but I generally tag a release once it has been thoroughly tested in previous commits, so I think it’s acceptable.

Now if you’ll excuse me, I’m off to add this workflow to my other PGXN extensions.

Test Postgres Extensions With GitHub Actions

I first heard about GitHub Actions a couple years ago, but fully embraced them only in the last few weeks. Part of the challenge has been the paucity of simple but realistic examples, and quite a lot of complicated-looking JavaScript-based actions that seem like overkill. But through trial-and-error, I figured out enough to update my Postgres extensions projects to automatically test on multiple versions of Postgres, as well as to bundle and release them on PGXN. The first draft of that effort is pgxn/pgxn-tools1, a Docker image with scripts to build and run any version of PostgreSQL between 8.4 and 12, install additional dependencies, build, test, bundle, and release an extension.

Here’s how I’ve put it to use in a GitHub workflow for semver, the Semantic Version data type:

 1  name: CI
 2  on: [push, pull_request]
 3  jobs:
 4    test:
 5      strategy:
 6        matrix:
 7          pg: [12, 11, 10, 9.6, 9.5, 9.4, 9.3, 9.2]
 8      name: 🐘 PostgreSQL ${{ matrix.pg }}
 9      runs-on: ubuntu-latest
10      container: pgxn/pgxn-tools
11      steps:
12        - run: pg-start ${{ matrix.pg }}
13        - uses: actions/checkout@v3
14        - run: pg-build-test

The important bits are in the jobs.test object. Under strategy.matrix, which defines the build matrix, the pg array defines each version to be tested. The job will run once for each version, and can be referenced via ${{ matrix.pg }} elsewhere in the job. Line 10 has the job run in a pgxn/pgxn-tools container, where the steps run. They are:

  • Line 12: Install and start the specified version of PostgreSQL
  • Line 13: Clone the semver repository
  • Line 14: Build and test the extension

The intent here is to cover the vast majority of cases for testing Postgres extensions, where a project uses a PGXS Makefile. The pg-build-test script does just that.

A few notes on the scripts included in pgxn/pgxn-tools:

  • pg-start installs, initializes, and starts the specified version of Postgres. If you need other dependencies, simply list their Debian package names after the Postgres version.

  • pgxn is a client for PGXN itself. You can use it to install other dependencies required to test your extension.

  • pg-build-test simply builds, installs, and tests a PostgreSQL extension or other code in the current directory. Effectively the equivalent of make && make install && make installcheck.

  • pgxn-bundle validates the PGXN META.json file, reads the distribution name and version, and bundles up the project into a zip file for release to PGXN.

  • pgxn-release uploads a release zip file to PGXN.

In short, use the first three utilities to handle dependencies and test your extension, and the last two to release it on PGXN. Simply set GitHub secrets with your PGXN credentials, pass them in environment variables named PGXN_USERNAME and PGXN_PASSWORD, and the script will handle the rest. Here’s how a release job might look:

15    release:
16      name: Release on PGXN
17      # Release on push to main when the test job succeeds.
18      needs: test
19      if: github.ref == 'refs/heads/main' && github.event_name == 'push' && needs.test.result == 'success'
20      runs-on: ubuntu-latest
21      container:
22        image: pgxn/pgxn-tools
23        env:
24          PGXN_USERNAME: ${{ secrets.PGXN_USERNAME }}
25          PGXN_PASSWORD: ${{ secrets.PGXN_PASSWORD }}
26      steps:
27        - name: Check out the repo
28          uses: actions/checkout@v3
29        - name: Bundle the Release
30          run: pgxn-bundle
31        - name: Release on PGXN
32          run: pgxn-release

Note that lines 18-19 require that the test job defined above pass, and ensure the job runs only on a push event to the main branch, where we push final releases. We set PGXN_USERNAME and PGXN_PASSWORD from the secrets of the same name, and then, in lines 27-32, check out the project, bundle it into a zip file, and release it on PGXN.

There are a few more features of the image, so read the docs for the details. As a first cut at PGXN CI/CD tools, I think it’s fairly robust. Still, as I gain experience and build and release more extensions in the coming year, I expect to work out integration with publishing GitHub releases, and perhaps build and publish relevant actions on the GitHub Marketplace.


  1. Not a great name, I know, will probably change as I learn more. ↩︎

pgenv

For years, I’ve managed multiple versions of PostgreSQL by regularly editing and running a simple script that builds each major version from source and installs it in /usr/local. I would shut down the current version, remove the symlink to /usr/local/pgsql, symlink the one I wanted, and start it up again.

This is a pain in the ass.

Recently I wiped my work computer (because reasons) and started reinstalling all my usual tools. PostgreSQL, I decided, no longer needs to run as the postgres user from /usr/local. What would be much nicer, when it came time to test pgTAP against all supported versions of Postgres, would be to use a tool like plenv or rbenv to do all the work for me.

So I wrote pgenv. To use it, clone it into ~/.pgenv (or wherever you want) and add its bin directories to your $PATH environment variable:

git clone https://github.com/theory/pgenv.git
echo 'export PATH="$HOME/.pgenv/bin:$HOME/.pgenv/pgsql/bin:$PATH"' >> ~/.bash_profile

Then you’re ready to go:

pgenv build 10.4

A few minutes later, it’s there:

$ pgenv versions
pgsql-10.4

Let’s use it:

$ pgenv use 10.4
The files belonging to this database system will be owned by user "david".
This user must also own the server process.
#    (initdb output elided)
waiting for server to start.... done
server started
PostgreSQL 10.4 started

Now connect:

$ psql -U postgres
psql (10.4)
Type "help" for help.

postgres=# 

Easy. Each version you install – as far back as 8.0 – has the default super user postgres for compatibility with the usual system-installed version. It also builds all contrib modules, including PL/Perl using /usr/bin/perl.

With this little app in place, I quickly built all the versions I need. Check it out:

$ pgenv versions
     pgsql-10.3
  *  pgsql-10.4
     pgsql-11beta2
     pgsql-8.0.26
     pgsql-8.1.23
     pgsql-8.2.23
     pgsql-8.3.23
     pgsql-8.4.22
     pgsql-9.0.19
     pgsql-9.1.24
     pgsql-9.2.24
     pgsql-9.3.23
     pgsql-9.4.18
     pgsql-9.5.13
     pgsql-9.6.9

Other commands include start, stop, and restart, which act on the currently active version; version, which shows the currently-active version (also indicated by the asterisk in the output of the versions command); clear, to clear the currently-active version (in case you’d rather fall back on a system-installed version, for example); and remove, which will remove a version. See the docs for details on all the commands.

How it Works

All this was written in an uncomplicated Bash script. I’ve only tested it on a couple of Macs, so YMMV, but as long as you have Bash, Curl, and /usr/bin/perl on a system, it ought to just work.

It works by building each version in its own directory: ~/.pgenv/pgsql-10.4, ~/.pgenv/pgsql-11beta2, and so on. The currently-active version is nothing more than a symlink, ~/.pgenv/pgsql, to the proper version directory. There is no other configuration. pgenv downloads and builds versions in the ~/.pgenv/src directory, and leaves the tarballs and compiled source in place, in case they’re needed for development or testing. pgenv never uses them again unless you delete a version and build it again, in which case pgenv deletes the old build directory and unpacks from the tarball again.

Works for Me!

Over the last week, I hacked on pgenv to get all of these commands working. It works very well for my needs. Still, I think it might be useful to add support for a configuration file. It might allow one to change the name of the default superuser, the location of Perl, and perhaps to change postgresql.conf settings following an initdb. I don’t know when (or if) I’ll need that stuff, though. Maybe you do, though? Pull requests welcome!

But even if you don’t, give it a whirl and let me know if you find any issues.

Indexing Nested hstore

In my first Nested hstore post yesterday, I ran a query against unindexed hstore data, which required a table scan. But hstore is able to take advantage of GIN indexes. So let’s see what that looks like. Connecting to the same database, I indexed the review column:

reviews=# CREATE INDEX idx_reviews_gin ON reviews USING GIN(review);
CREATE INDEX
Time: 360448.426 ms
reviews=# SELECT pg_size_pretty(pg_database_size(current_database()));
 pg_size_pretty 
----------------
 421 MB

Well, that takes a while, and makes the database a lot bigger (it was 277 MB unindexed). But is it worth it? Let’s find out. Oleg and Teodor’s patch adds support for a nested hstore value on the right-hand-side of the @> operator. In practice, that means we can specify the full path to a nested value as an hstore expression. In our case, to query only for Books, instead of using this expression:

WHERE review #> '{product,group}' = 'Book'

We can use an hstore value with the entire path, including the value:

WHERE review @> '{product => {group => Book}}'

Awesome, right? Let’s give it a try:

reviews=# SELECT
    width_bucket(length(review #> '{product,title}'), 1, 50, 5) title_length_bucket,
    round(avg(review #^> '{review,rating}'), 2) AS review_average,
    count(*)
FROM
    reviews
WHERE
    review @> '{product => {group => Book}}'
GROUP BY
    title_length_bucket
ORDER BY
    title_length_bucket;
 title_length_bucket | review_average | count  
---------------------+----------------+--------
                   1 |           4.42 |  56299
                   2 |           4.33 | 170774
                   3 |           4.45 | 104778
                   4 |           4.41 |  69719
                   5 |           4.36 |  47110
                   6 |           4.43 |  43070
(6 rows)

Time: 849.681 ms

That time looks better than yesterday’s, but in truth I first ran this query just before building the GIN index and got about the same result. Must be that Mavericks is finished indexing my disk or something. At any rate, the index is not buying us much here.

But hey, we’re dealing with 1998 Amazon reviews, so querying against books probably isn’t very selective. I don’t blame the planner for deciding that a table scan is cheaper than an index scan. But what if we try a more selective value, say “DVD”?

reviews=# SELECT
    width_bucket(length(review #> '{product,title}'), 1, 50, 5) title_length_bucket,
    round(avg(review #^> '{review,rating}'), 2) AS review_average,
    count(*)
FROM
    reviews
WHERE
    review @> '{product => {group => DVD}}'
GROUP BY
    title_length_bucket
ORDER BY
    title_length_bucket;
 title_length_bucket | review_average | count 
---------------------+----------------+-------
                   1 |           4.27 |  2646
                   2 |           4.44 |  4180
                   3 |           4.53 |  1996
                   4 |           4.38 |  2294
                   5 |           4.48 |   943
                   6 |           4.42 |   738
(6 rows)

Time: 73.913 ms

Wow! Under 100ms. That’s more like it! Inverted indexing FTW!

Testing Nested hstore

I’ve been helping Oleg Bartunov and Teodor Sigaev with documentation for the forthcoming nested hstore patch for PostgreSQL. It adds support for arrays, numeric and boolean types, and of course arbitrarily nested data structures. This gives it feature parity with JSON, but unlike the JSON type, its values are stored in a binary representation, which makes it much more efficient to query. The support for GiST and GIN indexes to speed up path searches doesn’t hurt, either.

As part of the documentation, we wanted to include a short tutorial, something to show off the schemaless flexibility of the new hstore. The CitusDB guys were kind enough to show off their json_fdw with some Amazon review data in a blog post a few months back; it even includes an interesting query against the data. Let’s see what we can do with it. First, load it:

> createdb reviews
> psql -d reviews -c '
    CREATE EXTENSION HSTORE;
    CREATE TABLE reviews(review hstore);
'
CREATE TABLE
> gzcat customer_reviews_nested_1998.json.gz | sed -e 's/\\/\\\\/g' \
 | sed -e "s/'/''/g" | sed -e 's/":/" =>/g' > /tmp/hstore.copy
> time psql -d reviews -c "COPY reviews FROM '/tmp/hstore.copy'"
COPY 589859
       0.00s user 0.00s system 0% cpu 13.059 total

13 seconds to load 589,859 records from a file – a little over 45k records per second. Not bad. Let’s see what the storage looks like:

> psql -d reviews -c 'SELECT pg_size_pretty(pg_database_size(current_database()));'
 pg_size_pretty 
----------------
 277 MB

The original, uncompressed data is 208 MB on disk, so roughly a third bigger given the overhead of the database. Just for fun, let’s compare it to JSON:

> createdb reviews_js
> psql -d reviews_js -c 'CREATE TABLE reviews(review json);'
CREATE TABLE
> gzcat customer_reviews_nested_1998.json.gz | sed -e 's/\\/\\\\/g' \
 | sed -e "s/'/''/g" | > /tmp/json.copy
> time psql -d reviews_js -c "COPY reviews FROM '/tmp/json.copy'"
COPY 589859
       0.00s user 0.00s system 0% cpu 7.434 total
> psql -d reviews_js -c 'SELECT pg_size_pretty(pg_database_size(current_database()));'
 pg_size_pretty 
----------------
 239 MB

Almost 80K records per second, faster, I’m guessing, because the JSON type doesn’t convert the data to a binary representation on its way in. JSON currently uses less overhead for storage, as well; I wonder if that’s the benefit of TOAST storage?

Let’s try querying these guys. I adapted the query from the CitusDB blog post and ran it on my 2013 MacBook Air (1.7 GHz Intel Core i7) with iTunes and a bunch of other apps running in the background (yeah, I’m lazy). Check out those operators, by the way! Given a path, #^> returns a numeric value:

reviews=# SELECT
    width_bucket(length(review #> '{product,title}'), 1, 50, 5) title_length_bucket,
    round(avg(review #^> '{review,rating}'), 2) AS review_average,
    count(*)
FROM
    reviews
WHERE
    review #> '{product,group}' = 'Book'
GROUP BY
    title_length_bucket
ORDER BY
    title_length_bucket;
 title_length_bucket | review_average | count  
---------------------+----------------+--------
                   1 |           4.42 |  56299
                   2 |           4.33 | 170774
                   3 |           4.45 | 104778
                   4 |           4.41 |  69719
                   5 |           4.36 |  47110
                   6 |           4.43 |  43070
(6 rows)

Time: 2301.620 ms

The benefit of the native type is pretty apparent here. I ran this query several times, and the time was always between 2.3 and 2.4 seconds. The Citus json_fdw query took “about 6 seconds on a 3.1 GHz CPU core.” Let’s see how well the JSON type does (pity there is no operator to fetch a value as numeric; we have to cast from text):

reviews_js=# SELECT
    width_bucket(length(review #>> '{product,title}'), 1, 50, 5) title_length_bucket,
    round(avg((review #>> '{review,rating}')::numeric), 2) AS review_average,
    count(*)
FROM
    reviews
WHERE
    review #>> '{product,group}' = 'Book'
GROUP BY
    title_length_bucket
ORDER BY
    title_length_bucket;
 title_length_bucket | review_average | count  
---------------------+----------------+--------
                   1 |           4.42 |  56299
                   2 |           4.33 | 170774
                   3 |           4.45 | 104778
                   4 |           4.41 |  69719
                   5 |           4.36 |  47110
                   6 |           4.43 |  43070
(6 rows)

Time: 5530.120 ms

A little faster than the json_fdw version, but comparable. It takes well over twice as long as the hstore version, though. For queries, hstore is the clear winner. Yes, you pay up-front for loading and storage, but the payoff at query time is substantial. Ideally, of course, we would have the insert and storage benefits of JSON and the query performance of hstore. There was talk last spring at PGCon of using the same representation for JSON and hstore; perhaps that can still come about.

Meanwhile, I expect to play with some other data sets over the next week; watch this spot for more!

The Power of Enums

Jim Mlodgenski on using Enums in place of references to small lookup tables:

I saw something else I didn’t expect: […] There was an 8% increase in performance. I was expecting the test with the enums to be close to the baseline, but I wasn’t expecting it to be faster. Thinking about it, it makes sense. Enums values are just numbers so we’re effectively using surrogate keys under the covers, but the users would still see the enum labels when they are looking at the data. It ended up being a no brainer to use enums for these static tables. There was an increase in performance while still maintaining the integrity of the data.

I’ve been a big fan of Enums since Andrew and Tom Dunstan released a patch for them during the PostgreSQL 8.2 era. Today they’re a core feature, and as of 9.1, you can even modify their values! You’re missing out if you’re not using them yet.
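For the record, a minimal sketch of the pattern, with a hypothetical status column standing in for a small lookup table:

CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered');

CREATE TABLE orders (
    id     SERIAL PRIMARY KEY,
    status order_status NOT NULL DEFAULT 'pending'
);

-- And, as of 9.1, you can add a label to an existing enum:
ALTER TYPE order_status ADD VALUE 'canceled';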

Understanding Window Functions

Dimitri Fontaine:

There was SQL before window functions and SQL after window functions: that’s how powerful this tool is. Being that of a deal breaker unfortunately means that it can be quite hard to grasp the feature. This article aims at making it crystal clear so that you can begin using it today and are able to reason about it and recognize cases where you want to be using window functions.

Great intro to a powerful feature.
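If you’ve never touched them, here’s the flavor of the thing, with a hypothetical employees table:

SELECT name, department, salary,
       rank() OVER (PARTITION BY department ORDER BY salary DESC) AS department_rank
FROM employees;

Each row keeps its own identity; the OVER clause just adds a calculation across the other rows in its partition.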

Agile Database Development Tutorial

I gave a tutorial at PGCon a couple weeks back, entitled “Agile Database Development with Git, Sqitch, and pgTAP.” It went well, I think. The Keynote document and an exported PDF have been posted on PGCon.org, and also uploaded here and to Speaker Deck. And embedded below, too. Want to follow along? Clone the tutorial Git repository and follow along. Here’s the teaser:

Hi, I’m David. I like to write database apps. Just as much as I like to write web apps. (Maybe more!) How? Not by relying on bolted-on, half-baked database integration tools like migrations, I’ll tell you that! Instead, I make extensive use of best-of-breed tools for source control (Git), database unit testing (pgTAP), and database change management and deployment (Sqitch). If you’d like to get as much pleasure out of database development as you do application development, join me for this tutorial. We’ll develop a sample application using the processes and tools I’ve come to depend on, and you’ll find out whether they might work for you. Either way, I promise it will at least be an amusing use of your time.


Bootstrapping Bucardo Master/Master Replication

Let’s say you have a production database up and running and you want to set up a second database with Bucardo-powered replication between them. Getting a new master up and running without downtime for an existing master, and without losing any data, is a bit fiddly and under-documented. Having just figured out one way to do it with the forthcoming Bucardo 5 code base, I wanted to blog it as much for my own reference as for yours.

First, let’s set up some environment variables to simplify things a bit. I’m assuming that the database names and usernames are the same, and only the host names are different:

export PGDATABASE=widgets
export PGHOST=here.example.com
export PGHOST2=there.example.com
export PGSUPERUSER=postgres

And here are some environment variables we’ll use for Bucardo configuration stuff:

export BUCARDOUSER=bucardo
export BUCARDOPASS=*****
export HERE=here
export THERE=there

First, let’s create the new database as a schema-only copy of the existing database:

createdb -U $PGSUPERUSER -h $PGHOST2 $PGDATABASE
pg_dump -U $PGSUPERUSER -h $PGHOST --schema-only $PGDATABASE \
 | psql -U $PGSUPERUSER -h $PGHOST2 -d $PGDATABASE

You might also have to copy over roles; use pg_dumpall --globals-only to do that.

Next, we configure Bucardo. Start by telling it about the databases:

bucardo add db $HERE$PGDATABASE dbname=$PGDATABASE host=$PGHOST user=$BUCARDOUSER pass=$BUCARDOPASS
bucardo add db $THERE$PGDATABASE dbname=$PGDATABASE host=$PGHOST2 user=$BUCARDOUSER pass=$BUCARDOPASS

Tell it about all the tables we want to replicate:

bucardo add table public.foo public.bar relgroup=myrels db=$HERE$PGDATABASE 

Create a multi-master database group for the two databases:

bucardo add dbgroup mydbs $HERE$PGDATABASE:source $THERE$PGDATABASE:source

And create the sync:

bucardo add sync mysync relgroup=myrels dbs=mydbs autokick=0

Note autokick=0. This ensures that, while deltas are logged, they will not be copied anywhere until we tell Bucardo to do so.

And now that we know that any changes from here on in will be queued for replication, we can go ahead and copy over the data. The only caveat is that we need to disable the Bucardo triggers on the target system, so that our copying does not try to queue up deltas. We do that by setting the session_replication_role GUC to “replica” while doing the copy:

pg_dump -U $PGSUPERUSER -h $PGHOST --data-only -N bucardo $PGDATABASE \
  | PGOPTIONS='-c session_replication_role=replica' \
    psql -U $PGSUPERUSER -h $PGHOST2 -d $PGDATABASE

Great, now all the data is copied over, we can have Bucardo copy any changes that have been made in the interim, as well as any going forward:

bucardo update sync mysync autokick=1
bucardo reload config

Bucardo will fire up the necessary syncs and copy over any interim deltas. And any changes you make to either system in the future will be copied to the other.


New in PostgreSQL 9.2: format()

There’s a new feature in PostgreSQL 9.2 that I don’t recall seeing blogged about elsewhere: the format() function. From the docs:

Format a string. This function is similar to the C function sprintf; but only the following conversion specifications are recognized: %s interpolates the corresponding argument as a string; %I escapes its argument as an SQL identifier; %L escapes its argument as an SQL literal; %% outputs a literal %. A conversion can reference an explicit parameter position by preceding the conversion specifier with n$, where n is the argument position.
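A quick illustration of those conversion specifiers:

SELECT format('%s | %I | %L | 100%%', 'plain', 'Mixed Case', $$it's$$);
-- Returns: plain | "Mixed Case" | 'it''s' | 100%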

If you do a lot of dynamic query building in PL/pgSQL functions, you’ll immediately see the value in format(). Consider this function:

CREATE OR REPLACE FUNCTION make_month_partition(
    base_table   TEXT,
    schema_name  TEXT,
    month        TIMESTAMP
) RETURNS VOID LANGUAGE plpgsql AS $_$
DECLARE
    partition TEXT := quote_ident(base_table || '_' || to_char(month, '"y"YYYY"m"MM'));
    month_start TIMESTAMP := date_trunc('month', month);
BEGIN
    EXECUTE '
        CREATE TABLE ' || quote_ident(schema_name) || '.' || partition || ' (CHECK (
                created_at >= ' || quote_literal(month_start) || '
            AND created_at < '  || quote_literal(month_start + '1 month'::interval) || '
        )) INHERITS (' || quote_ident(schema_name) || '.' || base_table || ')
    ';
    EXECUTE 'GRANT SELECT ON ' || quote_ident(schema_name) || '.' || partition || '  TO dude;';
END;
$_$;

Lots of concatenation and use of quote_ident() to get things just right. I don’t know about you, but I always found this sort of thing quite difficult to read. But format() allows us to eliminate most of the operators and function calls. Check it:

CREATE OR REPLACE FUNCTION make_month_partition(
    base_table   TEXT,
    schema_name  TEXT,
    month        TIMESTAMP
) RETURNS VOID LANGUAGE plpgsql AS $_$
DECLARE
    partition TEXT := base_table || '_' || to_char(month, '"y"YYYY"m"MM');
    month_start TIMESTAMP := date_trunc('month', month);
BEGIN
    EXECUTE format(
        'CREATE TABLE %I.%I (
            CHECK (created_at >= %L AND created_at < %L)
        ) INHERITS (%I.%I)',
        schema_name, partition,
        month_start, month_start + '1 month'::interval,
        schema_name, base_table
    );
    EXECUTE format('GRANT SELECT ON %I.%I TO dude', schema_name, partition);
END;
$_$;

I don’t know about you, but I find that a lot easier to read, which means it’ll be easier to maintain. So if you do much dynamic query generation inside the database, give format() a try; I think you’ll find it a winner.

Update 2012-11-16: Okay, so I somehow failed to notice that format() was actually introduced in 9.1 and covered by depesz. D’oh! Well, hopefully my little post will help to get the word out more, at least. Thanks to my commenters.


Mocking Serialization Failures

I’ve been hacking on the forthcoming Bucardo 5 code base the last couple weeks, as we’re going to start using it pretty extensively at work, and it needed a little love to get it closer to release. The biggest issue I fixed was the handling of serialization failures.

When copying deltas from one database to another, Bucardo sets the transaction isolation to “Serializable”. As of PostgreSQL 9.1, this is true serializable isolation. However, there were no tests for it in Bucardo. And since previous versions of PostgreSQL had poorer isolation (retained in 9.1 as “Repeatable Read”), I don’t think anyone really noticed it much. As I’m doing all my testing against 9.2, I was getting the serialization failures about half the time I ran the test suite. It took me a good week to chase down the issue. Once I did, I posted to the Bucardo mail list pointing out that Bucardo was not attempting to run a transaction again after failure, and at any rate, the model for how it thought to do so was a little wonky: it let the replicating process die, on the assumption that a new process would pick up where it left off. It did not.

Bucardo maintainer Greg Sabino Mullane proposed that we let the replicating process try again on its own. So I went and made it do that. And then the tests started passing every time. Yay!

Returning to the point of this post, I felt that there ought to be tests for serialization failures in the Bucardo test suite, so that we can ensure that this continues to work. My first thought was to use PL/pgSQL in 8.4 and higher to mock a serialization failure. Observe:

david=# \set VERBOSITY verbose
david=# DO $$BEGIN RAISE EXCEPTION 'Serialization error'
       USING ERRCODE = 'serialization_failure'; END $$;
ERROR:  40001: Serialization error
LOCATION:  exec_stmt_raise, pl_exec.c:2840

Cool, right? Well, the trick is to get this to run on the replication target, but only once. When Bucardo retries, we want it to succeed, thus properly demonstrating the COPY/SERIALIZATION FAIL/ROLLBACK/COPY/SUCCESS pattern. Furthermore, when it copies deltas to a target, Bucardo disables all triggers and rules. So how to get something trigger-like to run on a target table and throw the serialization error?

Studying the Bucardo source code, I discovered that Bucardo itself does not disable triggers and rules. Rather, it sets the session_replication_role GUC to “replica”. This causes PostgreSQL to disable the triggers and rules — except for those that have been set to ENABLE REPLICA. The PostgreSQL ALTER TABLE docs:

The trigger firing mechanism is also affected by the configuration variable session_replication_role. Simply enabled triggers will fire when the replication role is “origin” (the default) or “local”. Triggers configured as ENABLE REPLICA will only fire if the session is in “replica” mode, and triggers configured as ENABLE ALWAYS will fire regardless of the current replication mode.

Well how cool is that? So all I needed to do was plug in a replica trigger and have it throw an exception once but not twice. Via email, Kevin Grittner pointed out that a sequence might work, and indeed it does. Because sequence values are non-transactional, sequences return different values every time they’re accessed.
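A quick demonstration of that property, using a throwaway sequence:

CREATE SEQUENCE demo_seq;
BEGIN;
SELECT nextval('demo_seq');  -- returns 1
ROLLBACK;
SELECT nextval('demo_seq');  -- returns 2; the rollback did not reset it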

Here’s what I came up with:

CREATE SEQUENCE serial_seq;

CREATE OR REPLACE FUNCTION mock_serial_fail(
) RETURNS trigger LANGUAGE plpgsql AS $_$
BEGIN
    IF nextval('serial_seq') % 2 = 0 THEN RETURN NEW; END IF;
    RAISE EXCEPTION 'Serialization error'
            USING ERRCODE = 'serialization_failure';
END;
$_$;

CREATE TRIGGER mock_serial_fail AFTER INSERT ON bucardo_test2
    FOR EACH ROW EXECUTE PROCEDURE mock_serial_fail();
ALTER TABLE bucardo_test2 ENABLE REPLICA TRIGGER mock_serial_fail;

The first INSERT (or, in Bucardo’s case, COPY) to bucardo_test2 will die with the serialization error. The second INSERT (or COPY) succeeds. This worked great, and I was able to write tests in a few hours and get them committed. And now we can be reasonably sure that Bucardo will always properly handle serialization failures.


Always Use TIMESTAMP WITH TIME ZONE

My recommendations for sane time zone management in PostgreSQL:

  • Set timezone = 'UTC' in postgresql.conf. This makes UTC the default time zone for all connections.
  • Use timestamp with time zone (aka timestamptz) and time with time zone (aka timetz). They store values as UTC, but convert them on selection to whatever your time zone setting is.
  • Avoid timestamp without time zone (aka timestamp) and time without time zone (aka time). These columns do not know the time zone of a value, so different apps can insert values in different zones and no one would ever know.
  • Always specify a time zone when inserting into a timestamptz or timetz column. Unless the zone is UTC. But even then, append a “Z” to your value: it’s more explicit, and will keep you sane.
  • If you need to get timestamptz or timetz values in a zone other than UTC, use the AT TIME ZONE expression in your query. But be aware that the returned value will be a timestamp or time value, with no more time zone. Good for reporting and queries, bad for storage.
  • If your app always needs data in some other time zone, have it SET timezone to that zone on connection. All values then retrieved from the database will be in the configured time zone. The app should still include the time zone in values sent to the database (see the sketch after this list).
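Here’s a minimal sketch that puts those recommendations together (the events table is hypothetical):

SET timezone = 'UTC';

CREATE TABLE events (
    id         SERIAL PRIMARY KEY,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Always include the zone (or a trailing "Z" for UTC) on insert:
INSERT INTO events (created_at) VALUES ('2012-06-20 14:30:00Z');

-- Convert on the way out; note the result is a plain timestamp, with no zone attached:
SELECT created_at AT TIME ZONE 'America/New_York' FROM events;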

The one exception to the rule preferring timestamptz and timetz is a special case: partitioning. When partitioning data on timestamps, you must not use timestamptz. Why? Because almost no expression involving timestamptz comparison is immutable. Use one in a WHERE clause, and constraint exclusion may well be ignored and all partitions scanned. This is usually something you want to avoid.

So in this one case and only in this one case, use a timestamp without time zone column, but always insert data in UTC. This will keep things consistent with the timestamptz columns you have everywhere else in your database. Unless your app changes the value of the timezone GUC when it connects, it can just assume that everything is always UTC, and should always send updates as UTC.
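A sketch of that exception, using hypothetical tables and the inheritance-style partitioning of the day; the CHECK constraint on a plain timestamp column stays immutable, so constraint exclusion can do its job:

CREATE TABLE logs (
    logged_at TIMESTAMP NOT NULL  -- no time zone; values stored as UTC by convention
);

CREATE TABLE logs_2012_06 (
    CHECK (logged_at >= '2012-06-01' AND logged_at < '2012-07-01')
) INHERITS (logs);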


iovationeering

Since June, as part of my work for PGX, I’ve been doing on-site full-time consulting for iovation here in Portland. iovation is in the business of deterring online fraud via device identification and reputation. Given the nature of that business, a whole lot of data arrives every day, and I’ve been developing PostgreSQL-based solutions to help get a handle on it. The work has been truly engaging, and a whole hell of a lot of fun. And there are some really great, very smart people at iovation, whom I very much like and respect.

iovation

So much so, in fact, that I decided to accept their offer of a full time position as “Senior Data Architect.” I started on Monday.

I know, crazy, right? They’ve actually been talking me up about it for a long time. In our initial contact close to two years ago, as I sought to land them as a PGX client, they told me they wanted to hire someone, and was I interested. I said “no.” I said “no” through four months of contracting this summer and fall, until one day last month I said to myself, “wait, why don’t I want this job?” I had been on automatic, habitually insisting I wasn’t interested in a W2 position. And with good reason. Aside from 15 months as CTO at values of n (during which time I worked relatively independently anyway), I’ve been an independent consultant since I founded Kineticode in November of 2001. Yeah. Ten Years.

Don’t get me wrong, those ten years have been great! Not only have I been able to support myself doing the things I love—and learned a ton in the process—but I’ve managed to write a lot of great code. Hell, I will be continuing as an associate with PGX, though with greatly reduced responsibilities. And someday I may go indy again. But in the meantime, the challenges, opportunities, and culture at iovation are just too good to pass up. I’m loving the work I’m doing there, and expect to learn a lot over the next few years.

Kineticode

So what, you might ask, does this mean for Kineticode, the company I founded to offer support, consulting, and training services for Bricolage CMS? The truth is that Kineticode has only a few technical support customers left; virtually all of my work for the last two years has been through PGX. So I’ve decided to shut Kineticode down. I’m shifting the Bricolage tech support offerings over to PGX and having Kineticode’s customers move there as their contacts come up for renewal. They can expect the same great service as always. Better even, as there are 10 associates in PGX, and, lately, only me at Kineticode. Since Kineticode itself is losing its Raison d’être, it’s going away.

PGX

I intend to remain involved in the various open-source projects I work on. I still function as the benevolent dictator of Bricolage CMS, though other folks have stepped up their involvement quite a lot in the last few years. And I expect to keep improving PGXN and DesignScene as time allows (I’ve actually been putting some effort into both in the last few weeks; watch for PGXN and Lunar/Theory announcements in the coming weeks and months!). And, in fact, I expect that a fair amount of the work I do at iovation will lead to blog posts, conference presentations, and more open-source code.

This is going to be a blast. Interested in a front-row seat? Follow me on Twitter.


DBIx::Connector and Serializable Snapshot Isolation

I was at Postgres Open week before last. This was a great conference, very welcoming atmosphere and lots of great talks. One of the more significant, for me, was the session on serializable transactions by Kevin Grittner, who developed SSI for PostgreSQL 9.1. I hadn’t paid much attention to this feature before now, but it became clear to me, during the talk, that it’s time.

So what is SSI? Well, serializable transactions are almost certainly how you think of transactions already. Here’s how Kevin describes them:

True serializable transactions can simplify software development. Because any transaction which will do the right thing if it is the only transaction running will also do the right thing in any mix of serializable transactions, the programmer need not understand and guard against all possible conflicts. If this feature is used consistently, there is no need to ever take an explicit lock or SELECT FOR UPDATE/SHARE.

This is, in fact, generally how I’ve thought about transactions. But I’ve certainly run into cases where it wasn’t true. Back in 2006, I wrote an article on managing many-to-many relationships with PL/pgSQL which demonstrated a race condition one might commonly find when using an ORM. The solution I offered was to always use a PL/pgSQL function that does the work, and that function executes a SELECT...FOR UPDATE statement to overcome the race condition. This creates a lock that forces conflicting transactions to be performed serially.

Naturally, this is something one would rather not have to think about. Hence SSI. When you identify a transaction as serializable, it will be executed in a truly serializable fashion. So I could actually do away with the SELECT...FOR UPDATE workaround — not to mention any other race conditions I might have missed — simply by telling PostgreSQL to enforce transaction isolation. This essentially eliminates the possibility of unexpected side-effects.
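In SQL terms, that’s just a matter of asking for the serializable isolation level and being prepared to retry; the accounts table here is hypothetical:

BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;  -- may fail with a serialization error (SQLSTATE 40001); retry the whole transaction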

This comes at a cost, however. Not in terms of performance so much, since the SSI implementation uses some fancy, recently-developed algorithms to keep things efficient. (Kevin tells me via IRC: “Usually the rollback and retry work is the bulk of the additional cost in an SSI load, in my testing so far. A synthetic load to really stress the LW locking, with a fully-cached database doing short read-only transactions will have no serialization failures, but can run up some CPU time in LW lock contention.”) No, the cost is actually in increased chance of transaction rollback. Because SSI will catch more transaction conflicts than the traditional “read committed” isolation level, frameworks that expect to work with SSI need to be prepared to handle more transaction failures. From the fine manual:

The Serializable isolation level provides the strictest transaction isolation. This level emulates serial transaction execution, as if transactions had been executed one after another, serially, rather than concurrently. However, like the Repeatable Read level, applications using this level must be prepared to retry transactions due to serialization failures.

And that brings me to DBIx::Connector, my Perl module for safe connection and transaction management. It currently has no such retry smarts built into it. The feature closest to that is the “fixup” connection mode, wherein if execution of a code block fails due to a connection failure, DBIx::Connector will re-connect to the database and execute the code reference again.

I think I should extend DBIx::Connector to take isolation failures and deadlocks into account. That is, fixup mode would retry a code block not only on connection failure but also on serialization failure (SQLSTATE 40001) and deadlocks (SQLSTATE 40P01). I would also add a new attribute, retries, to specify the number of times to retry such execution, with a default of three (which likely will cover the vast majority of cases). This has actually been an oft-requested feature, and I’m glad to have a new reason to add it.

There are a few design issues to overcome, however:

  • Fixup mode is supported not just by txn(), which scopes the execution of a code reference to a single transaction, but also run(), which does no transaction handling. Should the new retry support be added there, too? I could see it either way (a single SQL statement executed in run() is implicitly transaction-scoped).
  • Fixup mode is also supported by svp(), which scopes the execution of a code reference to a savepoint (a.k.a. a subtransaction). Should the rollback and retry be supported there, too, or would the whole transaction have to be retried? I’m thinking the latter, since that’s currently the behavior for connection failures.
  • Given these issues, will it make more sense to perhaps create a new mode? Maybe it would be supported only by txn().

This is do-able, will likely just take some experimentation to figure it out and settle on the appropriate API. I’ll need to find the tuits for that soon.

In the meantime, given currently in-progress changes, I’ve just released a new version of DBIx::Connector with a single change: All uses of the deprecated catch syntax now throw warnings. The previous version threw warnings only the first time the syntax was used in a particular context, to keep error logs from getting clogged up. Hopefully most folks have changed their code in the two months since the previous release and switched to Try::Tiny or some other model for exception handling. The catch syntax will be completely removed in the next release of DBIx::Connector, likely around the end of the year. Hopefully the new SSI-aware retry functionality will have been integrated by then, too.

In a future post I’ll likely chew over whether or not to add an API to set the transaction isolation level within a call to txn() and friends.
