Just a Theory

By David E. Wheeler

Posts about Testing

Introducing MyTAP

I gave my OSCON tutorial (slides) last week. It went okay. I spent way too much time helping to get everyone set up with pgTAP, and then didn’t have time to have the attendees do the exercises, and I had to rush through 2.5 hours of material in 1.5 hours. Yikes! At least the video will be better when it’s released (more when that happens).

But as often happens, I was asked whether something like pgTAP exists for MySQL. But this time I was asked by MySQL Community Manager Giuseppe Maxia, who also said that he’d tried to create a test framework himself (a fellow Perl hacker!), but that it wasn’t as nice as pgTAP. Well, since I was at OSCON and tend to like to hack on side projects while at conferences, and since I hoped that Giuseppe would happily take it over once I’d implemented the core, I started hacking on it myself. And today, I’m pleased to announce the release of MyTAP 0.01 (downloads).

Once you’ve downloaded it, install it against your MySQL server like so:

mysql -u root < mytap.sql

Here’s a very simple example script:

-- Start a transaction.
BEGIN;

-- Plan the tests.
SELECT tap.plan(1);

-- Run the tests.
SELECT tap.pass( 'My test passed, w00t!' );

-- Finish the tests and clean up.
CALL tap.finish();
ROLLBACK;

You can run this test from a .sql file using the mysql client like so:

mysql -u root --disable-pager --batch --raw --skip-column-names --unbuffered --database try --execute 'source test.sql'

But that’s a PITA and can only run one test at a time. Instead, put all of your tests into a directory, perhaps named tests, each with the suffix “.my”, and use my_prove (install TAP::Parser::SourceHandler::MyTAP from CPAN to get it) instead:

my_prove -u root --database try tests/

For MyTAP’s own tests, the output looks like this:

tests/eq.my ........ ok
tests/hastap.my .... ok
tests/matching.my .. ok
tests/moretap.my ... ok
tests/todotap.my ... ok
tests/utils.my ..... ok
All tests successful.
Files=6, Tests=137,  1 wallclock secs
(0.06 usr  0.03 sys +  0.01 cusr  0.02 csys =  0.12 CPU)
Result: PASS

Nice, eh? Of course there are quite a few more assertion functions. See the complete documentation for details.

Now, I did my best to keep the interface the same as pgTAP, but there are a few differences:

  • MySQL temporary tables are teh suck, so I had to use permanent tables to track test state. To make this more feasible, MyTAP is always installed in its own database (named “tap” by default), and you must always schema-qualify your use of the MyTAP functions.
  • Another side-effect of permanent tables is that MyTAP must keep track of test outcomes without colliding with the state from tests running in multiple concurrent connections. So MyTAP uses connection_id() to keep track of state for a single test run. It also deletes the state when tests finish(), but if there’s a crash before then, data can be left in those tables. If the connection ID is ever re-used, this can lead to conflicts. This seems mostly avoidable by using InnoDB tables and transactions in the tests.
  • The word “is” is strictly reserved by MySQL, so the function that corresponds to pgTAP’s is() is eq() in MyTAP. Similarly, isnt() is called not_eq() in MyTAP.
  • There is no way to throw an exception in MySQL functions and procedures, so the code cheats by instead performing an illegal operation: selecting from a non-existent column, where the name of that column is the error message. Hinky, but should get the point across.

Other than these issues, things went fairly smoothly. I finished up the 0.01 version last night and released it today with most of the core functionality in place. And now I want to find others to take over, as I am not a MySQL hacker myself and thus unlikely ever to use it. If you’re interested, my recommendations for things to do next are:

So fork on GitHub or contact me if you’d like to be added as a collaborator (I’m looking at you, Giuseppe!).

Hope you find it useful.

Looking for the comments? Try the old layout.

JPUG Talk Posted

No Perl content today, I’m afraid. I’m just back from my trip to Japan and wanted to post this very nice video of my talk [Update 2018: Sadly gone now]. Unlike the versions from other conferences, this one focuses primarily on the slides, with me appearing in audio only. This makes it really easy to follow. Enjoy.


Test Everything with TAP Source Handlers

I’ve just arrived in Japan with my family. We’re going to be spending several days in Tokyo, during which time I’ll be at the JPUG 10th Anniversary PostgreSQL Conference for a couple of days (giving the usual talk), but mainly I’ll be on vacation. We’ll be visiting Kyoto, too. We’re really excited about this trip; it’ll be a great experience for Anna. I’ll be back in the saddle in December, so for those of you anxiously awaiting the next installment of my Catalyst tutorial, I’m afraid you’ll have to wait a bit longer.

In the meantime, I wanted to write about a little something that’s been cooking for a while. Over the last several months, Steve Purkis has been working on a new feature for TAP::Parser: source handlers. The idea is to make it easier for developers to add support for TAP emitters other than Perl. The existing implementation did a decent job of handling Perl test scripts, of course, and executable files (useful for compiled tests in C using libtap, for example), but anything else was difficult.

As the author of pgTAP, I was of course greatly interested in this work, because I had to bend over backwards to get pg_prove to work nicely. It’s even uglier to get a Module::Build-based distribution to run pgTAP and Perl tests all at once during ./Build test: You had to subclass Module::Build to do it.

Steve wanted to solve this problem, and he did. Then he was kind enough to listen to my bitching and moaning and rewrite his fix so that it was simpler for third parties (read: me) to add new source handlers. What’s a source handler, you ask? Check out the latest dev release of Test::Harness and you’ll see it: TAP::Parser::SourceHandler. As soon as Steve committed it, I jumped in and implemented a new handler for pgTAP. The cool thing is that it took me only three hours to do, including tests. And here’s how you use it in a Build.PL, so that you can have pgTAP tests named *.pg run at the same time as your *.t Perl tests:

Module::Build->new(
    module_name        => 'MyApp',
    test_file_exts     => [qw(.t .pg)],
    use_tap_harness    => 1,
    tap_harness_args   => {
        sources => {
            Perl  => undef,
            pgTAP => {
                dbname   => 'try',
                username => 'postgres',
                suffix   => '.pg',
            },
        }
    },
    build_requires     => {
        'Module::Build'                      => '0.30',
        'TAP::Parser::SourceHandler::pgTAP' => '3.19',
    },
)->create_build_script;

To summarize, you just have to:

  • Tell Module::Build the extensions of your test scripts (that’s qw(.t .pg) here)
  • Specify the Perl source with its defaults (that’s what the undef does)
  • Specify the pgTAP options (database name, username, suffix, and lots of other potential settings)

And that’s it. You’re done! Run your tests with the usual incantation:

perl Build.PL
./Build test

You can use pgTAP and its options with prove, too, via the --source and --pgtap-option options:

prove --source pgTAP --pgtap-option dbname=try \
                     --pgtap-option username=postgres \
                     --pgtap-option suffix=.pg \
                     t/sometest.pg

It’s great that it’s now so much easier to support pgTAP tests, but what if you want to have Ruby tests? Or PHP? Well, it’s a simple process to write your own source handler. Here’s how:

  • Subclass TAP::Parser::SourceHandler. The final part of the package name is the name of the source. Thus if you wrote TAP::Parser::SourceHandler::Ruby, the name of your source would be “Ruby”.

  • Load the necessary modules and register your source handler. For a Ruby source handler, it might look like this:

    package TAP::Parser::SourceHandler::Ruby;
    use strict;
    use warnings;
    
    # Inherit from the base class (the subclassing described in the first step).
    use base 'TAP::Parser::SourceHandler';
    use TAP::Parser::IteratorFactory   ();
    use TAP::Parser::Iterator::Process ();
    TAP::Parser::IteratorFactory->register_handler(__PACKAGE__);

  • Implement the can_handle() method. The task of this method is to return a score between 0 and 1 for how likely it is that your source handler can handle a given source. A bunch of information is passed in a hash to the method, so you can check it all out. For example, if you wanted to run Ruby tests ending in .rb, you might write something like this:

    sub can_handle {
        my ( $class, $source ) = @_;
        my $meta = $source->meta;
    
        # If it's not a file (test script), we're not interested.
        return 0 unless $meta->{is_file};
    
        # Get the file suffix, if any.
        my $suf = $meta->{file}{lc_ext};
    
        # If the config specifies a suffix, it's required.
        if ( my $config = $source->config_for('Ruby') ) {
            if ( defined $config->{suffix} ) {
                # Return 1 for a perfect score.
                return $suf eq $config->{suffix} ? 1 : 0;
            }
        }
    
        # Otherwise, return a score for our supported suffix.
        return $suf eq '.rb' ? 0.8 : 0;
    }

    The last line is the most important: it returns 0.8 if the suffix is .rb, saying that it’s likely that this handler can handle the test. But the middle bit is interesting, too. The $source->config_for('Ruby') call is seeing if the user specified a suffix, either via the command-line or in the options. So in a Build.PL, that might be:

        tap_harness_args => {
            sources => {
                Perl => undef,
                Ruby => { suffix => '.rub' },
            }
        },

    Meaning that the user wanted to run tests ending in .rub as Ruby tests. It can also be done on the command-line with prove:

    prove --source Ruby --ruby-option suffix=.rub
    

    Cool, eh? We have a reasonable default for Ruby tests, .rb, but the user can override however she likes.

  • And finally, implement the make_iterator() method. The job of this method is simply to create a TAP::Parser::Iterator object to actually run the test. It might look something like this:

    sub make_iterator {
        my ( $class, $source ) = @_;
        my $config = $source->config_for('Ruby');
    
        my $fn = ref $source->raw ? ${ $source->raw } : $source->raw;
        $class->_croak(
            'No such file or directory: ' . ( defined $fn ? $fn : '' )
        ) unless $fn && -e $fn;
    
        return TAP::Parser::Iterator::Process->new({
            command => [$config->{ruby} || 'ruby', $fn ],
            merge   => $source->merge
        });
    }

    Simple, right? Just make sure we have a valid file to execute, then instantiate and return a TAP::Parser::Iterator::Process object to actually run the test.
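A quick aside on those can_handle() scores: as I understand the TAP::Parser::IteratorFactory documentation, every registered handler gets to score a source, and the most confident one is used. Here is a toy sketch of that selection in plain Perl. The %handlers table, its scores, and pick_handler() are all made up for illustration; this is not the harness's actual code.

```perl
use strict;
use warnings;

# Hypothetical handlers for illustration only: each name maps to a
# can_handle-style callback that scores a file name between 0 and 1.
my %handlers = (
    Perl  => sub { $_[0] =~ /\.t\z/  ? 0.8 : 0 },
    Ruby  => sub { $_[0] =~ /\.rb\z/ ? 0.8 : 0 },
    pgTAP => sub { $_[0] =~ /\.pg\z/ ? 0.9 : 0 },
);

# Return the name of the highest-scoring handler, or undef if all score 0.
sub pick_handler {
    my ($file) = @_;
    my ( $best, $best_score ) = ( undef, 0 );
    for my $name ( sort keys %handlers ) {
        my $score = $handlers{$name}->($file);
        ( $best, $best_score ) = ( $name, $score ) if $score > $best_score;
    }
    return $best;
}

for my $file (qw(t/basic.t try.rb schema.pg README)) {
    my $handler = pick_handler($file);
    printf "%-10s => %s\n", $file, defined $handler ? $handler : '(no handler)';
}
```

A score of 0 from every handler means nobody claims the file, which is why the Ruby example returns 0 for anything that isn't a file or doesn't carry the expected suffix.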

That’s it. Just two methods and you’re ready to go. I’ve even added support for a suffix option and a ruby option (so that you can point to the ruby executable in case it’s not in your path). Using it is easy. I wrote a quick TAP-emitting Ruby script like so:

puts 'ok 1 - This is a test'
puts 'ok 2 - This is another test'
puts 'not ok 3 - This is a failed test'

And to run this test (assuming that TAP::Parser::SourceHandler::Ruby has been installed somewhere where Perl can find it), it’s just:

% prove --source Ruby ~/try.rb --verbose
/Users/david/try.rb .. 
ok 1 - This is a test
ok 2 - This is another test
not ok 3 - This is a failed test
Failed 1/3 subtests 

Test Summary Report
-------------------
/Users/david/try.rb (Wstat: 0 Tests: 3 Failed: 1)
  Failed test:  3
  Parse errors: No plan found in TAP output
Files=1, Tests=3,  0 wallclock secs ( 0.02 usr +  0.01 sys =  0.03 CPU)
Result: FAIL
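The “No plan found in TAP output” parse error shows up because my quick script never prints a plan line. TAP expects a line of the form 1..N, either before the first test or after the last one. In the Ruby script that would be one more puts. As a minimal sketch of the same idea in Perl (tap_lines() is a hypothetical helper, not part of any module), this builds the same three results with the plan up front:

```perl
use strict;
use warnings;

# Build the TAP lines for a list of [ passed, description ] results,
# starting with the plan line "1..N".
sub tap_lines {
    my @results = @_;
    my @lines   = ( '1..' . scalar(@results) );
    my $n       = 0;
    for my $result (@results) {
        my ( $passed, $description ) = @{$result};
        $n++;
        push @lines, ( $passed ? 'ok' : 'not ok' ) . " $n - $description";
    }
    return @lines;
}

print "$_\n" for tap_lines(
    [ 1, 'This is a test' ],
    [ 1, 'This is another test' ],
    [ 0, 'This is a failed test' ],
);
```

The run would still fail, since the third test is not ok, but the harness would no longer have a parse error to complain about.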

It’s so easy to create new source handlers now, especially if all you have to do is support a new dynamic language. I’ve put the simple Ruby example over here; feel free to take it and run with it!


Testing Catalyst Template::Declare Views

Now that we have our default Catalyst tests passing, let’s have a look at testing the views we’ve created. You can follow along via the Part 6 tag in the GitHub repository. Start by looking at the default test script for our HTML view, t/view_HTML.t. It should look something like this:

use strict;
use warnings;
use Test::More tests => 3;
# use Test::XPath;

BEGIN {
    use_ok 'MyApp::View::HTML' or die;
    use_ok 'MyApp' or die;
}

ok my $view = MyApp->view('HTML'), 'Get HTML view object';

# ok my $output = $view->render(undef, 'hello', { user => 'Theory' }),
#     'Render the "hello" template';

# Test output using Test::XPath or similar.
# my $tx = Test::XPath->new( xml => $output, is_html => 1);
# $tx->ok('/html', 'Should have root html element');
# $tx->is('/html/head/title', 'Hello, Theory', 'Title should be correct');

Yeah, this looks a bit different than the view test created for Template Toolkit or Mason views. That’s because Catalyst::View::TD ships with its own test script template. One advantage is that it shows off testing the view without having to instantiate the entire app or send mock HTTP requests. These are unit tests, after all: we want to make sure that the view templates do what we want, not test an entire request process. The latter is more appropriate for integration tests, which I’ll cover later.

So let’s have a look at this test script. The first commented-out statement is:

# ok my $output = $view->render(undef, 'hello', { user => 'Theory' }),
#     'Render the "hello" template';

What this is showing us is that one can use the view’s render() method to execute a view without a context object, thus saving the expense of initializing the application. And if you have templates that don’t rely on it, I highly recommend this approach for keeping your tests fast. Even if the use of the context object is fairly minimal, you can use Test::MockObject to mock up a context object like so:

use Test::MockObject;
my $c = Test::MockObject->new;
$c->mock(uri_for => sub { $_[1] });
$c->mock(config  => sub { { name => 'MyApp' } });
$c->mock(debug   => sub { });
$c->mock(log     => sub { });

ok my $output = $view->render($c, 'hello', { user => 'Theory' }),
    'Render the "hello" template';

Then you can use the mock() method to mock more methods as your template uses them.

Alas, our app has already passed the point where that seems worthwhile. So far we have just one template, books/list, and it requires that there also be a database statement handle available. Sure we could create a database connection and prepare a statement handle. But that would start to require a fair bit more code to set up. So let’s just instantiate the application object and be done with it. Change the test plan to 5:

use Test::More tests => 5;

Then change the test body after the BEGIN block to:

# Instantiate the context object and the view.
ok my $c = MyApp->new, 'Create context object';
ok my $view = $c->view('HTML'), 'Get HTML view object';

# Create a statement handle for books/list.
my $sth = $c->conn->run(sub { $_->prepare(q{
    SELECT isbn, title, rating, authors FROM books_with_authors
}) });
$sth->execute;

# Render books/list.
ok my $output = $view->render($c, 'books/list', {
    title => 'Book List',
    books => $sth,
}), 'Render the "books/list" template';

This allows us to get a full test of the view.

% prove --lib --verbose t/view_HTML.t
t/view_HTML.t .. 
1..5
ok 1 - use MyApp::View::HTML;
ok 2 - use MyApp;
ok 3 - Create context object
ok 4 - Get HTML view object
Explicit blessing to '' (assuming package main) at /usr/local/lib/perl5/site_perl/5.10.1/Catalyst.pm line 1281.
Explicit blessing to '' (assuming package main) at /usr/local/lib/perl5/site_perl/5.10.1/Catalyst.pm line 1281.
Explicit blessing to '' (assuming package main) at /usr/local/lib/perl5/site_perl/5.10.1/Catalyst.pm line 1281.
Explicit blessing to '' (assuming package main) at /usr/local/lib/perl5/site_perl/5.10.1/Catalyst.pm line 1281.
ok 5 - Render the "books/list" template
ok
All tests successful.
Files=1, Tests=5,  1 wallclock secs ( 0.02 usr  0.00 sys +  0.69 cusr  0.06 csys =  0.77 CPU)
Result: PASS

Hrm. Those warnings are rather annoying. Looking at Catalyst.pm I see that they come from the uri_for() method. I expect that they somehow result from a lack of state in the context object. That’s not really important for our unit tests, so let’s just mock that one method to do something reasonable. Add this code after instantiating the context object but before rendering the view:

use Test::MockObject::Extends;
my $mocker = Test::MockObject::Extends->new($c);
$mocker->mock( uri_for => sub { $_[1]} );

And now we get:

% prove --lib --verbose t/view_HTML.t
t/view_HTML.t .. 
1..5
ok 1 - use MyApp::View::HTML;
ok 2 - use MyApp;
ok 3 - Create context object
ok 4 - Get HTML view object
ok 5 - Render the "books/list" template
ok
All tests successful.
Files=1, Tests=5,  1 wallclock secs ( 0.02 usr  0.01 sys +  0.77 cusr  0.07 csys =  0.87 CPU)
Result: PASS

Ah, much better! And thanks to our mock, we also have a much better idea of what will be returned from uri_for(), which will be important for later tests.
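Why does the mock make uri_for() predictable? The callback just hands back its first argument, so whatever path the template passes in is what ends up in the output. Here is a small sketch of that behavior in core Perl only. It swaps in a localized glob assignment instead of Test::MockObject::Extends, and Toy::Context, its base URL, echo_path(), and mocked_uri_for() are all hypothetical names invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the application context; not Catalyst itself.
package Toy::Context;
sub new     { return bless {}, shift }
sub uri_for { my ( $self, $path ) = @_; return "http://localhost:3000$path" }

package main;

# Same callback as the mock in the test: return the path argument as-is.
sub echo_path { return $_[1] }

# Call uri_for() with the method temporarily replaced by echo_path().
sub mocked_uri_for {
    my ($path) = @_;
    my $c = Toy::Context->new;
    no warnings 'redefine';
    local *Toy::Context::uri_for = \&echo_path;
    return $c->uri_for($path);
}

print mocked_uri_for('/static/css/main.css'), "\n";
print Toy::Context->new->uri_for('/static/css/main.css'), "\n";
```

The first call goes through the replacement and returns the bare path; by the second call the original method is back, because local restores it when mocked_uri_for() returns.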

Now that we have things properly mocked up and the objects created such that we can actually get the template to render, it’s time to test the output from the template. For HTML and XML format, I like the Test::XPath module. In fact, it’s for this very use that I wrote Test::XPath. It’s great because it allows me to effectively test the correctness of the template output. Here’s the basic outline:

# Test output using Test::XPath.
my $tx = Test::XPath->new( xml => $output, is_html => 1);
test_basics($tx, 'Book List');

# Call this function for every request to make sure that they all
# have the same basic structure.
sub test_basics {
    my ($tx, $title) = @_;

    # Some basic sanity-checking.
    $tx->is( 'count(/html)',      1, 'Should have 1 html element' );
    $tx->is( 'count(/html/head)', 1, 'Should have 1 head element' );
    $tx->is( 'count(/html/body)', 1, 'Should have 1 body element' );

    # Check the head element.
    $tx->is(
        '/html/head/title',
        $title,
        'Title should be correct'
    );
    $tx->is(
        '/html/head/link[@type="text/css"][@rel="stylesheet"]/@href',
        '/static/css/main.css',
        'Should load the CSS',
    );
}

I’ve set up the test_basics() function to test the things that should be mostly the same for every request. This will mainly cover the output of the wrapper, and includes things like making sure that there is just one <html> tag, one <head> tag, and one <body> tag; and that the title and CSS-related elements are output properly. Running this (with the test plan set to no_plan as I develop), I get:

% prove --lib t/view_HTML.t
t/view_HTML.t .. 2/? 
#   Failed test 'Should load the CSS'
#   at t/view_HTML.t line 52.
#          got: ''
#     expected: '/static/css/main.css'
# Looks like you failed 1 test of 10.
t/view_HTML.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/10 subtests 

Test Summary Report
-------------------
t/view_HTML.t (Wstat: 256 Tests: 10 Failed: 1)
  Failed test:  10
  Non-zero exit status: 1
Files=1, Tests=10,  1 wallclock secs ( 0.02 usr  0.01 sys +  0.79 cusr  0.08 csys =  0.90 CPU)
Result: FAIL

Hrm. Let’s stick a diag $output in there and see what we get. Now the output includes this bit:

# <html>
#  <head>
#   <title>Book List</title>
#   <link rel="stylesheet" href="/static/css/main.css" />
#  </head>

Ah! The <link> element for the stylesheet is missing the type attribute. So let’s add it. Edit lib/MyApp/Templates/HTML.pm and change the proper bit of the wrapper template to:

link {
    rel is 'stylesheet';
    type is 'text/css';
    href is $c->uri_for('/static/css/main.css' );
};

Note the addition of the type attribute. Now when we run the tests (removing the diag), we get:

% prove --lib t/view_HTML.t
t/view_HTML.t .. ok    
All tests successful.
Files=1, Tests=10,  1 wallclock secs ( 0.02 usr  0.00 sys +  0.78 cusr  0.07 csys =  0.87 CPU)
Result: PASS

Ah, much better! A lot more testing should go in there to make sure that the wrapper is doing things right. I’ve committed such testing, so check it out.

Now we need to test the output specific to the books/list template. Below the call to test_basics(), add this code:

$tx->ok('/html/body/div[@id="bodyblock"]/div[@id="content"]/table', sub {
    $_->is('count(./tr)', 6, 'Should have six rows' );
    $_->ok('./tr[1]', sub {
        $_->is('count(./th)', 3, 'Should have three table headers');
        $_->is('./th[1]', 'Title', '... first is "Title"');
        $_->is('./th[2]', 'Rating', '... second is "Rating"');
        $_->is('./th[3]', 'Authors', '... third is "Authors"');
    }, 'Should have first table row')
}, 'Should have a table');

Notice the nested block there? Test::XPath supports passing blocks to its ok() method, so that you can naturally scope your tests to blocks of XML and HTML. Neat, huh? If you don’t like the use of $_, the test object is also passed as the sole argument to such blocks.

Anyway, these tests make sure that the table is where it should be, has the proper number of rows, and that the first row has three headers with their proper values. The test outputs:

% prove --lib t/view_HTML.t
t/view_HTML.t .. 1/? 
#   Failed test '... third is "Authors"'
#   at t/view_HTML.t line 42.
#          got: 'Author'
#     expected: 'Authors'
# Looks like you failed 1 test of 28.
t/view_HTML.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/28 subtests 

Test Summary Report
-------------------
t/view_HTML.t (Wstat: 256 Tests: 28 Failed: 1)
  Failed test:  28
  Non-zero exit status: 1
Files=1, Tests=28,  1 wallclock secs ( 0.03 usr  0.01 sys +  0.79 cusr  0.08 csys =  0.91 CPU)
Result: FAIL

Whoops! Looks like I forgot to change the header when I changed the template to output a list of authors last week. So edit lib/MyApp/Templates/HTML/Books.pm and change the template to output “Authors” instead of “Author”:

row {
    th { 'Title'   };
    th { 'Rating'  };
    th { 'Authors' };
};

And now all tests pass again:

% prove --lib t/view_HTML.t
t/view_HTML.t .. ok    
All tests successful.
Files=1, Tests=28,  1 wallclock secs ( 0.02 usr  0.01 sys +  0.78 cusr  0.09 csys =  0.90 CPU)
Result: PASS

Great. So let’s finish testing the rest of the output. Ah, but wait! We have no ORDER BY clause on the query, so the order in which the books will be output is undefined. So let’s add an ORDER BY clause. Change the creation of the statement handle in the test file to:

my $sth = $c->conn->run(sub { $_->prepare(q{
    SELECT isbn, title, rating, authors
        FROM books_with_authors
        ORDER BY title
}) });

And now you can start to see why I use the q{} operator for SQL queries. You should also note that the inputs for the view test are now different than those from the controller, which still has no ORDER BY clause. It’s likely that we’ll want to go back and change that later, but I bring it up here to highlight the difference from integration tests – and to emphasize that we’ll need to write those integration tests at some point!

But back to the view unit tests. We can now test the contents of the table by adding code after the test for ./tr[1]. Here’s what the test for the next row looks like:

$_->ok('./tr[2]', sub {
    $_->is('count(./td)', 3, 'Should have three cells');
    $_->is(
        './td[1]',
        'CCSP SNRS Exam Certification Guide',
        '... first is "CCSP SNRS Exam Certification Guide"'
    );
    $_->is('./td[2]', 5, '... second is "5"');
    $_->is(
        './td[3]',
        'Bastien, Nasseh, Degu',
        '... third is "Bastien, Nasseh, Degu"',
    );
}, 'Should have second table row');

The other rows can be similarly tested; have a look at the commit to see all the new tests.

This reminds me, however, that we never created an order for the list of authors. So it’s possible that this test could fail, as the order of the author last names is undefined. We should go back and fix that, probably by listing the authors as they are actually listed on the cover of the book. But in the meantime, our test of this view is done.

Next up, I think I’ll hit controller tests. So come on back!


Testing the Tutorial App

Yet another entry in my ongoing attempt to rewrite the Catalyst tutorial in my own coding style.

So far, I’ve been following the original tutorial pretty closely. But now I want to skip ahead a bit to chapter 8: testing. I skip because, really, we should be writing tests from the very beginning. They shouldn’t be an afterthought stuck in the penultimate chapter of a tutorial. So let’s write some tests. You can follow along in the Part 5 tag in the GitHub repository.

Oops, A Missing Dependency

Oh, wait! I forgot to tell the build system that we now depend on Catalyst::View::TD and DBIx::Connector. So add these two lines to Makefile.PL:

requires 'Catalyst::View::TD' => '0.11';
requires 'DBIx::Connector' => '0.30';

Okay, now we can write some tests.

STFU

Well, no, actually, let’s start by running the tests we have:

perl Makefile.PL
make test

You should see some output after this — lots of stuff, actually — ending something like this:

[debug] Loaded Path actions:
.-------------------------------------+--------------------------------------.
| Path                                | Private                              |
+-------------------------------------+--------------------------------------+
| /                                   | /index                               |
| /                                   | /default                             |
| /books                              | /books/index                         |
| /books/list                         | /books/list                          |
'-------------------------------------+--------------------------------------'

[info] MyApp powered by Catalyst 5.80013
t/view_HTML.t ......... ok   
All tests successful.
Files=5, Tests=8,  3 wallclock secs ( 0.04 usr  0.02 sys +  2.19 cusr  0.25 csys =  2.50 CPU)
Result: PASS

I don’t know about you, but having all that debugging crap just drives me nuts while I’m running tests. It’s helpful while doing development, but mainly just gets in the way of the tests. So let’s get rid of it. Open up lib/MyApp.pm and change the use Catalyst statement to:

use Catalyst (qw(
    ConfigLoader
    Static::Simple
    StackTrace
), $ENV{HARNESS_ACTIVE} ? () : '-Debug');
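The trick in that last line is that a ternary evaluated in list context can return an empty list, which simply vanishes from the surrounding list. Test::Harness sets HARNESS_ACTIVE while it runs tests, so '-Debug' drops out under the harness. A minimal sketch in plain Perl, where import_list() is a hypothetical helper that takes the flag as an argument rather than reading %ENV:

```perl
use strict;
use warnings;

# Build an import list the same way: three plugins always, plus '-Debug'
# only when the harness flag is false.
sub import_list {
    my ($harness_active) = @_;
    return (
        qw(ConfigLoader Static::Simple StackTrace),
        $harness_active ? () : '-Debug',
    );
}

print join( ' ', import_list(1) ), "\n";    # under a test harness
print join( ' ', import_list(0) ), "\n";    # normal development run
```

Using () rather than undef or an empty string matters here: either of those would still be an element of the list, and Catalyst would be handed a bogus plugin name.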

Essentially, we’re turning on the debugging output only if the test harness is not active. Now when we run the tests, we get:

t/01app.t ............. ok   
t/02pod.t ............. skipped: set TEST_POD to enable this test
t/03podcoverage.t ..... skipped: set TEST_POD to enable this test
t/controller_Books.t .. ok   
t/view_HTML.t ......... ok   
All tests successful.
Files=5, Tests=8,  3 wallclock secs ( 0.04 usr  0.02 sys +  2.15 cusr  0.23 csys =  2.44 CPU)
Result: PASS

Much better. Now I can actually see other stuff, such as the fact that I’m skipping POD tests. Personally, I like to make sure that POD tests run all the time, as I’m likely to forget to set the environment variable. So let’s edit t/02pod.t and t/03podcoverage.t and delete this line from each:

plan skip_all => 'set TEST_POD to enable this test' unless $ENV{TEST_POD};

So what does that get us?

t/01app.t ............. ok   
t/02pod.t ............. ok     
t/03podcoverage.t ..... 1/6 
#   Failed test 'Pod coverage on MyApp::Controller::Books'
#   at /usr/local/lib/perl5/site_perl/5.10.1/Test/Pod/Coverage.pm line 126.
# Coverage for MyApp::Controller::Books is 50.0%, with 1 naked subroutine:
#   list

#   Failed test 'Pod coverage on MyApp::Controller::Root'
#   at /usr/local/lib/perl5/site_perl/5.10.1/Test/Pod/Coverage.pm line 126.
# Coverage for MyApp::Controller::Root is 66.7%, with 1 naked subroutine:
#   default
# Looks like you failed 2 tests of 6.
t/03podcoverage.t ..... Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/6 subtests 
t/controller_Books.t .. ok   
t/view_HTML.t ......... ok   

Test Summary Report
-------------------
t/03podcoverage.t   (Wstat: 512 Tests: 6 Failed: 2)
  Failed tests:  2-3
  Non-zero exit status: 2
Files=5, Tests=25,  3 wallclock secs ( 0.05 usr  0.02 sys +  2.82 cusr  0.29 csys =  3.18 CPU)
Result: FAIL
Failed 1/5 test programs. 2/25 subtests failed.

Well that figures, doesn’t it? We added the list action to MyApp::Controller::Books but never documented it. And for some reason, Catalyst creates the default action in MyApp::Controller::Root with no documentation. Such a shame. So let’s document those methods. Add this to lib/MyApp/Controller/Root.pm:

=head2 default

The default action. Just returns a 404/NOT FOUND error. Might want to update
later with a template to format the error like the rest of our site.

=cut

While there, I notice that the index action has a doc header, but nothing to actually describe what it does. Let’s fix that, too:

The default Catalyst action, which just displays the welcome message. This is
the "Yay it worked!" page. Consider changing to a real home page for our app.

Great. Now open lib/MyApp/Controller/Books.pm and document the list action:

=head2 list

Looks up all of the books in the system and executes a template to display
them in a nice table. The data includes the title, rating, and authors of each
book.

=cut

Oh hey, look at that. There’s an index method that doesn’t do anything. And it has a POD header and no docs, too. So let’s document it:

The default method for the books controller. Currently just says that it
matches the request; we'll likely want to change it to something more
reasonable down the line.

Okay, so how do the tests look now?

t/01app.t ............. ok   
t/02pod.t ............. ok     
t/03podcoverage.t ..... ok   
t/controller_Books.t .. ok   
t/view_HTML.t ......... ok   
All tests successful.
Files=5, Tests=25,  3 wallclock secs ( 0.05 usr  0.02 sys +  2.82 cusr  0.31 csys =  3.20 CPU)
Result: PASS

Excellent! Now, the truth is that we didn’t document our templates, either. Test::Pod::Coverage doesn’t cotton on to that fact because they’re not installed like normal subroutines in the template classes. So it’s up to us to document them ourselves. (Note to self: Consider adding a module to test that all Template::Declare classes have docs for all of their templates.) I’ll wait here while you do that.

All done? Great! I had actually planned to start testing the view next, but I think this is enough for today. Stay tuned for more testing goodness.


pgTAP Best Practices Slides Available

Last month I gave two presentations at the PostgreSQL Conference West. The first was a variation on Unit Test Your Database!, which I’ve now given about six times (and will at least two more times, including tomorrow night for Portland PLUG and in two weeks at the JPUG 10th Anniversary Conference). The second was a new talk, a 90-minute tutorial, called “pgTAP Best Practices.” And here it is, published on Slideshare. Enjoy.

You can also download a 76MB PDF if that’s more your style.

Enjoy!


Use Rubyish Blocks with Test::XPath

Thanks to the slick Devel::Declare-powered PerlX::MethodCallWithBlock created by gugod, the latest version of Test::XPath supports Ruby-style blocks. The Ruby version of assert_select, as I mentioned previously, looks like this:

assert_select("ol") { |elements|
  elements.each { |element|
    assert_select element, "li", 4
  }
}

I’ve switched to the brace syntax for greater parity with Perl. Test::XPath, meanwhile, looks like this:

my @css = qw(foo.css bar.css);
$tx->ok( '/html/head/style', sub {
    my $css = shift @css;
    shift->is( './@src', $css, "Style src should be $css");
}, 'Should have style' );

But as of Test::XPath 0.13, you can now just use PerlX::MethodCallWithBlock to pass blocks in the Rubyish way:

use PerlX::MethodCallWithBlock;
my @css = qw(foo.css bar.css);
$tx->ok( '/html/head/style', 'Should have style' ) {
    my $css = shift @css;
    shift->is( './@src', $css, "Style src should be $css");
};

Pretty slick, eh? It required a single-line change to the source code. I’m really happy with this sugar. Thanks for the great hack, gugod!


Test XML and HTML with XPath

When I was hacking Rails projects back in 2006-2007, there was a lot of stuff about Rails that drove me absolutely batshit (<cough>ActiveRecord</cough>), but there were also a (very) few things that I really liked. One of those things was the assert_select test method. There was a bunch of magic involved in sending a request to your Rails app and stuffing the body someplace hidden (hrm, that sounds kind of evil; intentional?), but then you could call assert_select to use CSS selectors to test the structure and content of the document (assuming, of course, that it was HTML or XML). For example (to borrow from the Rails docs), if you wanted to test that a response contains two ordered lists, each with four list elements, then you’d do something like this:

assert_select "ol" do |elements|
  elements.each do |element|
    assert_select element, "li", 4
  end
end

What it does is select all of the <ol> elements and pass them to the do block, where you can call assert_select on each of them. Nice, huh? You can also implicitly call assert_select on the entire array of passed elements, like so:

assert_select "ol" do
    assert_select "li", 8
end

Slick, right? I’ve always wanted to have something like this in Perl, but until last week, I didn’t really have an immediate need for it. But I’ve started on a Catalyst project with my partners at PGX, and of course I’m using a view to generate XHTML output. So I started asking around for advice on proper unit testing for Catalyst views. The answer I got was, basically, Test::WWW::Mechanize::Catalyst. But I found it insufficient:

$mech->get_ok("/");
$mech->html_lint_ok( "HTML should be valid" );
$mech->title_is( "Root", "On the root page" );
$mech->content_contains( "This is the root page", "Correct content" );

Okay, I can check the title of the document directly, which is kind of cool, but there’s no other way to examine the structure? Really? And to check the content, there’s just content_contains(), which concatenates all of the content without any tags! This is useful for certain very simple tests, but if you want to make sure that your document is properly structured, and the content is in all the right places, you’re SOL.

Furthermore, the html_lint_ok() method didn’t like the Unicode characters output by my view:

#   Failed test 'HTML should be valid (http://localhost/)'
#   at t/view_TD.t line 30.
# HTML::Lint errors for http://localhost/
#  (4:3) Invalid character \x2019 should be written as &rsquo;
#  (18:5) Invalid character \xA9 should be written as &copy;
# 2 errors on the page

Of course, those characters aren’t invalid, they’re perfectly good UTF-8 characters. In some worlds, I suppose, they should be wrong, but I actually want them in my document.

So I switched to Test::XML, which uses a proper XML parser to validate a document:

ok my $res = request("http://localhost:3000/"), "Request home page";
ok $res->is_success, "Request should have succeeded";

is_well_formed_xml $res->content, "The HTML should be well-formed";

Cool. Now that I know my XHTML document is well-formed, it’s time to start examining the content and structure in more detail. Thinking fondly on assert_select, I went looking for a test module that uses XPath to test an XML document, and found Test::XML::XPath right in the Test::XML distribution, which looked to be just what I wanted. So I added it to my test script and added this line to test the content of the <title> tag:

is_xpath $res->content, "/html/head/title", "Welcome!";

I ran the test…and waited. It took around 20 seconds for that test to run, and then it failed!

#   Failed test at t/view_TD.t line 25.
#          got: ''
#     expected: 'Welcome!'
#   evaluating: /html/head/title
#      against: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
# <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
#  <head>
#   <title>Welcome!</title>
#  </head>
# </html>

No doubt the alert among my readership will spot the problem right away, but I was at a loss. Fortunately, Ovid was over for dinner last week, and he pointed out that it was due to the namespace. That is, the xmlns attribute of the <html> element requires that one register a namespace prefix to use in the XPath expression. He pointed me to his fork of Test::XML::XPath, called Test::XHTML::XPath, in his Escape project. It mostly duplicates Test::XML::XPath, but contains this crucial line of code:

$xpc->registerNs( x => "http://www.w3.org/1999/xhtml" );

By registering the prefix “x” for the XHTML namespace, he’s able to write tests like this:

is_xpath $res->content, "/x:html/x:head/x:title", "Welcome!";

And that works. It seems that the XPath spec requires that one use prefixes when referring to elements within a namespace. Test::XML::XPath, alas, provides no way to register a namespace prefix.
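
The namespace behavior is not peculiar to the Perl modules. As a rough illustration in Python (so not the Test::XML::XPath or XML::LibXML code discussed here), the standard library's xml.etree.ElementTree shows the same thing: an unprefixed path finds nothing in a namespaced document, while a path using a registered prefix does. The prefix "x" mirrors Ovid's choice, and the sample document is a trimmed-down copy of the one in the failure output above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Trimmed-down version of the document from the failing test output.
doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">'
    "<head><title>Welcome!</title></head>"
    "</html>"
)
root = ET.fromstring(doc)

# Without a prefix, each step names an element in *no* namespace,
# so nothing in this XHTML document matches.
unprefixed = root.find("head/title")

# Map the prefix "x" to the XHTML namespace and use it in every step.
prefixed = root.find("x:head/x:title", {"x": XHTML_NS})

print(unprefixed)     # None
print(prefixed.text)  # Welcome!
```

The prefix itself is arbitrary; only the namespace URI it maps to has to match the document.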

Perhaps worse is the performance problem. I discovered that if I stripped out the DOCTYPE declaration from the XHTML before I passed it to is_xpath, the test was lightning fast. Here the issue is that XML::LibXML, used by Test::XML::XPath, is fetching the DTD from the w3.org Web site as the test runs. I can disable this by setting the no_network and recover_silently XML::LibXML options, but, again, Test::XML::XPath provides no way to do so.

Add to that the fact that Test::XML::XPath has no interface for recursive testing like assert_select and I was ready to write my own module. One could perhaps update Test::XML::XPath to be more flexible, but for the fact that it falls back on XML::XPath when it can’t find XML::LibXML, and XML::XPath, alas, behaves differently than XML::LibXML (it didn’t choke on my lack of a namespace prefix, for example). So if you ship an application that uses Test::XML::XPath, tests might fail on other systems where it would use a different XPath module than you used.

And so I have written a new test module.

Introducing Test::XPath, your Perl module for flexibly running XPath-powered tests on the content and structure of your XML and HTML documents. With this new module, the test for my Catalyst application becomes:

my $tx = Test::XPath->new( xml => $res->content, is_html => 1 );
$tx->is("/html/head/title", "Welcome!", "Title should be correct" );

Notice how I didn’t need a namespace prefix there? That’s because the is_html parameter coaxes XML::LibXML into using its HTML parser instead of its XML parser. One of the side-effects of doing so is that the namespace appears to be assumed, so I can ignore it in my tests. The HTML parser doesn’t bother to fetch the DTD, either. For tests where you really need namespaces, you’d do this:

my $tx = Test::XPath->new(
    xml     => $res->content,
    xmlns   => { x => "http://www.w3.org/1999/xhtml" },
    options => { no_network => 1, recover_silently => 1 },
);
$tx->is("/x:html/x:head/x:title", "Welcome!", "Title should be correct" );

Yep, you can specify XML namespace prefixes via the xmlns parameter, and pass options to XML::LibXML via the options parameter. Here I’ve shut off the network, so that XML::LibXML prevents network access, and told it to recover silently when it tries to fetch the DTD, but fails (because, you know, it can’t access the network). Not bad, eh?

Of course, the module provides the usual array of Test::More-like test methods, including ok(), is(), like() and cmp_ok(). They all work just like in Test::More, except that the first argument must be an XPath expression. Some examples borrowed from the documentation:

$tx->ok( '//foo/bar', 'Should have bar element under foo element' );
$tx->ok( 'contains(//title, "Welcome")', 'Title should "Welcome"' );

$tx->is( '/html/head/title', 'Welcome', 'Title should be welcoming' );
$tx->isnt( '/html/head/link/@type', 'hello', 'Link type should not' );

$tx->like( '/html/head/title', qr/^Foobar Inc.: .+/, 'Title context' );
$tx->unlike( '/html/head/title', qr/Error/, 'Should be no error in title' );

$tx->cmp_ok( '/html/head/title', 'eq', 'Welcome' );
$tx->cmp_ok( '//story[1]/@id', '==', 1 );

But the real gem is the recursive testing feature of the ok() test method. By passing a code reference as the second argument, you can descend into various parts of your XML or HTML document to test things more deeply. ok() will pass if the XPath expression argument selects one or more nodes, and then it will call the code reference for each of those nodes, passing the Test::XPath object as the first argument. This is a bit different than assert_select, but I view the reduced magic as a good thing.

For example, if you wanted to test for the presence of <story> elements in your document, and to test that each such element had an incremented id attribute, you’d do something like this:

my $i = 0;
$tx->ok( '//assets/story', sub {
    shift->is('./@id', ++$i, "ID should be $i in story $i");
}, 'Should have story elements' );

For convenience, the Test::XPath object is also assigned to $_ for the duration of the call to the code reference. Either way, you can call ok() and pass code references anywhere in the hierarchy. For example, to ensure that an Atom feed has entries and that each entry has a title, a link, and a very specific author element with name, uri, and email subnodes, you can do this:

$tx->ok( '/feed/entry', sub {
    $_->ok( './title', 'Should have a title' );
    $_->ok( './author', sub {
        $_->is( './name',  'Larry Wall',       'Larry should be author' );
        $_->is( './uri',   'http://wall.org/', 'URI should be correct' );
        $_->is( './email', 'perl@example.com', 'Email should be right' );
    }, 'Should have author elements' );
}, 'Should have entry elements' );

There are a lot of core XPath functions you can use, too. For example, I’m going to write a test for every page returned by my application to make sure that I have the proper numbers of various tags:

$tx->is('count(/html)',      1, 'Should have 1 html element' );
$tx->is('count(/html/head)', 1, 'Should have 1 head element' );
$tx->is('count(/html/body)', 1, 'Should have 1 body element' );

I’m going to use this module to the hilt in all my tests for HTML and XML documents from here on in. The only thing I’m missing from assert_select is that it supports CSS 2 selectors, rather than XPath expressions, and the implementation offers quite a few other features including regular expression operators for matching attributes, pseudo-classes, and other fun stuff. Still, XPath gets me all that I need; the rest is just sugar, really. And with the ability to define custom XPath functions in Perl, I can live without the extra sugar.

Maybe you’ll find it useful, too.


pgTAP 0.22: Test Your Results

I’m pleased to announce that, after much thinking, committing, and not an insignificant amount of hair-pulling, pgTAP 0.22 has finally landed. Download it here. Many, many thanks to all who commented on my previous posts, made suggestions, and helped me on IRC to figure out how to get all this stuff to work. The crazy thing is that it does, quite well, all the way back to PostgreSQL 8.0.

So here’s what I’ve come up with: ten simple functions. Sound like a lot? Well, it’s simpler than it might at first appear. Here’s a quick overview:

results_eq()

Compares two queries row-for-row. Pass in strings with SQL queries, strings with prepared statement names, or cursors. If the query you’re testing returns a single column, the expected results can be passed as an array. If a test fails, you get useful diagnostics:

# Failed test 146
#     Results differ beginning at row 3:
#         have: (1,Anna)
#         want: (22,Betty)

If a row is missing, the diagnostics will show it as a NULL:

# Failed test 147
#     Results differ beginning at row 5:
#         have: (1,Anna)
#         want: NULL

results_ne()

Just like results_eq(), except that it tests that the results of the two arguments are *not* equivalent. May not be very useful, but it’s cute.

set_eq()

Tests that two queries return the same results, without regard to the order of the results or duplicates. Pass in strings with SQL queries or strings with prepared statement names. As with results_eq() the expected results can be passed as an array if the test query returns a single column. Failure diagnostics look like this:

# Failed test 146
#     Extra records:
#         (87,Jackson)
#         (1,Jacob)
#     Missing records:
#         (44,Anna)
#         (86,Angelina)

If the failure is due to incompatible column types, the diagnostics will help you out there, too:

# Failed test 147
#     Columns differ between queries:
#         have: (integer,text)
#         want: (text,integer)

set_ne()

The inverse of set_eq(), the test passes if the results of the two queries are different, without regard to order or duplicate rows. No diagnostics on failure, though; if it fails, it’s because the results are the same.

set_has()

Tests that a query contains a subset of results without regard to order or duplicates. Useful if you need to ensure that a query returns at least some set of rows. Failure diagnostics are useful again:

# Failed test 122
#     Missing records:
#         (44,Anna)
#         (86,Angelina)

set_hasnt()

Tests that a query does not contain a subset of results, without regard to order or duplicates.

bag_eq()

Just like set_eq(), except that duplicates matter. So if the first query has duplicate rows, the second must have the same dupes. Diagnostics are equally useful.

bag_ne()

Just like set_ne(), except that duplicates matter.

bag_has()

Just like set_has(), except that duplicates matter.

bag_hasnt()

Just like set_hasnt(), except that duplicates matter.
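
To make the difference between the set and bag "has" variants concrete, here is a small Python sketch. It is not pgTAP code, and the function names simply borrow pgTAP's for readability; it also assumes one reading of "duplicates matter", namely that every expected row must appear in the tested results at least as many times as it is expected.

```python
from collections import Counter

def set_has(have, want):
    """True if every distinct row in want appears somewhere in have."""
    return set(want) <= set(have)

def bag_has(have, want):
    """True if have contains each row of want at least as many times as want does.

    Assumption: this is one interpretation of "duplicates matter".
    """
    # Counter subtraction keeps only positive counts, i.e. the rows that
    # want still needs after have's copies are used up.
    return not (Counter(want) - Counter(have))

have = [(44, "Anna"), (86, "Angelina"), (1, "Jacob")]
want = [(44, "Anna"), (44, "Anna")]

print(set_has(have, want))  # True: (44, "Anna") is present
print(bag_has(have, want))  # False: it is present only once, not twice
```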

Be sure to look at my previous post for usage examples. Since I wrote it, I’ve also added the ability to pass an array as the second argument to these functions. This is specifically for the case when the query you’re testing returns a single column of results; the array just makes it easier to specify expected values in a common case:

SELECT results_eq(
    'SELECT * FROM active_user_ids()',
    ARRAY[ 2, 3, 4, 5]
);

Check the documentation for all the details on how to use these functions.

I’m really happy with these functions. It was definitely worth it to really think things through, look at prior art, and spend the time to try different approaches. In the process, I’ve found an approach that works in nearly all circumstances.

The one exception is in results_eq() and results_ne() on PostgreSQL 8.3 and down. The issue there is that there were no operators to compare two record objects before PostgreSQL 8.4. So for earlier versions, the code has to cast the records representing each row to text. This means that two rows can be different but appear to be the same to 8.3 and down. In practice this should be pretty rare, but I’m glad that record comparisons are more correct in 8.4.

The only other issue is performance. Although you can write your tests in SQL, rather than strings containing SQL, the set and bag functions use the PL/pgSQL EXECUTE statement to execute each SQL statement and insert it into a temporary table. Then they select the data from the temporary tables once or twice to do the comparisons. That’s a lot more processing than simply running the query itself, and it slows down the performance significantly.

Similarly, the results functions use cursors and fetch each row one-at-a-time. The nice thing is that, in the event of a failure for results_eq() or a pass for results_ne(), the functions can stop fetching results before reaching the end of the queries. But either way, a fair bit of processing goes on.
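
The row-at-a-time idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the PL/pgSQL in pgTAP: iterators stand in for cursors, the helper name first_difference is made up for the example, and a missing row is reported as None, much as the diagnostics above show it as NULL.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel meaning "this side ran out of rows"

def first_difference(have, want):
    """Return (row_number, have_row, want_row) for the first mismatch, else None."""
    for row_number, (have_row, want_row) in enumerate(
        zip_longest(have, want, fillvalue=_MISSING), start=1
    ):
        if have_row != want_row:
            # Stop here: later rows are never fetched.
            return (
                row_number,
                None if have_row is _MISSING else have_row,
                None if want_row is _MISSING else want_row,
            )
    return None

have = iter([(7, "Zed"), (9, "Yan"), (1, "Anna"), (5, "Never fetched")])
want = iter([(7, "Zed"), (9, "Yan"), (22, "Betty"), (5, "Never fetched")])

diff = first_difference(have, want)
remaining = list(have)

print(diff)       # (3, (1, 'Anna'), (22, 'Betty'))
print(remaining)  # [(5, 'Never fetched')]
```

The leftover row in remaining shows the early exit: once the third rows differ, nothing further is pulled from either side.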

I’m not sure which is slower, the set and bag functions or the results functions, but, short of adding new syntax to SQL (not an option), I could see no other way to adequately do the comparisons and emit useful diagnostics.

But those are minor caveats, I think. I’m pretty pleased with the function names and the interfaces I’ve created for them. Please download the latest and let me know what you think.

So what’s next? Well, there are a few more schema-testing functions I’d like to add, but after that, I’d like to declare pgTAP stable and start using it in new projects. I’m thinking about writing a test suite for database normalization, starting with testing that all tables have primary keys.

But that’s after my vacation. Back in two weeks.


Committed: pgTAP Result Set Assertion Functions

Regular readers will know that I’ve been thinking a lot about testing SQL result sets, how to name result testing functions, and various implementation issues. I am very happy to say that I’ve now committed the first three such test functions to the Git repository. They’ve been tested on 8.4 and 8.3. Here’s what I came up with.

I had a pretty good idea how to compare sets and how to compare ordered bags, but ordered sets and unordered bags of results escaped me. During two days of intense hacking and experimentation, I quickly wrote set_eq(), which performs a set comparison of the results of two queries, and obag_eq(), which performs a row-by-row comparison of the results of two queries. I then set to work on bag_eq(), which would do a set comparison but require the same number of duplicate rows between the two queries. set_eq() was easy because I just needed to create temporary tables of the two queries and then execute two EXCEPT queries against them to see where they differ, if at all. bag_eq() was getting kind of hairy, though, so I asked about it on the Freenode #postgresql channel, where depesz looked at my example and pointed out that EXCEPT ALL would do just what I needed.

Hot damn, all it took was the addition of a single extra word to the same queries used by set_eq() and I was set. This made me very happy, and such well-thought-out features are the reason I love PostgreSQL. My main man depesz made my day.
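
For readers who don't have the two operators in their heads, here is a rough Python analogue of the difference. It is an illustration of the semantics only, not what pgTAP runs: sets stand in for EXCEPT, which works on distinct rows, and Counter subtraction stands in for EXCEPT ALL, which keeps whatever copies of a row are left over. The helper names and sample rows are invented for the example.

```python
from collections import Counter

def except_distinct(left, right):
    """Rough analogue of SQL EXCEPT: distinct rows of left that are absent from right."""
    return set(left) - set(right)

def except_all(left, right):
    """Rough analogue of SQL EXCEPT ALL: subtract row counts, keep the surplus copies."""
    return sorted((Counter(left) - Counter(right)).elements())

have = [(1, "Jacob"), (1, "Jacob"), (44, "Anna")]
want = [(1, "Jacob"), (44, "Anna")]

# Set comparison: no differences in either direction.
print(except_distinct(have, want))  # set()
print(except_distinct(want, have))  # set()

# Bag comparison: the extra copy of (1, 'Jacob') shows up.
print(except_all(have, want))  # [(1, 'Jacob')]
print(except_all(want, have))  # []
```

Running the comparison in both directions is what yields the "extra" and "missing" lists in the diagnostics.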

But oset_eq(), which was to compare ordered sets of results, was proving much harder. The relational operators that operate on sets don’t care about order, so I would have to write the code to care myself. And because dupes needed to be ignored, it got even harder. In fact, it proved just not worth the effort. The main reason I abandoned this test function, though, was not difficulties of implementation (which were significant), but ambiguity of interpretation. After all, if duplicates are allowed but ignored, how does one deal with their effect on order? For example, say that I have two queries that order people based on name. One query might order them like so:

select * from people order by name;
  name  | age 
--------+-----
 Damian |  19
 Larry  |  53
 Tom    |  35
 Tom    |  44
 Tom    |  35

Another run of the same query could give me a different order:

select * from people order by name;
  name  | age 
--------+-----
 Damian |  19
 Larry  |  53
 Tom    |  35
 Tom    |  35
 Tom    |  44

Because I ordered only on “name,” the database was free to sort records with the same name in an undefined way. Meaning that the rows could be in different orders. This is known, if I understand correctly, as a “Partially ordered set,” or “poset.” Which is all well and good, but from my point of view makes it damn near impossible to be able to do a row-by-row comparison and ignore dupes, because they could be in different orders!
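
The ambiguity is easy to reproduce outside the database. In this Python sketch (an illustration only, using the rows from the two query results above), both lists are legitimately ordered by name, yet a row-by-row comparison says they differ. Sorting on every column removes the ambiguity.

```python
run_one = [("Damian", 19), ("Larry", 53), ("Tom", 35), ("Tom", 44), ("Tom", 35)]
run_two = [("Damian", 19), ("Larry", 53), ("Tom", 35), ("Tom", 35), ("Tom", 44)]

def ordered_by_name(rows):
    """True if the rows are in non-decreasing order of the name column alone."""
    names = [name for name, _age in rows]
    return names == sorted(names)

# Both results satisfy ORDER BY name...
both_valid = ordered_by_name(run_one) and ordered_by_name(run_two)

# ...but comparing them row by row fails.
same_row_for_row = run_one == run_two

# Ordering on (name, age) leaves no ties, so the comparison is stable.
same_when_fully_sorted = sorted(run_one) == sorted(run_two)

print(both_valid)              # True
print(same_row_for_row)        # False
print(same_when_fully_sorted)  # True
```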

So once I gave up on that, I was down to three functions instead of four, and only one depends on ordering. So I also dropped the idea of having the “o” in the function names. Instead, I changed obag_eq() to results_eq(), and now I think I have three much more descriptive names. To summarize, the functions are:

results_eq
Compares two result sets row by row, meaning that they must be in the same order and have the same number of duplicate rows in the same places.
set_eq
Compares result sets to ensure they have the same rows, without regard to order or duplicate rows.
bag_eq
Compares result sets without regard to order, but each must have the same duplicate rows.
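
The three kinds of comparison can be summarized with ordinary Python collections. This is a sketch of the semantics described above, not of pgTAP's implementation; the function names and sample rows are made up for the example.

```python
from collections import Counter

def results_equal(have, want):
    """Row by row: same rows, same order, same duplicates in the same places."""
    return list(have) == list(want)

def set_equal(have, want):
    """Same distinct rows; order and duplicates are ignored."""
    return set(have) == set(want)

def bag_equal(have, want):
    """Same rows with the same number of copies; order is ignored."""
    return Counter(have) == Counter(want)

a = [("Tom", 35), ("Tom", 35), ("Larry", 53)]
b = [("Larry", 53), ("Tom", 35)]                # one Tom fewer
c = [("Larry", 53), ("Tom", 35), ("Tom", 35)]   # same rows as a, different order

print(set_equal(a, b))      # True
print(bag_equal(a, b))      # False
print(bag_equal(a, c))      # True
print(results_equal(a, c))  # False
```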

I’m very happy with this, because I was able to give up on the stupid function names with the word “order” included or implicit in them. Plus, I have different names for functions that are similar, which is nicely in adherence to the principle of distinction. They all provide nice diagnostics on failure, as well, like this from results_eq():

# Failed test 146
#     Results differ beginning at row 3:
#         have: (1,Anna)
#         want: (22,Betty)

Or this from set_eq() or bag_eq():

# Failed test 146
#     Extra records:
#         (87,Jackson)
#         (1,Jacob)
#     Missing records:
#         (44,Anna)
#         (86,Angelina)

set_eq() and bag_eq() also offer up useful diagnostics when the data types of the rows vary:

# Failed test 147
#     Columns differ between queries:
#         have: (integer,text)
#         want: (text,integer)

results_eq() doesn’t have access to such data, though if I can find some tuits (got any to give me?), I’ll write a quick C function that can return an array of the data types in a record object.

Now, as for the issue of arguments, what I settled on is, like Epic, passing strings of SQL to these functions. However, unlike Epic, if you pass in a simple string with no spaces, or a double-quoted string, pgTAP assumes that it’s the name of a prepared statement. The documentation now recommends prepared statements, which you can use like this:

PREPARE my_test AS SELECT * FROM active_users() WHERE name LIKE 'A%';
PREPARE expect AS SELECT * FROM users WHERE active AND name LIKE 'A%';
SELECT results_eq('my_test', 'expect');

This allows you to keep your SQL written as SQL, keeping your test, um, SQLish. But in those cases where you have some really simple SQL, you can just use that, too:

SELECT set_eq(
    'SELECT * FROM active_users()',
    'SELECT * FROM users ORDER BY id'
);

This feels like a good compromise to me, allowing the best of both worlds: keeping things in pure SQL to avoid quoting ugliness in SQL strings, while letting users pass in SQL strings if they really want to.

It turns out that I wasn’t able to support cursors for set_eq() or bag_eq(), because they use the statements passed to them to create temporary tables and then compare the records in those temporary tables. But results_eq() uses cursors internally. And it turns out that there’s a data type for cursors, refcursor. So it was easy to add cursor support to results_eq() for those who want to use it:

DECLARE cwant CURSOR FOR SELECT * FROM active_users();
DECLARE chave CURSOR FOR SELECT * FROM users WHERE active ORDER BY name;
SELECT results_eq('cwant'::refcursor, 'chave'::refcursor );

Neat, huh? As I said, I’m very pleased with this approach overall. There are a few caveats, such as less strict comparisons in results_eq() on 8.3 and lower, and less useful diagnostics for data type differences in results_eq(), but overall, I think that the implementation is pretty good, and that these functions will be really useful.

So what do you think? Please clone the Git repository and take the functions for a test drive on 8.3 or 8.4. Let me know what you think!

In the meantime, before releasing a new version, I still plan to add:

  • set_includes() - Set includes records in another set.
  • set_excludes() - Set excludes records in another set.
  • bag_includes() - Bag includes records in another bag.
  • bag_excludes() - Bag excludes records in another bag.
  • col_eq() - Single column result set equivalent to an array of values.
  • row_eq() - Single row from a query equivalent to a record.
  • rowtype_is() - The data type of the rows in a query is equivalent to an array of types.

Hopefully I can find some time to work on those next week. The only challenging one is row_eq(), so I may skip that one for now.



pgTAP Set-Testing Update

I’ve been thinking more about testing SQL result sets and how to name functions that do such testing, and I’ve started to come to some conclusions. Some of the constraints I’m looking at:

  • Cursors are required for tests where the order of the results returned is important. It might be best for such functions to create the cursors themselves.

  • For comparisons where order isn’t important, the results of each SQL statement must be inserted into a temporary table and then the table used for the comparisons. Otherwise, each statement would be executed twice, as is required to calculate symmetric difference. By executing each once and storing the results in a temporary table, we get around this issue (and indeed, this is one of the cases that Epic’s global() function addresses).

  • When order is not important, the most efficient way to compare result sets is with symmetric difference. However, said comparison is a set comparison, meaning that duplicate rows are ignored. So if set A has three rows and set B has four, but two of those four are identical, then sets A and B can still be equivalent.

I’m starting to think that I would have two basic result set testing functions, set_eq() and bag_eq(). The former would do a set comparison, while the latter would require that duplicate rows be present in both result sets. Unfortunately, that would mean that it would be difficult for set_eq to have a variation that tests ordered sets, as symmetric difference ignores relational ordering. And a simple bag_eq() function would require that the relations be ordered, as it would iterate over each row in each relation in turn and compare row to row. But as I pointed out to commenter “@dave0,” bags are not inherently ordered, so it would be imposing a requirement that’s not necessarily appropriate.

This is starting to drive me a bit nuts.

I think that there are ways to enforce an ordered comparison on a set and an unordered comparison on a bag, but both would be pretty inefficient. Maybe I should do it anyway, include the appropriate caveats in the documentation, and then improve when feasible in the future. In that case, what I’d be looking at is something like this:

set_eq( sql, sql )
Test for set equivalence of two SQL statements.
oset_eq( sql, sql )
Test for ordered set equivalence of two SQL statements.
bag_eq( sql, sql )
Test for row equivalence of two SQL statements.
obag_eq( sql, sql )
Test for ordered row equivalence of two SQL statements.

The preferred tests would be set_eq() and obag_eq(). If a single word is passed to any of these functions, it’s assumed to be a prepared statement. Cursors would be created internally for the functions that require ordered comparison. The non-ordered versions would create temporary tables to hold the values and then use those tables for the comparisons. bag_eq() would also construct cursors on the temporary tables to ensure that rows could be compared in the same order in which they were generated by the SQL statement.

Interface-wise, perhaps a boolean would be preferred to indicate whether or not to compare the rows in an ordered fashion? That would be:

set_eq( sql, sql, bool )
Test for set equivalence of two SQL statements. The sets must be in the same order if the boolean argument is true.
bag_eq( sql, sql, bool )
Test for row equivalence of two SQL statements. The bags must have their rows in the same order if the boolean argument is true.

I like that there are fewer functions this way, but is it harder to remember what the boolean is for? (It would not be required, and would default to false for both functions). Thoughts?

By the way, I would likely throw in a couple of other resultset-comparing functions:

set_includes( sql, sql )
Test that the set returned by the first statement includes the rows returned by the second statement.
set_excludes( sql, sql )
Test that the set returned by the first statement excludes the rows returned by the second statement.
bag_includes( sql, sql )
Test that the bag returned by the first statement includes the rows returned by the second statement, including duplicates.
bag_excludes( sql, sql )
Test that the bag returned by the first statement excludes the rows returned by the second statement, including duplicates.

Seem useful? Please leave a comment with your thoughts.


Need Help Naming Result Set Testing Functions

I’ve been thinking more since I posted about testing SQL result sets, and I think I’ve settled on two sets of functions for pgTAP: one that tests two SQL queries (though you will be encouraged to use a prepared statement), and one to test two cursors. I’m thinking of naming them:

  • query_gets()
  • cursor_gets()

I had been planning on *_returns() or *_yields(), but they didn’t feel right. “Returns” implies that I would be passing a query and a data structure (to me at least), and while I want to support that, too, it’s not what I was looking for right now. “Yield,” on the other hand, is more related to set-returning functions in my mind (even if PL/pgSQL doesn’t use that term). Anyway, I like the use of “gets” because it’s short and pretty unambiguous.

These functions will compare query results as unordered sets, but I want variants that test ordered sets, as well. I’ve been struggling to come up with a decent name for these variants, but not liking any very well. The obvious ones are:

  • ordered_query_gets()
  • ordered_cursor_gets()

And:

  • sorted_query_gets()
  • sorted_cursor_gets()

But these are kind of long for functions that will be, I believe, used frequently. I could just add a character to get the same idea, in the spirit of sprintf:

  • oquery_gets()
  • ocursor_gets()

Or:

  • squery_gets()
  • scursor_gets()

I think that these are okay, but might be somewhat confusing. I think that the “s” variant probably won’t fly, since for sprintf and friends, the “s” stands for “string.” So I’m leaning towards the “o” variants.

But I’m throwing it out there for the masses to make suggestions: Got any ideas for better function names? Are there some relational terms for ordered sets, for example, that might make more sense? What do you think?

As a side note, I’m also considering:

  • col_is() to compare the result of a single column query to an array or other query. This would need an ordered variant, as well.
  • row_is(), although I have no idea how I’d be able to support passing a row expression to a function, since PostgreSQL doesn’t allow RECORDs to be passed to functions.


pgTAP 0.21 Drops

Mmmmm…pgTAP

I just dropped a new version of pgTAP following a few marathon hack sessions since my talk at PGCon (movie here, BTW). Actually, the new performs_ok() function came about as I wrote the presentation, but all the rest came on the flight home and in the few days since. Not sure when I’ll hack on it this much again (it’s getting a bit big at 5,758 lines of PL/pgSQL and SQL).

Overall I’m very happy with this release, as it adds a lot of new assertion functions. In particular, I added a slew of functions that test that the objects in a given schema (or visible in the search path, if you prefer) are exactly the objects that should be there. This is useful for a couple of things. For one, Norman Yamada, in his PGCon talk, mentioned that his team was using pgTAP to compare database objects between replicated databases. I like this because it’s a great example of using pgTAP for system testing, rather than just unit testing as I’ve been advocating. See, pgTAP can be used for any kind of testing!

Another use for these functions is in a large organization where many different people might be making changes to a schema. In this scenario, you might have application developers adding new objects to the database (or dropping objects) without necessarily informing the DBAs. Using, for example, tables_are() and functions_are() and continuous testing, the DBAs can see when objects have been modified by the developers. Even better, if the developers are running the pgTAP tests themselves (as they should be!), they will be reminded to add new tests for their changes when the existing tests notice that things have been added or dropped and thus fail.
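
As a rough sketch of what such a check might look like, assuming pgTAP is installed and that a schema named myschema should contain exactly the tables and functions listed (all of the object names here are made up for illustration):

```sql
-- Hypothetical schema and object names; substitute your own.
BEGIN;
SELECT plan(2);

-- Fails if myschema has a table that is not listed here,
-- or lacks one that is listed.
SELECT tables_are(
    'myschema',
    ARRAY[ 'users', 'widgets' ],
    'Schema myschema should have exactly the expected tables'
);

-- The same idea for functions.
SELECT functions_are(
    'myschema',
    ARRAY[ 'add_user', 'add_widget' ],
    'Schema myschema should have exactly the expected functions'
);

SELECT * FROM finish();
ROLLBACK;
```

If a developer adds a table to myschema without updating the list, the first assertion fails and its diagnostics name the extra table, which is the reminder described above.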

Beyond that, I added a bunch of new functions for testing functions and a number of other goodies. Check out the release notes for all the details.

With these changes, I’ve finished nearly everything I’ve thought of for pgTAP. There are only a few sequence-testing functions left on the To Do list, as well as a call to add a throws_like() function, which I’ll throw in soon if no one else steps up. Beyond these changes, I have a few ideas of where to take it next, but so far I’m kind of stumped. Mainly what I think should be done is to add an interface that makes it easier to compare relations (or result sets, if you prefer). Epic does this by allowing query strings to be passed to a function, but I’d really like to keep queries in SQL rather than in SQL strings. I’ll be giving it some more thought and will post about it soon.

Unit Test Your Database!

Gave my talk at PGCon today. I felt that it went well, and was well-received. So here it is for everyone else, for posterity, thanks to SlideShare. Enjoy!

More on Database Testing

I’ve been meaning for a while to come back to the topic of database testing. After posting that entry, I thought more about the quote from a PostgreSQL core hacker, which I think bears repeating:

Well, you are testing for bugs, and bugs are pretty specific in where they appear. Writing the tests is 90% of the job; writing the infrastructure is minor. If the infrastructure has limitations, which all do, you might as well write that extra 10% too.

I had been so focused on the first sentence, on the why of database testing, that I’d not really addressed the rest. I failed to notice that he was questioning the utility of a testing infrastructure, or what I would call a framework, like pgTAP. So let me rectify that right now by addressing his actual point.

The idea of using an established framework and protocol is to be able to focus exclusively on the task of writing tests, rather than worrying about how to analyze test results. I agree that writing tests can be time-consuming, but that doesn’t mean that one should write one’s own testing framework. The great thing about pgTAP is that it emits TAP, which can then be analyzed along with any other TAP-emitting test framework in any environment, including Perl, C, JavaScript, PHP, and even Oracle, among others.

The other argument that might support writing one’s own testing infrastructure is if it’s too hard to apply one style of testing to a given application. For example, most of the existing TAP producers provide a functional interface to writing tests. SQL, on the other hand, is not a functional language. So–leaving aside for the moment that one can provide an effective functional interface for writing database tests–even if one wanted to write a relational-style testing framework, it could still emit TAP! TAP is, after all, just a stream of text. So as long as a SQL SELECT statement returns a stream of TAP, then you can take advantage of the myriad of test analysis tools out there.
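
To make that concrete, here is a minimal sketch of a plain SELECT that returns TAP with no framework at all. The users table and its email column are assumptions made up for this example, and the ord column exists only to keep the plan line first and the results in order:

```sql
-- Each row of the result is one line of TAP.
-- The users table and its columns are hypothetical.
SELECT tap
  FROM (
    -- The plan: two tests will be run.
    SELECT 0 AS ord, '1..2' AS tap
    UNION ALL
    -- Test 1 passes if there is at least one user.
    SELECT 1, CASE WHEN count(*) > 0 THEN 'ok' ELSE 'not ok' END
              || ' 1 - users table should not be empty'
      FROM users
    UNION ALL
    -- Test 2 passes if no user has a NULL email address.
    SELECT 2, CASE WHEN count(*) = 0 THEN 'ok' ELSE 'not ok' END
              || ' 2 - no user should lack an email address'
      FROM users
     WHERE email IS NULL
  ) AS results
 ORDER BY ord;
```

Run through psql with unaligned, tuples-only output (the -A and -t options), the result is a stream of text that a TAP harness should be able to consume. It has no diagnostics, of course, which is one of the things a framework buys you.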

Now, I was discussing the use of TAP with a different PostgreSQL contributor, who was asking me about modifying the output of pg_regress to be TAP. The way that pg_regress works–and therefore how PostgreSQL core tests work–is simple: One writes SQL statements into a test script, and then one writes an expected output file. If the output of the tests might vary by platform, database setting, or compile-time feature, one just creates more expected files, each with the appropriate variations.

The PostgreSQL test runner, pg_regress, then simply runs the script through psql and diffs the output against each expected file. If one of the files is identical to the output, the test passes. Otherwise it fails. When the tests run, the output looks like this:

parallel group (2 tests):  copyselect copy
   copy                 ... ok
   copyselect           ... ok

My core hacker correspondent was thinking of modifying this output to be TAP, something like this:

# Parallel group (2 tests):  copyselect copy
1..2
ok 1 - copy
ok 2 - copyselect

With this change, he could then run the regression tests through TAP::Harness in a cron job and send failure reports when a test failed. This is good as far as it goes, but it has a couple of significant limitations. For one, there are no diagnostics if something goes wrong. This is because, and this is the second shortcoming, it just turns the result of testing a single script into TAP, not individual assertions. There might be thousands of SQL statements in one script, but if the test fails, one won’t know what failed until one looks at regression.diffs.

One of the great features of TAP is the support for diagnostics. For example, if an assertion fails, you might see output something like this:

not ok 38 - The frobnitz should be named "foo"
# Failed test 38: "The frobnitz should be named "foo""
#         have: NULL
#         want: foo

Just changing the listing of the test scripts run does not get you this advantage. That’s not to say that it doesn’t make certain things easier, or that one couldn’t simply have shorter test scripts in order to limit the scope of what’s being tested and what’s a result. But a single test script does not make for a good assertion. In short, pg_regress tests don’t do assertions at all. They simply compare actual and expected output from very verbose scripts. This is a hell of a lot better than nothing, but is still quite limited.
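
By way of contrast, here is a sketch of a single pgTAP assertion of the kind that could produce diagnostics like those shown above. The frobnitz table and its id and name columns are invented for the example:

```sql
-- Hypothetical table: frobnitz(id integer, name text).
BEGIN;
SELECT plan(1);

-- One assertion, one line of TAP. If the values differ,
-- is() also emits the have/want diagnostics.
SELECT is(
    (SELECT name FROM frobnitz WHERE id = 1),
    'foo'::text,
    'The frobnitz should be named "foo"'
);

SELECT * FROM finish();
ROLLBACK;
```

Each call reports on exactly one expectation, so a failure points at the specific thing that went wrong rather than at a whole script.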

I suggested to my correspondent that he consider modifying the tests he was working on to use pgTAP, instead. Of course, if you have a lot of existing tests, it might be more trouble than it’s worth to rewrite them all. But that doesn’t mean that you can’t write new tests going forward using something that’s more granular, and gives you a lot more control over the output.

His answer rather shocked me:

I’m lucky enough to only be dealing with really good developers, so I can produce software that works even without focusing specifically on low-level unit tests.

To me, it’s a misconception to think “really good developers” don’t need tests. As I said in reply, I consider myself a “really good developer,” and I’d have a whole lot more pain with the code I maintain if it weren’t for the tests I’ve written. Tests do not compensate for poor coders. Rather, they make it easier to maintain, modify, and refactor code, to fix bugs, and to add new features. Not to mention testing my code on new versions of software, such as testing my CPAN modules with Perl 5.10 and testing my client’s PostgreSQL databases with 8.3 or 8.4 as they look to upgrade from 8.2.

One place where my correspondent agreed that pg_regress could use improvement is in the whole approach to matching different outputs. Using pgTAP, one can write tests in such a way that there are different expected results depending on database settings right in the test scripts themselves! This is because the TAP output can vary in any number of ways, and the harness doesn’t care, as long as the tests pass. pg_regress is extremely strict about what it considers a passing test, and this lacks the necessary flexibility for testing some advanced features.
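
A small sketch of the idea, using the standard_conforming_strings setting as the example. The expected value is chosen inside the test itself, so the same script passes under either setting (with the setting off, PostgreSQL may also emit an escape warning for the literal):

```sql
BEGIN;
SELECT plan(1);

-- With standard_conforming_strings on, 'a\\b' is four characters,
-- because backslashes are literal. With it off, \\ is an escape
-- for a single backslash, so the string is three characters.
SELECT is(
    length('a\\b'),
    CASE WHEN current_setting('standard_conforming_strings') = 'on'
         THEN 4
         ELSE 3
    END,
    'String length should match the escaping rules in effect'
);

SELECT * FROM finish();
ROLLBACK;
```

With pg_regress, covering both settings would mean maintaining two expected output files for the whole script.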

This is why there are no tests for multibyte character semantics (such as locale-aware sorting) in the PostgreSQL core tests.

Are you a database testing doubter? Have I convinced you yet that a rock-solid test suite can actually make your job easier and more enjoyable? Perhaps I never will, but I am still very interested in your reasons for doubting the utility of database testing. What other points should I be thinking about as I prepare for my PGCon presentation?

pgTAP 0.20 Infiltrates Community

I did all I could to stop it, but it just wasn’t possible. pgTAP 0.20 has somehow made its way from my Subversion server and infiltrated the PostgreSQL community. Can nothing be done to stop this menace? Its use leads to cleaner, more stable, and more-safely refactored code. This insanity must be stopped! Please review the following list of its added vileness since 0.19 to determine how you can stop the terrible, terrible influence on your PostgreSQL unit-testing practices that is pgTAP:

  • Changed the names of the functions tested in sql/do_tap.sql and sql/runtests.sql so that they are less likely to be ordered differently given varying collation orders provided in different locales and by different vendors. Reported by Ingmar Brouns.
  • Added the --formatter and --archive options to pg_prove.
  • Fixed the typos in pg_prove where the output of --help listed --test-match and --test-schema instead of --match and --schema.
  • Added has_cast(), hasnt_cast(), and cast_context_is().
  • Fixed a borked function signature in has_trigger().
  • Added has_operator(), has_leftop(), and has_rightop().
  • Fixed a bug where the order of columns found for multicolumn indexes by has_index() could be wrong. Reported by Jeff Wartes. Thanks to Andrew Gierth for help fixing the query.
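
For a taste of the new cast and operator functions, here is a minimal sketch that checks objects that ship with PostgreSQL itself. The argument orders shown are my understanding of the pgTAP documentation, so check them against the version you have installed:

```sql
BEGIN;
SELECT plan(3);

-- There should be a cast from integer to bigint.
SELECT has_cast( 'integer', 'bigint' );

-- That cast should be usable implicitly.
SELECT cast_context_is( 'integer', 'bigint', 'implicit' );

-- Arguments: left type, schema, operator, right type, return type,
-- description.
SELECT has_operator(
    'integer', 'pg_catalog', '<=', 'integer', 'boolean',
    'Should have the integer <= integer operator'
);

SELECT * FROM finish();
ROLLBACK;
```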

Don’t make the same mistake I did, where I wrote a lot of pgTAP tests for a client, and now testing database upgrades from 8.2 to 8.3 is just too reliable! YOU HAVE BEEN WARNED.

Good luck with your mission.