Benchmarking PostgreSQL Functions
Update 2006-05-19: I realized that there was a nasty error in my
algorithm for determining the runtime of a function: It was only fetching the
milliseconds part of the runtime, without adding in seconds and minutes! This
led to getting negative runtimes then the milliseconds part of the end time was
less than the millseconds part of the start time. Ugh. But with the help of
yain on IRC, I’ve switched to calculating the number of seconds by
converting the start and end times to epoch seconds (which have subsecond
precision in PostgreSQL, and now things are just dandy. While I was at it, I
reorganized the function so that it was a bit easier to read, by constructing
the created function in the order it would be executed, and fixed the caching
problem, as suggested by Aidan in a comment below.
Following yesterday’s post, Klint Gore sent me some PL/pgSQL code that might be useable as a benchmark function. Today I took that code and ran with it.
The idea was to create a function like the
Perl Benchmark
module’s timethese() function. In the process, I found, with help
from Josh Berkus, that
PL/pgSQL’s EXECUTE statement has quite a lot of overhead, and the
amount of overhead per call is pretty random. The overhead resulted in pretty
inaccurate benchmark numbers, unfortunately.
At Josh’s suggestion, I rewrote the function to just test each function inline, rather than passing the function code as parameters. This time, the results were dead on. So then I refactored the original benchmark function to create its own benchmark function, inlining all of the code, and then call that function. Almost higher order PL/pgSQL! Again the results were just right, and so now I present it to you:
create type _benchmark as (
code text,
runtime real,
corrected real
);
CREATE OR REPLACE FUNCTION benchmark(n INTEGER, funcs TEXT[])
RETURNS SETOF _benchmark AS $$
DECLARE
code TEXT := '';
a _benchmark;
BEGIN
-- Start building the custom benchmarking function.
code := $_$
CREATE OR REPLACE FUNCTION _bench(n INTEGER)
RETURNS SETOF _benchmark AS $__$
DECLARE
s TIMESTAMP;
e TIMESTAMP;
a RECORD;
d numeric;
res numeric;
ret _benchmark;
BEGIN
-- Create control.
s := timeofday();
FOR a IN SELECT TRUE FROM generate_series( 1, $_$ || n || $_$ )
LOOP
END LOOP;
e := timeofday();
d := extract(epoch from e) - extract(epoch from s);
ret := ROW( '[Control]', d, 0 );
RETURN NEXT ret;
$_$;
-- Append the code to bench each function call.
FOR i IN array_lower(funcs,1) .. array_upper(funcs, 1) LOOP
code := code || '
s := timeofday();
FOR a IN SELECT ' || funcs[i] || ' FROM generate_series( 1, '
|| n || $__$ ) LOOP
END LOOP;
e := timeofday();
res := extract(epoch from e) - extract(epoch from s);
ret := ROW(
$__$ || quote_literal(funcs[i]) || $__$,
res,
res - d
);
RETURN NEXT ret;
$__$;
END LOOP;
-- Create the function.
execute code || $_$
END;
$__$ language plpgsql;
$_$;
-- Now execute the function.
FOR a IN EXECUTE 'SELECT * FROM _bench(' || n || ')' LOOP
RETURN NEXT a;
END LOOP;
-- Drop the function.
DROP FUNCTION _bench(integer);
RETURN;
END;
$$ language 'plpgsql';
You call the function like this:
try=# select * from benchmark(10000, ARRAY[
try(# 'ean_substr(''036000291452'')',
try(# 'ean_byte(''036000291452'')',
try(# 'ean_c(''036000291452'')'
try(# ]);
code | runtime | corrected
----------------------------+-----------+-----------
[Control] | 0.0237451 | 0
ean_substr('036000291452') | 0.497734 | 0.473989
ean_byte( '036000291452') | 0.394456 | 0.370711
ean_c( '036000291452') | 0.0277281 | 0.00398302
(4 rows)
Pretty slick, eh? The only downside was that, when the DROP
FUNCTION line was not commented out, the function would run
once, and then, the next time, I’d get this error:
ERROR: cache lookup failed for function 17323 CONTEXT: PL/pgSQL function "benchmark" line 49 at for over select rows
I have no idea why. So I just leave the function and let the
CREATE OR REPLACE take care of it.













Comments & Trackbacks
Aidan wrote:
Theory wrote:
Discussion is now closed.