<?xml version="1.0" encoding="utf-8"?>

<rdf:RDF 
  xmlns="http://purl.org/rss/1.0/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:cc="http://web.resource.org/cc/"
  xmlns:admin="http://webns.net/mvcb/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/"
> 

  <channel rdf:about="http://justatheory.com">
    <title>Just a Theory</title>
    <link>http://justatheory.com</link>
    <description>Theory waxes practical. By David Wheeler.</description>
    <language>en-us</language>
    <dc:creator>David Wheeler (david@justatheory.com)</dc:creator>
    <dc:rights>Copyright David Wheeler</dc:rights>
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <admin:generatorAgent rdf:resource="http://www.raelity.org/apps/blosxom/?v=2.0" />
    <admin:errorReportsTo rdf:resource="mailto:david@justatheory.com"/>

    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://justatheory.com/bricolage/towards_2.0.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/vcs/svn/modify_author.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/programming/perl/modules/svnnotify-2.70.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/os/macosx/cd_rom_filesystems.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/internet/mail/need_imap_solution.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/databases/postgresql/recurring_events.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/internet/weblogs/teasers_atom_feed.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/programming/ruby/postgres_gem_on_leopard.html" />
        <rdf:li rdf:resource="http://justatheory.com/autobiographical/im_back.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/databases/postgresql/timezone_validation.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/databases/postgresql/reducing_view_calculations.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/os/macosx/ical_invite_file_location.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/programming/ruby/time_zone_bug.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/internet/weblogs/new_comment_policy.html" />
        <rdf:li rdf:resource="http://justatheory.com/computers/programming/ruby/array_to_hash_one_liner.html" />

      </rdf:Seq>
    </items>


    <image rdf:resource="http://meerkat.oreillynet.com/icons/meerkat-powered.jpg" />

  </channel>

  <image rdf:about="http://justatheory.com/logo.gif">
    <title>Just a Theory</title>
    <url>http://justatheory.com/logo.gif</url>
    <link>http://justatheory.com</link>
  </image>

  <item rdf:about="http://justatheory.com/bricolage/towards_2.0.html">
    <title>Moving Towards Bricolage 2.0</title>
    <link>http://justatheory.com/bricolage/towards_2.0.html</link>
    <description></description>
    <dc:subject>/bricolage</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-04-30T18:03-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>Today I&#x2019;ve finished just about over two and a half weeks of hacking on
Bricolage. It has been a couple of years since I gave it much attention, but
there was so much good stuff that other people have contributed that, since I
had a little time, it seemed worth it to give it some love. So here&#x2019;s a quick
list of all that I&#x2019;ve done in the last two weeks:</p>

<ul>
  <li><p>Fixed all reported issues with Bricolage 1.10. <a href="http://use.perl.org/~slanning/" title="Scott Lanning on use Perl">Scott Lanning</a> kindly <a href="http://perlmonks.org/?node_id=683442" title="Bricolage 1.10.5 released, 1.11 imminent">released 1.10.5</a> yesterday with all of those fixes.</p></li>
  <li><p>I integrated the element occurrence branch that <a href="http://www.haz.ca/" title="">Christian Muise</a> had worked on as his <a href="http://code.google.com/soc/2006/bric/appinfo.html?csaid=BF87FD1CE9FF0758" title="Occurrence Specification">2006 Google Summer of Code project</a>. Christian&#x2019;s project added support for maximum and mininum specfications for subelements in Bricolage, which allows administrators to define how many fields and elements can occur in a story or media document. All I had to do was add a few UI tweaks to support the new fields and their specification in the story profile, and all was ready to go. Oh, and I did have to go back and make the SOAP interface work with the feature, but the only reason it never did was lazy hacking of the SOAP interface (way before Christian&#x2019;s time). Nice work, Christian, and thank you for your contribution!</p></li>
  <li><p>I fixed a few bugs with Arsu Andrei&#x2019;s port of Bricolage to MySQL, which was his <a href="http://code.google.com/soc/2006/bric/appinfo.html?csaid=61E07C2D23D20FEC" title="Database porting SOC Proposal">2006 Google Summer of Code project</a>. Arsu did a terrific job with the port, with only a few minor things missed that he likely could not have caught anyway. This work had already been merged into the trunk. Thanks Arsu!</p></li>
  <li><p>I fixed a bunch of bugs from <a href="http://mroch.com/" title="Marshall Roch">Marshall Roch</a>&#x2019;s AJAXification of Bricolage, carried out during his <a href="http://code.google.com/soc/2006/bric/appinfo.html?csaid=934CEE0CC330C22A" title="AJAX element editing SOC proposal">2006 Google Summer of Code project</a>. Marshall actually did a lot more stuff than he&#x2019;d planned, as it all went quite smoothly. I found only a few minor oversights that I was able to eaily address. This work represents the single most visible change to how users user Bricolage since we launched the probject back in 2001. Editing stories, in particular, is now a <em>lot</em> cleaner, with far fewer page loads. Thanks a million, Marshall!</p></li>
  <li><p>I completed the work started by Chris Heiland of the University of Washington, Bothell, and <a href="http://use.perl.org/~slanning/" title="Scott Lanning on use Perl">Scott Lanning</a> of the World Health Organization to port Bricolage to Apache 2. They really did most of the hard work, and I just spent several days integrating everything, making sure all the features work, and updating the installer to handle differences in configuration. I thought this would take me a day or two, but it actually took the better part of a week! So much has changed, but in truth Bricolage is now better for running on mod_perl 2. Expect to see Apache 2 bet the recommended platform for Bricolage in a release in the near future.</p></li>
  <li><p>I integrated a number of patches from Brian Smith of <a href="http://www.gossamer-threads.com/" title="Gossamer Threads: Creative Web Engineering">Gossamer Threads</a> to allow the installer to be run as a non-root user. The key here is if the installer has to become the database super user, which is required for <a href="http://www.depesz.com/index.php/2007/10/04/ident/" title="depesz: “FATAL: Ident authentication failed”, or how cool ideas get bad usage schemas">ident authentication</a>, and of course whether files are to be installed somewhere on the system requiring super user access. This work is not done, yet, as <code>make upgrade</code> and <code>make uninstall</code> are not quite there yet. But we&#x2019;re getting there, and it should be all done in time for 2.0, thanks to Brian.</p></li>
  <li><p>I added support for a whole slew of environment variables to the installer. Now you can set environment variables to override default settings for installation parameters, such as choice of RDBMS, Apache, location of an SSL cert and key, whether to support SLL, and lots of other stuff, besides. This is all documented in the <q>Quick Installation Instructions</q> section of <a href="http://www.bricolage.cc/docs/devel/api/?Bric::Admin" title="Bric::Admin documentation">Bric::Admin/<em>INSTALL</em></a>.</p></li>
  <li><p>I fully tested and fixed a lot of bugs leftover from making the installer database- and Apache-neutral. Now all of these commands should work perfectly:</p>
    <ul>
      <li>make</li>
      <li>make cpan</li>
      <li>make test</li>
      <li>make install</li>
      <li>make devtest</li>
      <li>make clone</li>
      <li>make uninstall</li>
    </ul>
  </li>
  <li><p>I improved the DHTML functionality of the <q>Add More</q> widget, which is used to add contact information to users and contributors, rules to alert types, and extensions to media types. I think it&#x2019;s pretty slick, now! This was built on Marshall&#x2019;s AJAX work.</p></li>
</ul>

<p>All of these changes have been integrated into the Bricolage <a href="http://svn.bricolage.cc/bricolage/trunk/" title="Bricolage in Subversion">trunk</a> and I&#x2019;ve pushed out a <a href="http://bricolage.cc/news/announce/2008/04/30/bricolage-1.11.0/" title="Bricolage-Devel 1.11.0 Released">developer release</a> today. Please do check out all the goodness on a test box and <a href="http://bricolage.cc/support/lists" title="The Bricolage Mail Lists">send feedback</a> or <a href="http://bugs.bricolage.cc/" title="Bricolage Bug Tracker">file bug reports</a>! There are only a couple of other features waiting to go into Bricolage before we start the release candidate process. And, oh yeah, tht title of this blog post? It&#x2019;s not a lie. The next production release of Bricolage, based on all this work, will be Bricolage 2.0. Enough of the features we&#x2019;d planned for Bricolage lo these many years ago are in the trunk that the new version number is warranted. I for one will be thrilled to see 2.0 ship in time for <a href="http://en.oreilly.com/oscon2008/public/content/home" title="OSCON 2008">OSCON</a>.</p>

<p>And in case it isn&#x2019;t already clear, <em>many</em> thanks to the <a href="http://code.google.com/soc/2006/" title="Google Summer of Code 2006">Google Summer of Code</a> and participating students for the great contributions! This release would not have been possible without them.</p>

<p>Also in the news today, the Bricolage server has been replaced! The new server, which hosts the <a href="http://www.bricolage.cc/" title="The Bricolage Web site">Web site</a>, the <a href="http://wiki.bricolage.cc/" title="The Bricolage Wiki">wiki</a> and the instance of Bricolage used to manage the site itself, is graciously provided by the kind folks at <a href="http://www.gossamer-threads.com/" title="Gossamer Threads: Creative Web Engineering">Gossamer Threads</a>. The server is Gossamer Threads&#x2019;s way of giving back to the Bricolage community as they prepare to launch a hosted Bricolage solution. Thaks GT!</p>

<p>The old Bricolage server was provided by <a href="http://pair.net" title="pair Networks">pair Networds</a> for the last five years. I&#x2019;d just like to thank pair for the generous five-year loan of that box, which helped provided infrastructure for both Bricolage <em>and</em> Kineticode. Thank you, pair!</p>

<p>And with that, I&#x2019;m going heads-down on some other projects. I&#x2019;ll pop back up to make sure that Bricolage 2.0 is released in a few months, but otherwise, I&#x2019;m on to other things again for a while. Watch this space for details!</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/vcs/svn/modify_author.html">
    <title>How to Globally Change a Subversion Username</title>
    <link>http://justatheory.com/computers/vcs/svn/modify_author.html</link>
    <description></description>
    <dc:subject>/computers/vcs/svn</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-04-10T20:33-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>I successfully migrated the Kineticode <a href="https://svn.kineticode.com/" title="The Kineticode Subversion repository">Subversion repository</a> to a new server yesterday. Everything works great. But after my first commit, I realized that, while my username on the old server was <q>theory,</q> on the new server it&#x2019;s <q>david</q>. Subversion works fine, of course, and I was able to start committing from old checkouts using the new username, but I realized that sites like <a href="http://www.ohloh.net/" title="ohloh, the open source network">Ohloh</a> would pick up the two usernames as separate usernames. So I wanted to update all of the 3630 existing revisions that were mine to use the new username.</p>

<p>Unfortunately, I couldn&#x2019;t find much on how to do this in a quick Googling. But I quickly figured out that what I need to do was to <code>svnadmin dump</code> my repository, modify the dump, and then load it again. The Subversion dump format has all these fields for tracking the content-lengths of various, so doing the update was a bit tricky. But I wrote the script here to track things, and it worked great for me. So here it is for others to reference and use.</p>

<pre>
#!/usr/bin/perl -w

use strict;
use warnings;

while (&lt;&gt;) {
    print;
    next unless /^Revision-number:\s+\d+$/;

    # Grab the content lengths. Examples:
    # Prop-content-length: 139
    # Content-length: 139
    my $plen_line = &lt;&gt;;
    my $clen_line = &lt;&gt;;

    unless ( $plen_line =~ /^Prop-content-length:\s+\d+$/ ) {
        # Nothing we want to change.
        print $plen_line, $clen_line;
        next;
    }

    my @lines;
    while ( &lt;&gt; ) {
        if ( /^PROPS-END$/ ) {
            # finish.
            print $plen_line, $clen_line, @lines, $_;
            last;
        }

        push @lines, $_;

        if ( /^svn:author$/ ) {
            # Grab the author content length. Example:
            # V 6
            my $alen_line = &lt;&gt;;

            # Grab the author name.
            my $auth = &lt;&gt;;

            if ( $auth =~ s/^theory$/david/ ) {
                # Adjust the content lengths.
                for my $line ( $plen_line, $clen_line, $alen_line ) {
                    $line =~ s/(\d+)$/$1 - 1/e;
                }
            }
            print $plen_line, $clen_line, @lines, $alen_line, $auth;
            last;
        }
    }
}
</pre>

<p>To use it, save it to a file, say <em>svn_author</em>, then change line 40 to your old and new usernames. Then, on line 43, change the <code>$1 - 1</code> bit to be correct for the difference between the usernames you&#x2019;re changing. For example, if you&#x2019;re changing your username from, say, <q>shane</q> to <q>chromatic,</q> the new name is five characters longer, so you&#x2019;d make it <code>$1 + 5</code>.</p>

<p>Now, run it like so:</p>

<pre>
svnadmin dump /path/to/svnroot &gt; svndump.out
perl svn_author svndump.out &gt; svndump.in
svnadmin create /path/to/new/svnroot
svnadmin load /path/to/new/svnroot &lt; svndump.in
</pre>

<p>And that&#x2019;s it! Feel free to take this code and do with it what you like, including fix any bugs, add command-line options, support changing multiple authors at once, or whatever. Share and enjoy.</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/programming/perl/modules/svnnotify-2.70.html">
    <title>SVN::Notify 2.70: Output Filtering and Character Encoding</title>
    <link>http://justatheory.com/computers/programming/perl/modules/svnnotify-2.70.html</link>
    <description></description>
    <dc:subject>/computers/programming/perl/modules</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-02-29T17:36-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>I&#x2019;m very pleased to announce the release of <a href="http://search.cpan.org/dist/SVN-Notify/" title="SVN::Notify on CPAN">SVN::Notify</a> 2.70. You can see an example of its colordiff output <a href="/computers/programming/perl/modules/svnnotify-2.70_colordiff_example.html" title="Example output from SVN::Notify::HTML::ColorDiff 2.70">here</a>. This is a major release that I&#x2019;ve spent the last several weeks polishing and tweaking to get just right. There are quite a few <a href="http://search.cpan.org/src/DWHEELER/SVN-Notify-2.70/Changes" title="SVN::Notify Changes">changes</a>, but the two most important are imporoved character encoding support and output filtering.</p>

<h3>Improved Character Encoding Support</h3>

<p>I&#x2019;ve had a number of bug reports regarding issues with character encodings. Particularly for folks working in Europe and Asia, but really for <em>anyone</em> using multibyte characters in their source code and log messages (and we all do nowadays, don&#x2019;t we?), it has been difficult to find the proper incantation to get SVN::Notify to convert data from and to their proper encodings. Using a patch from Toshikazu Kinkoh as a starting-point, and with a lot of reading and experimentation, as well as regular and patient tests on Toshikazu&#x2019;s and Martin Lindhe&#x2019;s production systems, I think I&#x2019;ve finally got it nailed down.</p>

<p>Now you can use the <code>&#x002d;&#x002d;encoding</code> (formerly <code>&#x002d;&#x002d;charset</code>), <code>&#x002d;&#x002d;svn-encoding</code>, and <code>&#x002d;&#x002d;diff-encoding</code> options—as well as <code>&#x002d;&#x002d;language</code>—to get SVN::Notify to do the right thing. As long as your Subversion server&#x2019;s OS supports an appropriate locale, you should be golden (mine is old, with no UTF-8 locales :\). And if all else fails, you can still set the <code>$LANG</code> environment variable before executing <code>svnnotify</code>.</p>

<p>There is actually a fair bit to know about encodings to get it to work properly, but if you use UTF-8 throughout and your OS supports UTF-8 locales, you shouldn&#x2019;t have to do anything. You might have to set <code>&#x002d;&#x002d;language</code> in order to get it to use the proper locale. See the new <a href="http://search.cpan.org/dist/SVN-Notify/lib/SVN/Notify.pm#Character_Encoding_Support" title="Character Encoding Support in SVN::Notify">documentation of the encoding support</a> for all the details. And if you still have problems, please do <a href="https://rt.cpan.org/Ticket/Create.html?Queue=SVN-Notify" title="Open a Ticket for SVN::Notify">let me know</a>.</p>

<h3>Output Filtering</h3>

<p>Much sexier is the addition of output filtering in SVN::Notify 2.70. I got pretty tired of getting feature requests for what are essentially formatting modifications, such as <a href="https://rt.cpan.org/Ticket/Display.html?id=26944" title="SVN::Notify feature requst for KDE keywords support">this one</a> requesting support for KDE-style <a href="http://techbase.kde.org/Policies/SVN_Commit_Policy#Special_keywords_in_SVN_log_messages" title="KDE TechBase: Special keywords in SVN log messages">keyword support</a>. I myself was using <a href="http://trac.edgewall.org/wiki/WikiFormatting" title="Trac Wiki Formatting Syntax">Trac wiki syntax</a> in commit messages on a <a href="http://iwantsandy.com/" title="Sandy: Your virtual personal assistant">recent project</a> and wanted to see them converted to HTML for messages output by SVN::Notify::HTML::ColorDiff.</p>

<p>So I finally sat down and gave some though on how to implement a simple plugin architecture for SVN::Notify. When I realized that it was generally just formatting that people wanted, it became simpler: I just needed a way to allow folks to write simple output filters. The solution I came up with was to just use Perl. Output filters are simply subroutines named for the kind of output they filter. They live in perl packages. That&#x2019;s it.</p>

<p>For example, say that your developers write their commit log messages in <a href="http://www.textism.com/tools/textile/" title="Textile">Textile</a>, and rather than receive them stuck inside <code>&lt;pre&gt;</code> tags, you&#x2019;d like them converted to HTML. It&#x2019;s simple. Just put this code in a Perl module file:</p>

<pre>
package SVN::Notify::Filter::Textile;
use Text::Textile ();

sub log_message {
    my ($notifier, $lines) = @_;
    return $lines unless $notify->content_type eq &#x0027;text/html&#x0027;;
    return [ Text::Textile->new->process( join $/, @$lines ) ];
}
</pre>

<p>Put the file, <em>SVN/Notify/Filter/Textile.pm</em> somewhere in a Perl library directory. Then use the new <code>&#x002d;&#x002d;filter</code> option to <code>svnnotify</code> to put it to work:</p>

<pre>
svnnotify -p "$1" -r "$2" &#x002d;&#x002d;handler HTML::ColorDiff &#x002d;&#x002d;filter Textile
</pre>

<p>Yep, that&#x2019;s it! SVN::Notify will find the filter module, load it, register its filtering subroutine, and then call it at the appropriate time. Of course, there are a lot of things you can filter; consult the  <a href="http://search.cpan.org/dist/SVN-Notify/lib/SVN/Notify/Filter.pm" title="SVN::Notify Output Filtering Documentation">complete documentation</a> for all of the details. But hopefully this gives you a flavor for how easy it is to write new filters for SVN::Notify. I&#x2019;m hoping that all those folks who want featurs can now stop bugging me and writing their own filters to do the job, and uploading them to CPAN for all to share!</p>

<p>To get things started, I scratched my own itch, writing a <a href="http://search.cpan.org/dist/SVN-Notify/lib/SVN/Notify/Filter/Trac.pm" title="SVN::Notify::Filter::Trac Documentation">Trac filter</a> myself. The filter is almost as simple as the Textile example above, but I also spent quite a bit of time tweaking the CSS so that most of the Trac-generated HTML looks good. You can see an example <a href="/computers/programming/perl/modules/svnnotify-2.70_trac_example.html" title="Example output from SVN::Notify 2.70 and modified by the Trac filter">right here</a>. Thanks to a number of bug fixes in  <a href="http://search.cpan.org/dist/Text-Trac/">Text::Trac</a>, as well as Trac-specific CSS added via a filter on CSS output, it works beautifully. If I&#x2019;m feeling motivated in the next week or so, I&#x2019;ll create a separate CPAN distribution with just a Markdown filter and upload it. That will create a nice distriution example for folks to copy to creat their own. Or maybe someone on the Lazy Web Will do it for me! Maybe <em>you?</em></p>

<p>I wish I&#x2019;d thought to do this from the beginning; it would have saved me from having to add so many features/cruft to SVN::Notify over the years. Here&#x2019;s a quick list of the features that likely could have been implemented via filters instead of added to the core:</p>

<ul>
  <li><code>&#x002d;&#x002d;user-domain</code>: Combine the SVN username with a domain for the <q>From</q> header.</li>
  <li><code>&#x002d;&#x002d;add-header</code>: Add a header to the message.</li>
  <li><code>&#x002d;&#x002d;reply-to</code>: Add a specific header to the message.</li>
  <li>SVN::Notify::HTML::ColorDiff: Frankly, looking back on it, I don&#x2019;t know why I didn&#x2019;t just put this support right into SVN::Notify::HTML. But even if I hadn&#x2019;t, it could have been implemented via filters.</li>
  <li><code>&#x002d;&#x002d;subject-prefix:</code>: Modify the message subject.</li>
  <li><code>&#x002d;&#x002d;subject-cx</code>: Add the commit context to the subject.</li>
  <li><code>&#x002d;&#x002d;strip-cx-regex</code>: More subject context modification.</li>
  <li><code>&#x002d;&#x002d;no-first-line</code>: Another subject filter.</li>
  <li><code>&#x002d;&#x002d;max-sub-length</code>: Yet another!</li>
  <li><code>&#x002d;&#x002d;max-diff-length</code>: A filter could truncate the diff, although this might be tricky with the HTML formatting.</li>
  <li><code>&#x002d;&#x002d;author-url</code>: Modify the metadata section to add a link to the author URL.</li>
  <li><code>&#x002d;&#x002d;revision-url</code>: Ditto for the revision URL.</li>
  <li><code>&#x002d;&#x002d;ticket-map</code>: Filter the log message for various ticketing system strings to convert to URLs. This also encompasses the old <code>&#x002d;&#x002d;rt-url</code>, <code>&#x002d;&#x002d;bugzilla-url</code>, <code>&#x002d;&#x002d;gnats-url</code>, and <code>&#x002d;&#x002d;jira-url</code> options.</li>
  <li><code>&#x002d;&#x002d;header</code>: Filter the beginning of the message.</li>
  <li><code>&#x002d;&#x002d;footer</code>: Filter the end of the message.</li>
  <li><code>&#x002d;&#x002d;linkize</code>: Filter the log message to convert URLs to links for HTML messages.</li>
  <li><code>&#x002d;&#x002d;css-url</code>: Filter the CSS to modify it, or filter the start of the HTML to add a link to an external CSS URL.</li>
  <li><code>&#x002d;&#x002d;wrap-log</code>: Reformat the log message for HTML.</li>
</ul>

<p>Yes, <em>really!</em> That&#x2019;s about half the functionality right there. I&#x2019;m glad that I won&#x2019;t have to add any more like that; filters are a <em>much</em> better way to go.</p>

<p>So download it, install it, write some filters, get your multibyte characters output properly, and enjoy! And as usual, send me your <a href="https://rt.cpan.org/Ticket/Create.html?Queue=SVN-Notify" title="Open a Ticket for SVN::Notify">bug reports</a>, but implement your own improvements using filters!</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/os/macosx/cd_rom_filesystems.html">
    <title>Mac OS X CD-ROM File Systems WTF?</title>
    <link>http://justatheory.com/computers/os/macosx/cd_rom_filesystems.html</link>
    <description></description>
    <dc:subject>/computers/os/macosx</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-02-11T00:19-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>Didn&#x2019;t it used to be the case that when you used the Mac OS X Finder to
burn a CD-ROM that you could then mount that CD-ROM on a Windows box? In the
last few months, I&#x2019;m suddenly finding that this is no longer the case. So now
I have to use <code>hdiutil</code> to convert a <em>.dmg</em> file to the
Joliet and ISO9660 file systems:</p>

<pre>
hdiutil makehybrid -o image.iso -joliet -iso image.dmg
</pre>

<p>And <em>then</em> I could burn a CD readable on Windows. What the fuck? I
burned three CDs that were then useless to me before I finally dug
up <a href="http://www.macosxhints.com/article.php?query=dmg&amp;story=20050819185219196"
title="Mac OS X Hints: “DVD image manipulation via hdiutil”">this hint</a>.
And I had this problem with CDs burned by Tiger, too, last summer, so it&#x2019;s not
just Leopard. It seems to me that Mac OS X should always default to building a
hybrid CD that&#x2019;s then readable by Windows, Linux, and everything else. Why
doesn&#x2019;t it?</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/internet/mail/need_imap_solution.html">
    <title>Need Suggestions for IMAP Solution and Migration</title>
    <link>http://justatheory.com/computers/internet/mail/need_imap_solution.html</link>
    <description></description>
    <dc:subject>/computers/internet/mail</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-02-03T22:49-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>For the last several years, I&#x2019;ve run
a <a href="http://www.courier-mta.org/imap/" title="Courier-IMAP
Home">Courier-IMAP</a> mail server for all of the mail for <a href="/"
title="Just a Theory">this site</a>, <a href="http://www.kineticode.com/"
title="Kineticode Home">Kineticode</a>, <a href="http://www.strongrrl.com/"
title="Strongrrl Home">Strongrrl</a> and other domains. We mainly used
Mail.app on Mac OS X to communicate with the server, and it worked really
well. Today,
<a href="http://www.strongrrl.com/" title="Julie Wheeler is principal at
Strongrrl">Julie</a> has over 3 GB of mail data, and I have around 1.5 GB, all
managed via IMAP.</p>

<p>Recently, I decided it was time to move the mail elsewhere. I&#x2019;ve been
meaning to do it for a while, primarily because the server I was using is now
used for the <a href="http://bricolage.cc" title="Bricolage CMS
Home">Bricolage</a> project, and because I never set up any spam filtering.
Julie was suddenly getting 100s of spam messages in her inbox. (It really
didn&#x2019;t help that she was still using Panther.) So on the advice of a good
friend who had been evaluating various mail services&#x2014;and who for now
shall go nameless and therefor blameless&#x2014;I moved all of our mail to
<a href="http://www.fusemail.com/" title="FuseMail Home">FuseMail</a>.</p>

<p>At first this seamed like a pretty good solution. Our spam rates went way
down, I could set up unlimited mail lists, aliases, and forwards, and
there was a migration tool that automated moving all of our existing
mail from the old IMAP server to the new one. There were some glitches with
the migration tool, but in the end all of our mail was moved and in tact.</p>

<p>But that&#x2019;s when I started to notice the issues. To summarize:</p>

<ul>
  <li>Mail put into the <q>Sent Items</q> folder by Mail.app was marked as
  unread. This didn&#x2019;t happen on the old server, and apparently has something
  to so with how FuseMail names the sent folder: <q>Sent Items</q> rather
  than <q>Sent Messages.</q></li>
  <li>Mail.app is syncing <em>constantly</em>. Even once it had successfully
  synced the all of our email in all of our IMAP folders (which took
  <em>days</em>, it is syncing all the time, to the extent that I am sometimes
  waiting for up to a minute to read a mail when I double-click it, because
  there are all these other threads doing stuff and taking up all the
  resources. It can take several minutes for mail I&#x2019;m sending to
  be <em>sent</em> (though that might be a delay in Mail.app copying the
  message to the Sent Items folder rather than the actual sending).</li>
  <li>Deleting mail takes for<em>ever!</em> This is probably the same issue as
  the syncing problem, but when I delete 1000s of messages from my Junk mail
  folder, it runs forever, and all other activities are delayed eve further.
  It turns out to be much more efficient to empty the Junk and Deleted Items
  folders using the webmail interface. And even then, Mail.app can take a
  while to delete locally-cached items from the folder when it syncs.</li>
  <li>Suddenly, Julie is getting a lot less spam. She went from several
  hundred messages showing up in her Junk mailbox a few days ago to just five
  on Friday and two yesterday&#x2014;one of which was a false positive). As she had
  been expecting a message from someone that she never got, this naturally
  made her very suspicious. Where is all the spam? Is she getting all of her
  mail?</li>
  <li>Since FuseMail uses a mailbox named <q>Sent Items</q> instead of the
  traditional <q>Sent Messages</q> for all sent mail, I asked if they could
  move the 1.8 GB of messages from Julie&#x2019;s Sent Messages to their Sent Items,
  since Mail.app would just choke on such a task. Though my request was
  escalated to the FuseMail developers, the answer came back <q>no.</q> Which
  I guess means that they&#x2019;re not
  using <a href="http://en.wikipedia.org/wiki/Maildir" title="Maildir as
  described by Wikipedia">Maildir</a>, because in that case it would be a
  cinch, n&#x2019;est pas?</li>
  <li>Backups are not really feasible. Of course FuseMail has its own
  backup regimen, but if I ever want to move elsewhere or deal with some
  sort of catastrophic failure, I want my own backups. There is no
  rsync service available for this (remember: no maildir), so I have to
  use the IMAP interface. I&#x2019;ve been trying for the past two weeks to get
  <a href="http://software.complete.org/offlineimap" title="OfflineIMAP
  Home">Offline IMAP</a> to back up all of Julie&#x2019;s and my mail, but it keeps
  choking. It gets a little further every time I run it; eventually it will
  get it all. But this only allows me to backup those accounts for which I
  happen to have a password. I have accounts set up for a few other users, but
  don&#x2019;t have access to their passwords, so I can&#x2019;t back them up. This does not
  make for very good support for corporate backup and retention
  policies.</li>
  <li>Mail forwarded by FuseMail has its <code>Return-Path</code> header
  modified. This made <a href="http://www.bestpractical.com/rt/"
  title="Request Tracker Home">RT</a> break until I hacked it to ignore that
  header (which is its by-default preferred header for identifying
  senders.</li>
</ul>

<p>So I&#x2019;m pretty fed up. It took me a week to get all of our mail on FuseMail,
and now I&#x2019;m looking at moving it off again (once OfflineIMAP finishes a full
sync). Grr. I&#x2019;m considering finding a virtual host somewhere and setting up my
own IMAP server again, but then I have the spam problem again. So then I could
use a forwarding service like <a href="http://www.pobox.com/" title="Pobox
Home">Pobox</a>, or I can set up my own spam filtering (something I had hoped
never to get into managing myself). My old IMAP server required very little
maintenance, which was nice, but then the span filtering stuff always seemed
daunting. Don&#x2019;t you have to update things all the time?a</p>

<p>But before I go off and do something else, and unlike before I moved to
FuseMail, I wanted to get an idea what other folks are doing? Do you use IMAP?
Do you use it to manage a shitload (read: Gigabytes) of mail? Do you get very
little spam and still get all of your valid mail? Are IMAP folder maintenance
actions fast for you (in Mail.app in particular)? Are you paying a
not-unreasonable amount of money for your setup? If you answered yes to all of
these questions, please, for the love of all that is good in this world, tell
me how you do it. I&#x2019;m looking for something that I don&#x2019;t have to work very
hard to maintain (hence my original attempt to have some company that
specializes in this stuff do it), but I&#x2019;ll do what I have to to make this
thing right. So how do you make it right? And if I have to run my own server,
where should I host it that won&#x2019;t cost me an arm and a leg?</p>

<p>Thanks for your help!</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/databases/postgresql/recurring_events.html">
    <title>How to Generate Recurring Events in the Database</title>
    <link>http://justatheory.com/computers/databases/postgresql/recurring_events.html</link>
    <description></description>
    <dc:subject>/computers/databases/postgresql</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-01-30T20:04-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>This is a followup to
my <a href="/computers/databases/postgresql/reducing_view_calculations.html"
title="Need Help Reducing View Calculations">request for help</a> fixing the
performance of a database view that generated recurrences for events. This was
an essential feature of <a href="http://iwantsandy.com" title="Sandy — your
free personal assistant">Sandy</a>, and thus important to get right. The idea
when I started was simple:</p>

<ul>
  <li>Add a <code>recurrence</code> domain to the database that supports a
  number of different values, including <q>daily</q>, <q>weekly</q>,
  and <q>monthly</q>.</li>
  <li>Add a <code>recurrence</code> column to the <code>events</code> table
  that identify how an event recurs.</li>
  <li>Add a <code>recurrence_dates</code> table that contains a pre-generated
  list of recurrences for a given date. I&#x2019;d populated this table with five
  years of dates, each one mapped to five years worth of recurrence dates (see
  the original<a
  href="/computers/databases/postgresql/reducing_view_calculations.html"
  title="Need Help Reducing View Calculations">blog entry</a> for more on the
  format of this table.</li>
  <li>Create a view that maps each <code>events</code> row to its date and
  recurrence in the <code>recurrence_dates</code> table.</li>
  <li>Profit.</li>
</ul>

<p>It was this last bullet point that didn&#x2019;t quite work out: although the data
was perfectly accurate, queries for a lot of events in the view were
<em>very</em> expensive. I mean, the query could run for 3-4 minutes. It was
just crazy! I couldn&#x2019;t figure out the problem, so I posted my <a
href="/computers/databases/postgresql/reducing_view_calculations.html"
title="Need Help Reducing View Calculations">request for help</a>. It was
through discussions that followed with <a href="http://www.depesz.com/"
title="&lt;/depesz&gt; blog">depesz</a> that I finally figured out what the
problem was: Although I was usually selecting only a week&#x2019;s or months worth of
events, the view was calculating rows for all five years worth of data for all
of the events for a given user. Um, <em>not efficient.</em></p>

<p>So here I finally document how, with a lot of help and example code from
depesz, I solved the problem. The trick was to use a function instead of a
view to generate the recurring event rows, and to limit it only to the dates
we&#x2019;re interested in. For convenience sake, I broke this down into two PL/pgSQL
functions: one to generate recurring dates and one to return the recurring
event rows. But first, here&#x2019;s the <code>recurrence</code> domain and the
<code>events</code> table, both of which are unchanged from the original
approach:</p>

<pre>
CREATE DOMAIN recurrence AS TEXT
CHECK ( VALUE IN ( &#x0027;none&#x0027;, &#x0027;daily&#x0027;, &#x0027;weekly&#x0027;, &#x0027;monthly&#x0027; ) );

CREATE TABLE events (
    id         SERIAL     PRIMARY KEY,
    user_id    INTEGER    NOT NULL,
    starts_at  TIMESTAMP  NOT NULL,
    start_tz   TEXT       NOT NULL,
    ends_at    TIMESTAMP,
    end_tz     TEXT       NOT NULL,
    recurrence RECURRENCE NOT NULL DEFAULT &#x0027;none&#x0027;
);
</pre>

<p>Just assume the <code>user_id</code> is a foreign key. Now let&#x2019;s populate
this table with some data. For the purposes of this demonstration, I&#x2019;m going
to create one event per day for 1000 days, evenly divided between daily,
weekly, monthly, and no recurrences, as well as five different times of day
and six different durations:</p>

<pre>
INSERT INTO events (user_id, starts_at, start_tz, ends_at, end_tz, recurrence)
SELECT 1,
       ts::timestamp,
       &#x0027;PST8PDT&#x0027;,
       ts::timestamp + dur::interval,
       &#x0027;PST8PDT&#x0027;,
       recur
  FROM (
    SELECT &#x0027;2007-12-19&#x0027;::date + i || &#x0027; &#x0027; || CASE i % 5
               WHEN 0 THEN &#x0027;06:00&#x0027;
               WHEN 1 THEN &#x0027;10:00&#x0027;
               WHEN 2 THEN &#x0027;14:00&#x0027;
               WHEN 3 THEN &#x0027;18:00&#x0027;
               ELSE        &#x0027;22:30&#x0027;
               END,
           CASE i % 6
               WHEN 0 THEN &#x0027;2 hours&#x0027;
               WHEN 1 THEN &#x0027;1 hour&#x0027;
               WHEN 2 THEN &#x0027;45 minutes&#x0027;
               WHEN 3 THEN &#x0027;3.5 hours&#x0027;
               WHEN 4 THEN &#x0027;15 minutes&#x0027;
               ELSE        &#x0027;30 minutes&#x0027;
               END,
           CASE i % 4
               WHEN 0 THEN &#x0027;daily&#x0027;
               WHEN 1 THEN &#x0027;weekly&#x0027;
               WHEN 2 THEN &#x0027;monthly&#x0027;
               ELSE        &#x0027;none&#x0027;
               END
    FROM generate_series(1, 1000) as gen(i)
  ) AS ser( ts, dur, recur);
</pre>

<p>This gives us some nicely distributed data:</p>

<pre>
try=# select * from events limit 10;
  id  | user_id |      starts_at      | start_tz |       ends_at       | end_tz  | recurrence 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
    1 |       1 | 2007-12-20 10:00:00 | PST8PDT  | 2007-12-20 11:00:00 | PST8PDT | weekly
    2 |       1 | 2007-12-21 14:00:00 | PST8PDT  | 2007-12-21 14:45:00 | PST8PDT | monthly
    3 |       1 | 2007-12-22 18:00:00 | PST8PDT  | 2007-12-22 21:30:00 | PST8PDT | none
    4 |       1 | 2007-12-23 22:30:00 | PST8PDT  | 2007-12-23 22:45:00 | PST8PDT | daily
    5 |       1 | 2007-12-24 06:00:00 | PST8PDT  | 2007-12-24 06:30:00 | PST8PDT | weekly
    6 |       1 | 2007-12-25 10:00:00 | PST8PDT  | 2007-12-25 12:00:00 | PST8PDT | monthly
    7 |       1 | 2007-12-26 14:00:00 | PST8PDT  | 2007-12-26 15:00:00 | PST8PDT | none
    8 |       1 | 2007-12-27 18:00:00 | PST8PDT  | 2007-12-27 18:45:00 | PST8PDT | daily
    9 |       1 | 2007-12-28 22:30:00 | PST8PDT  | 2007-12-29 02:00:00 | PST8PDT | weekly
   10 |       1 | 2007-12-29 06:00:00 | PST8PDT  | 2007-12-29 06:15:00 | PST8PDT | monthly
(10 rows)
</pre>

<p>Now let&#x2019;s get to the recurring date function:</p>

<pre>
CREATE OR REPLACE FUNCTION  generate_recurrences(
    recurs RECURRENCE, 
    start_date DATE,
    end_date DATE
)
    RETURNS setof DATE
    LANGUAGE plpgsql IMMUTABLE
    AS $BODY$
DECLARE
    next_date DATE := start_date;
    duration  INTERVAL;
    day       INTERVAL;
    check     TEXT;
BEGIN
    IF recurs = &#x0027;none&#x0027; THEN
        &#x002d;&#x002d; Only one date ever.
        RETURN next next_date;
    ELSIF recurs = &#x0027;weekly&#x0027; THEN
        duration := &#x0027;7 days&#x0027;::interval;
        WHILE next_date &lt;= end_date LOOP
            RETURN NEXT next_date;
            next_date := next_date + duration;
        END LOOP;
    ELSIF recurs = &#x0027;daily&#x0027; THEN
        duration := &#x0027;1 day&#x0027;::interval;
        WHILE next_date &lt;= end_date LOOP
            RETURN NEXT next_date;
            next_date := next_date + duration;
        END LOOP;
    ELSIF recurs = &#x0027;monthly&#x0027; THEN
        duration := &#x0027;27 days&#x0027;::interval;
        day      := &#x0027;1 day&#x0027;::interval;
        check    := to_char(start_date, &#x0027;DD&#x0027;);
        WHILE next_date &lt;= end_date LOOP
            RETURN NEXT next_date;
            next_date := next_date + duration;
            WHILE to_char(next_date, &#x0027;DD&#x0027;) &lt;&gt; check LOOP
                next_date := next_date + day;
            END LOOP;
        END LOOP;
    ELSE
        &#x002d;&#x002d; Someone needs to update this function, methinks.
        RAISE EXCEPTION &#x0027;Recurrence % not supported by generate_recurrences()&#x0027;, recurs;
    END IF;
END;
$BODY$;
</pre>

<p>The idea here is pretty simple: pass in a recurrence (<q>daily</q>,
<q>weekly</q>, or <q>monthly</q>), a start date, and an end date, and get
back a set of all the recurrence dates between the start and end dates:</p>

<pre>
try=# \timing
Timing is on.
try=# select * from  generate_recurrences(&#x0027;daily&#x0027;, &#x0027;2008&#x002d;01&#x002d;29&#x0027;, &#x0027;2008&#x002d;02&#x002d;05&#x0027;);
 generate_recurrences 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 2008&#x002d;01&#x002d;29
 2008&#x002d;01&#x002d;30
 2008&#x002d;01&#x002d;31
 2008&#x002d;02&#x002d;01
 2008&#x002d;02&#x002d;02
 2008&#x002d;02&#x002d;03
 2008&#x002d;02&#x002d;04
 2008&#x002d;02&#x002d;05
(8 rows)

Time: 0.548 ms
try=# select * from  generate_recurrences(&#x0027;weekly&#x0027;, &#x0027;2008&#x002d;01&#x002d;29&#x0027;, &#x0027;2008&#x002d;03&#x002d;05&#x0027;);
 generate_recurrences 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 2008&#x002d;01&#x002d;29
 2008&#x002d;02&#x002d;05
 2008&#x002d;02&#x002d;12
 2008&#x002d;02&#x002d;19
 2008&#x002d;02&#x002d;26
 2008&#x002d;03&#x002d;04
(6 rows)

Time: 0.670 ms
try=# select * from  generate_recurrences(&#x0027;monthly&#x0027;, &#x0027;2008&#x002d;01&#x002d;29&#x0027;, &#x0027;2008&#x002d;05&#x002d;05&#x0027;);
 generate_recurrences 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 2008&#x002d;01&#x002d;29
 2008&#x002d;02&#x002d;29
 2008&#x002d;03&#x002d;29
 2008&#x002d;04&#x002d;29
(4 rows)

Time: 0.644 ms
</pre>

<p>Not bad, eh? And PostgreSQL&#x2019;s date and interval calculation operators are
<a
href="http://www.depesz.com/index.php/2007/12/27/how-many-1sts-of-any-month-were-sundays-since-1901-01-01/"
title="&lt;depesz/&gt;: how many 1sts of any month were sundays - since
1901-01-01?"><em>wicked fast</em></a>. Check out how long it dates to generate
two years worth of daily recurrence dates:</p>

<pre>
try=# select * from  generate_recurrences(&#x0027;daily&#x0027;, &#x0027;2008-01-29&#x0027;, &#x0027;2010-02-05&#x0027;);
 generate_recurrences 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 2008-01-29
 2008-01-30
 2008-01-31
...
 2010-02-03
 2010-02-04
 2010-02-05
(739 rows)

Time: 4.982 ms
</pre>

<p>Awesome. And the great thing about this function is that any time I need to
add new recurrences (yearly, biweekly, quarterly, weekends, weekdays, etc.), I
just modify the domain and this function and we&#x2019;re ready to go.</p>

<p>And now, part two: the recurring event function:</p>

<pre>
CREATE OR REPLACE FUNCTION recurring_events_for(
   for_user_id INTEGER,
   range_start TIMESTAMP,
   range_end   TIMESTAMP
)
   RETURNS SETOF events
   LANGUAGE plpgsql STABLE
   AS $BODY$
DECLARE
   event events;
   start_date TIMESTAMPTZ;
   start_time TEXT;
   ends_at    TIMESTAMPTZ;
   next_date  DATE;
   recurs_at  TIMESTAMPTZ;
BEGIN
   FOR event IN 
       SELECT *
         FROM events
        WHERE user_id = for_user_id
          AND (
                  recurrence &lt;&gt; &#x0027;none&#x0027;
              OR  (
                     recurrence = &#x0027;none&#x0027;
                 AND starts_at BETWEEN range_start AND range_end
              )
          )
    LOOP
       IF event.recurrence = &#x0027;none&#x0027; THEN
         RETURN NEXT event;
         CONTINUE;
       END IF;

       start_date := event.starts_at::timestamptz AT TIME ZONE event.start_tz;
       start_time := start_date::time::text;
       ends_at    := event.ends_at::timestamptz AT TIME ZONE event.end_tz;

       FOR next_date IN
           SELECT *
             FROM generate_recurrences(
                      event.recurrence,
                      start_date::date,
                      (range_end AT TIME ZONE event.start_tz)::date
             )
       LOOP
           recurs_at := (next_date || &#x0027; &#x0027; || start_time)::timestamp
               AT TIME ZONE event.start_tz;
           EXIT WHEN recurs_at &gt; range_end;
           CONTINUE WHEN recurs_at &lt; range_start AND ends_at &lt; range_start;
           event.starts_at := recurs_at;
           event.ends_at   := ends_at;
           RETURN NEXT event;
       END LOOP;
   END LOOP;
   RETURN;
END;
$BODY$;
</pre>

<p>The idea here is to select the appropriate events for a given user between
two dates, and for each event iterate over all of the recurrences between the
two dates and return a row for each one. So the lines starting with <code>FOR
event IN</code> and ending with <code>LOOP</code> select the original events,
looking for either recurring events or non-recurring events that are between
the two dates. Note that if you needed to, you could easily refine this query
for your particular application, or even use
PL/pgSQL&#x2019;s <a href="http://www.postgresql.org/docs/8.3/static/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING"
title="PL/pgSQL: Looping Through Query Results"><code>EXECUTE</code></a>
operator to dynamically generate queries to suit particular application
needs.</p>

<p>Next, the block starting with <code>IF event.recurrence =
&#x0027;none&#x0027; THEN</code> simply returns any non-recurring events.
Although the next block already handles this case, adding this optimization
eliminates a fair bit of calculation for the common case of non-recurring
events.</p>

<p>Then the lines starting with <code>FOR next_date IN</code> and ending with
<code>LOOP</code> select all of the dates for the recurrence in question,
using the <code>generate_recurrences()</code> function created earlier. From
<code>LOOP</code> to <code>END LOOP;</code>, the function generates the start
and end timestamps, exiting the loop when the start date falls after the range
or when it falls before the range and the end date falls after the range.
There are many other tweaks one could make here to modify which recurrences
are included and which are excluded. For example, if you had a column in the
<code>events</code> table such as <code>exclude_dates TIMESTAMP[] NOT NULL
DEFAULT &#x2018;{}&#x2019;</code> that stored an array of dates to ignore when generating
recurrences, you could just add this line to go ahead and exclude them from
the results returned by the function:</p>

<pre>
           CONTINUE WHEN recurs_at = ANY( exclude_dates );
</pre>

<p>But enough of the details: let&#x2019;s see how it works! Here&#x2019;s a query for a
week&#x2019;s worth of data:</p>

<pre>
try=# select * from recurring_events_for(1, &#x0027;2007-12-19&#x0027;, &#x0027;2007-12-26&#x0027;);
  id  | user_id |      starts_at      | start_tz |       ends_at       | end_tz  | recurrence 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
    1 |       1 | 2007-12-20 10:00:00 | PST8PDT  | 2007-12-20 03:00:00 | PST8PDT | weekly
    2 |       1 | 2007-12-21 14:00:00 | PST8PDT  | 2007-12-21 06:45:00 | PST8PDT | monthly
    3 |       1 | 2007-12-22 18:00:00 | PST8PDT  | 2007-12-22 21:30:00 | PST8PDT | none
    4 |       1 | 2007-12-23 22:30:00 | PST8PDT  | 2007-12-23 14:45:00 | PST8PDT | daily
    4 |       1 | 2007-12-24 22:30:00 | PST8PDT  | 2007-12-23 14:45:00 | PST8PDT | daily
    4 |       1 | 2007-12-25 22:30:00 | PST8PDT  | 2007-12-23 14:45:00 | PST8PDT | daily
    5 |       1 | 2007-12-24 06:00:00 | PST8PDT  | 2007-12-23 22:30:00 | PST8PDT | weekly
    6 |       1 | 2007-12-25 10:00:00 | PST8PDT  | 2007-12-25 04:00:00 | PST8PDT | monthly
(8 rows)

Time: 51.890 ms
</pre>

<p>Note the time it took to execute this query. 52 ms is a <em>hell</em> of a
lot faster than the several minutes it took to run a similar query using the
old view. Plus, I&#x2019;m not limited to just the recurrence dates I&#x2019;ve
pre-calculated in the old <code>recurrence_dates</code> table. Now we can use
whatever dates are supported by PostgreSQL. It&#x2019;s even fast when we look at a
year&#x2019;s worth of data:</p>

<pre>
try=# select * from recurring_events_for(1, &#x0027;2007-12-19&#x0027;, &#x0027;2008-12-19&#x0027;);
  id  | user_id |      starts_at      | start_tz |       ends_at       | end_tz  | recurrence 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
    1 |       1 | 2007-12-20 10:00:00 | PST8PDT  | 2007-12-20 03:00:00 | PST8PDT | weekly
    1 |       1 | 2007-12-27 10:00:00 | PST8PDT  | 2007-12-20 03:00:00 | PST8PDT | weekly
    1 |       1 | 2008-01-03 10:00:00 | PST8PDT  | 2007-12-20 03:00:00 | PST8PDT | weekly
    1 |       1 | 2008-01-10 10:00:00 | PST8PDT  | 2007-12-20 03:00:00 | PST8PDT | weekly
    1 |       1 | 2008-01-17 10:00:00 | PST8PDT  | 2007-12-20 03:00:00 | PST8PDT | weekly
    1 |       1 | 2008-01-24 10:00:00 | PST8PDT  | 2007-12-20 03:00:00 | PST8PDT | weekly
    1 |       1 | 2008-01-31 10:00:00 | PST8PDT  | 2007-12-20 03:00:00 | PST8PDT | weekly
...
  364 |       1 | 2008-12-17 22:30:00 | PST8PDT  | 2008-12-17 14:45:00 | PST8PDT | daily
  364 |       1 | 2008-12-18 22:30:00 | PST8PDT  | 2008-12-17 14:45:00 | PST8PDT | daily
  365 |       1 | 2008-12-18 06:00:00 | PST8PDT  | 2008-12-17 22:30:00 | PST8PDT | weekly
(19691 rows)

Time: 837.759 ms
</pre>

<p>Not stellar, but still respectable. Given that for a typical application, a
user will be looking at only a day&#x2019;s or a week&#x2019;s or a month&#x2019;s events at a time,
this seems to be an acceptable trade-off. I mean, how often will your users
need to see a list of 20,000 events? And even if a user <em>was</em>looking at
a year&#x2019;s worth of data, it&#x2019;s unlikely that 75% of them would be recurring as
in the example data here.</p>

<p>I was fucking <em>pumped</em> with this solution, and Sandy has hummed
along nicely since we put it into production. If you&#x2019;re interested in trying
it for yourself, I&#x2019;ve you can get all the SQL from this blog entry <a
href="/code/recurring_events.sql" title="Download the code for this blog
entry">here</a>.</p>

<p>The only thing I would like to have been able to do differently was to
encapsulate the <code>recurring_events_for()</code> function in a view. Such
would have made it much easier to actually <em>use</em> this solution in
Rails. If you know how to do that, please do leave a comment. As for how I
hacked Rails to use the function, well, that&#x2019;s a blog post for another
day.</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/internet/weblogs/teasers_atom_feed.html">
    <title>Teasers Only Atom Feed</title>
    <link>http://justatheory.com/computers/internet/weblogs/teasers_atom_feed.html</link>
    <description></description>
    <dc:subject>/computers/internet/weblogs</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-01-30T19:13-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<img src="/computers/internet/weblogs/feeds.png" alt="Select a feed" />

<p>I&#x2019;ve just added a new feed: <a href="/index.atomsum">teasers only</a>. It makes things a log shorter for those who just want to get a teaser for each blog entry, rather than <a href="/index.atom">complete entries</a>, such as <a href="http://planet.perl.org">Planet Perl</a> and <a href="http://www.planetpostgresql.org/">Planet PostgreSQL</a>.</p>

<p>Any questions or problems? Leave a comment. Thanks!</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/programming/ruby/postgres_gem_on_leopard.html">
    <title>Using sudo to Install the Postgres Gem on Leopard</title>
    <link>http://justatheory.com/computers/programming/ruby/postgres_gem_on_leopard.html</link>
    <description></description>
    <dc:subject>/computers/programming/ruby</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-01-22T01:22-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>Been getting this error with the latest postgres gem?</p>

<pre>
% sudo gem install postgres
Bulk updating Gem source index for: http://gems.rubyforge.org
Building native extensions.  This could take a while...
ERROR:  While executing gem ... (Gem::Installer::ExtensionBuildError)
   ERROR: Failed to build gem native extension.

ruby extconf.rb install postgres
checking for main() in -lpq... yes
checking for libpq-fe.h... yes
checking for libpq/libpq-fs.h... yes
checking for PQsetClientEncoding()... no
checking for pg_encoding_to_char()... no
checking for PQfreemem()... no
checking for PQserverVersion()... no
checking for PQescapeString()... no
creating Makefile
</pre>

<p>I have, too. I&#x2019;ve known about the fix for a while, thanks
to <a href="http://rubyforge.org/pipermail/ruby-pg-general/2007-December/000004.html"
title="[Ruby-pg-general] osx leopard">a post from maintainer Jeff Davis</a>
from last month. But I was unable to get it to work. But then I found this gem
of a comment (pun not intended) from <a href="http://glu.ttono.us/articles/2007/12/22/postgresql-gem-on-leopard-stock-gem-system" title="Gluttonous: postgresql gem on Leopard stock gem system">Gluttonous</a>:</p>

<blockquote><p>FYI, this does NOT work with sudo since sudo strips the env var
out. You must ‘sudo -s’ or ‘sudo su’ and run the command straight
up.</p></blockquote>

<p>D&#x2019;oh! I&#x2019;ve been doing this all this time:</p>

<pre>
ARCHFLAGS=&#x0027;-arch i386&#x0027; sudo gem install postgres
</pre>

<p>And getting the same failures. But this works beautifully:</p>

<pre>
sudo env ARCHFLAGS=&#x0027;-arch i386&#x0027; gem install postgres
</pre>

<p>And away we go!</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/autobiographical/im_back.html">
    <title>I'm Ba-aack!</title>
    <link>http://justatheory.com/autobiographical/im_back.html</link>
    <description></description>
    <dc:subject>/autobiographical</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2008-01-21T18:10-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<a href="http://iwantsandy.com/"><img src="http://iwantsandy.com/images/flair/iwantsandy_207x207.gif" alt="Sandy - your free personal email assistant" style="border:0" /></a>

<p>Yes indeed, I am back. Was I ever gone? Well, yes, I&#x2019;ve been rather busy
for the last 15 months.</p>

<p>But December 31 was my last day at <a href="http://www.valuesofn.com/"
title="Values of n home page">Values of n</a>. I&#x2019;m really pleased with the
work I did there. <a href="http://iwantsandy.com/" title="You want sandy,
trust me!">Sandy</a> in particular, was a pleasure to work with. I really
think that the work that Rael and I did with Sandy has been important work.
Dare I say potentially paradigm-shifting? At any rate, I&#x2019;m convinced that
Sandy is really going to go places. If you haven&#x2019;t signed on to become her
client, do try. Though I will no longer be as intimate with her as I have in
the past, we&#x2019;re still going to keep in touch—I&#x2019;m still her client. And Rael
will do a great job pushing forward with her.</p>

<p>So why did I leave? Well, the truth is that, after
<a href="http://www.legacy.com/Link.asp?I=LS000091810982X" title="Obituary for
Steven Rappaport">Julie&#x2019;s dad died</a> in July, I found that I no longer had
the resources to commit to the 80-100 hours/week required to work in a
startup. It was just no longer that important to me. Don&#x2019;t get me wrong, it
was rewarding work, but my priorities completely realigned. It was vital that
I continue helping Rael to get Sandy&#x2019;s career launched, but once that was
done, it was time for me to move on.</p>

<p>And what am I doing now? Well, first and foremost, I&#x2019;m taking a few months
off. I&#x2019;m going to spend a lot more time re-aquainting myself with the two
terrific women with whom I share a house, and just generally reset myself.
Take a few deep breaths. Relax and enjoy life a bit. Sleep in now and then.
That sort of thing.</p>

<p>That&#x2019;s not to say that I&#x2019;ll be sitting on my ass all the time. I have a
very long list of things I want to do during this time, including catch up on
my blogging (hence the title of this post), fix some bugs in Bricolage and
help get 2.0 out the door, update my Perl libraries (I&#x2019;ve got some great ideas
for improving <a href="http://search.cpan.org/SVN-Notify" title="SVN::Notify
on CPAN">SVN::Notify</a>), finally get all my digital photos organized, etc. I
already spent much of last week revamping our mail system (I outsourced it
to <a href="http://www.fusemail.com/" title="FuseMail home
page">FuseMail</a>). And all that&#x2019;s leaving aside all the things Julie and I
want to get done around the house. That&#x2019;s the <em>really</em> important
stuff.</p>

<p>But do watch for more blog posts in the coming months, too. There are a few
interesting things I want to write about, and I&#x2019;ve got some serious
catching-up to do. Interested in following along on my
adventures? <a href="http://twitter.com/Theory" title="My Twitter page">Follow
me</a> via Twitter.</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/databases/postgresql/timezone_validation.html">
    <title>Validating Time Zones in PostgreSQL</title>
    <link>http://justatheory.com/computers/databases/postgresql/timezone_validation.html</link>
    <description></description>
    <dc:subject>/computers/databases/postgresql</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2007-11-07T21:03-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>I recently needed to validate that a value stored in
a <code>TEXT</code>column was a valid time zone identifier. Why? Because I was
using its value inside the database to
<a href="/computers/databases/postgresql/reducing_view_calculations.html"
title="Need Help Reducing View Calculations">convert timestamp columns from
UTC to a valid zone</a>. So I set about writing a function I could use in a
constraint.</p>

<p>It turns out that PostgreSQL has a pretty nice view that lists all of the
time zones that it recognizes. It&#x2019;s called <code>pg_timezone_names</code>:</p>

<pre>
try=# select * from pg_timezone_names limit 5;
        name        | abbrev | utc_offset | is_dst 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 Africa/Abidjan     | GMT    | 00:00:00   | f
 Africa/Accra       | GMT    | 00:00:00   | f
 Africa/Addis_Ababa | EAT    | 03:00:00   | f
 Africa/Algiers     | CET    | 01:00:00   | f
 Africa/Asmara      | EAT    | 03:00:00   | f
(5 rows)
</pre>

<p>Cool. So all I had to do was to look up the value in this view. My first
stab at creating a time zone validation function therefore looked like
this:</p>

<pre>
CREATE OR REPLACE FUNCTION is_timezone( tz TEXT ) RETURNS BOOLEAN as $$
DECLARE
  bool BOOLEAN;
BEGIN
  SELECT TRUE INTO bool
    FROM pg_timezone_names
   WHERE LOWER(name) = LOWER(tz)
      OR LOWER(abbrev) = LOWER(tz);
  RETURN FOUND;
END;
$$ language plpgsql STABLE;
</pre>

<p>This should pretty well cover anything that PostgreSQL considers valid. So
does it work? You bet:</p>

<pre>
sandy_development=# \timing
Timing is on.
sandy_development=# select is_timezone(&#x2019;America/Los_Angeles&#x2019;);
 is_timezone 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 t
(1 row)

Time: 457.096 ms
sandy_development=# select is_timezone(&#x2019;Foo/Bar&#x2019;);
 is_timezone 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 f
(1 row)

Time: 472.752 ms
</pre>

<p>Perfect! Well, except for just one thing: performance is abysmal. A half
second per shot? Not very useful for constraints. And since
<code>pg_timezone_names</code> is a view (and, under that, a function), I
can&#x2019;t create indexes.</p>

<p>But then I did something dangerous: I started thinking. I realized that I
needed this function when our app started getting errors like this:</p>

<pre>
try=# select now() at time zone &#x2018;Foo/Bar&#x2019;;
ERROR:  time zone "Foo/Bar" not recognized
</pre>

<p>So the underlying C code throws an error when a time zone is invalid. What
if I could just trap the error? Well, PL/pgSQL conveniently has exception
handling, so I could do just that. But there was only one problem. PL/pgSQL&#x2019;s
exception handling syntax requires that you specify an error condition. Here&#x2019;s
what the documentation has:</p>

<pre>
EXCEPTION
    WHEN condition [ OR condition ... ] THEN
        handler_statements
    [ WHEN condition [ OR condition ... ] THEN
          handler_statements
      ... ]
END;
</pre>

<p>Conditions are
<a href="http://www.postgresql.org/docs/current/static/errcodes-appendix.html"
title="PostgreSQL Documentation: Appendix A. PostgreSQL Error Codes">error
codes</a>. But which one corresponds to the invalid time zone error? I tried a
few, but couldn&#x2019;t figure out which one. (Anyone know now to map errors you see
in <code>psql</code> to the error codes listed in Appendex A? Let me know!)
But really, my function just needed to do one thing. Couldn&#x2019;t I just trap any
old error?</p>

<p>A careful re-read of the PL/pgSQL documentation reveals that, yes, you can.
Use the condition <q>OTHERS,</q> and you can catch almost anything. With this
information in hand, I quickly wrote:</p>

<pre>
CREATE OR REPLACE FUNCTION is_timezone( tz TEXT ) RETURNS BOOLEAN as $$
DECLARE
  date TIMESTAMPTZ;
BEGIN
  date := now() AT TIME ZONE tz;
  RETURN TRUE;
EXCEPTION WHEN OTHERS THEN
  RETURN FALSE;
END;
$$ language plpgsql STABLE;
</pre>

<p>And how well does this one work?</p>

<pre>
sandy_development=# select is_timezone(&#x2019;America/Los_Angeles&#x2019;);
 is_timezone 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 t
(1 row)

Time: 3.009 ms
sandy_development=# select is_timezone(&#x2019;Foo/Bar&#x2019;);
 is_timezone 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 f
(1 row)

Time: 1.224 ms
</pre>

<p>Yes, I&#x2019;ll take 1-3 ms over 400-500 ms any day! I might even
<a href="http://www.postgresql.org/docs/current/static/sql-createdomain.html"
title="PostgreSQL Documentation: CREATE DOMAIN">create a domain</a> for this
and be done with it:</p>

<pre>
CREATE DOMAIN timezone AS TEXT
CHECK ( is_timezone( value ) );
</pre>

<p>Enjoy!</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/databases/postgresql/reducing_view_calculations.html">
    <title>Need Help Reducing View Calculations</title>
    <link>http://justatheory.com/computers/databases/postgresql/reducing_view_calculations.html</link>
    <description></description>
    <dc:subject>/computers/databases/postgresql</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2007-11-07T05:22-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>I could use some advice and suggestions for how to solve a performance
problem due to the highly redundant calculation of values in a view. Sorry for
the longish explanation. I wanted to make sure that I omitted no details in
desribing the problem.</p>

<p>In order to support recurring events in an application I&#x2019;m working on, we
have a lookup table that maps dates to their daily, weekly, monthly, and
yearly recurrences. It looks something like this:</p>

<pre>
try=# \d recurrence_dates
   Table &quot;public.recurrence_dates&quot;
   Column   |    Type    | Modifiers 
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 date       | date       | not null
 recurrence | recurrence | not null
 next_date  | date       | not null
Indexes:
    &quot;recurrence_dates_pkey&quot; PRIMARY KEY, btree (date, recurrence, next_date)
    &quot;index_recurrence_dates_on_date_and_recurrence&quot; btree (date, recurrence)

try=# select * from recurrence_dates
try-# where date = &#x0027;2007-11-04&#x0027;
try-# order by recurrence, next_date;
    date    | recurrence | next_date  
&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;+&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;&#x002d;
 2007-11-04 | daily      | 2007-11-04
 2007-11-04 | daily      | 2007-11-05
 2007-11-04 | daily      | 2007-11-06
 2007-11-04 | weekly     | 2007-11-04
 2007-11-04 | weekly     | 2007-11-11
 2007-11-04 | weekly     | 2007-11-18
 2007-11-04 | monthly    | 2007-11-04
 2007-11-04 | monthly    | 2007-12-04
 2007-11-04 | monthly    | 2008-01-04
 2007-11-04 | annually   | 2007-11-04
 2007-11-04 | annually   | 2008-11-04
 2007-11-04 | annually   | 2009-11-04
 2007-11-04 | none       | 2007-11-04
</pre>

<p>To get all of the permutations of recurring events, we simply select from a
view rather than from the <code>events</code> table that contains the actual
event data. The view joins <code>events</code>
to <code>recurrence_dates</code> table like so:</p>

<pre>
CREATE OR REPLACE VIEW recurring_events AS
SELECT id, name, user_id, duration,
       (rd.next_date || &#x0027; &#x0027; ||
       (starts_at::timestamptz at time zone start_tz)::time)::timestamp
       at time zone start_tz AS starts_at,
       start_tz
  FROM events LEFT JOIN recurrence_dates rd
    ON (events.starts_at::timestamptz at time zone events.start_tz)::date = rd.date
   AND events.recurrence = rd.recurrence;
</pre>

<p>Then, to get all of the recurrences of events for a user within a week, we
do something like this in the client code:</p>

<pre>
SELECT *
  FROM recurring_events
 WHERE user_id = 2
   AND starts_at BETWEEN &#x0027;2007-11-04 07:00:00&#x0027; AND &#x0027;2007-11-10 07:59:59&#x0027;;
</pre>

<p>This works perfectly, as all of our dates and times are stored in UTC
in <code>timestamp</code> columns. We pass UTC times for the appropriate
offset to the query (Pacific Time in this example) and, because the view does
the right thing in mapping the <code>starts_at</code> time for each event to
its proper time zone, we get all of the events within the date range, even if
they are recurrences of an earlier event, and with their times properly
set.</p>

<p>The trouble we&#x2019;re having, however, is all of those conversions. Until last
week, the view just kept everything in UTC and left it to the client to
convert to the proper zone in the <code>start_tz</code> column. But that
didn&#x2019;t work so well when an event&#x2019;s <code>starts_at</code> was during daylight
savings time and recurrences were in standard time: the standard time
recurrences were all an hour off! So I added the repeated instances
of <code>events.starts_at::timestamptz at time zone events.start_tz</code>.
But now the view is <em>really</em> slow.</p>

<p>Since the only thing that has changed is the addition of the time zone
conversions, I believe that the performance penalty is because of them. The
calculation executes multiple times per row: once for the join and once again
for the <code>starts_at</code> column. We can have an awful lot of events for
a given user, and an awful lot of recurrences of a given event. If, for
example, an event recurs daily for 2 years, there will be around 730 rows for
that one event. And the calculation has to be executed for every one of them
before the <code>WHERE</code> clause can be properly evaluated. Ouch! Worse
still, we actually have <em>three</em> columns that do this in our
application, not just one as in the example here.</p>

<p>So what I need is a way to execute that calculation just once for each
row in the <code>events</code> table, rather than once for each row in the
<code>recurring_events</code> view. I figure 1 calculation will be a heck of a
lot faster than 730! So the question is, how do I do this? How do I get the
view to execute the conversion of the <code>starts_at</code> to
the <code>start_tz</code> time zone only once for each row
in <code>events</code>, regardless of how many rows it ends up generating in
the <code>recurring_events</code> view?</p>

<p>Suggestions warmly welcomed. This is a bit of a tickler for me, and
since the query performance on these views is killing us, I need to get
this adjusted post haste!</p>

<p>Meanwhile, tomorrow I&#x2019;ll post a cool hack I came up with for validating
time zones in the database. Something to look forward to as you ponder my
little puzzle, eh?</p>

<p><strong>Update 2008-01-30:</strong> <em>Thanks to help from depesz, I came
figured out what the underling problem was and solved it much more elegantly
using PL/pgSQL. I&#x2019;ve now <a
href="/computers/databases/postgresql/recurring_events.html" title="How to
Generate Recurring Events in the Database">written up the basic recipe.
Enjoy!</em></p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/os/macosx/ical_invite_file_location.html">
    <title>Where iCal Keeps Invitations</title>
    <link>http://justatheory.com/computers/os/macosx/ical_invite_file_location.html</link>
    <description></description>
    <dc:subject>/computers/os/macosx</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2007-09-14T17:57-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>I was fiddling with iCalendar invitations yesterday, trying to
get <a href="http://www.iwantsandy.com/" title="Meet Sandy: Your Personal
Email Assistant">Sandy</a>&#x2019;s .ics files to import into Outlook. I got that
figured out (yes!), but in the meantime iCal started crashing on me. I was
reasonable sure that it was due to a bogus invitation file, but could not for
the life of me figure out where iCal was keeping such files. It just kept
crashing on me as second or so after starting up, every time.</p>

<p>I finally figured it out by quitting all my apps, moving all of the folders
in <em>~/Library</em> to a temporary folder, and firing up iCal to see what
folds it would create. And there it
was: <em>~/Library/Caches/com.apple.iCal</em>. I quit iCal, deleted the new
folders in <em>~/Library</em>, moved the originals back, and looked inside the
iCal caches folder to find a bunch of invitation files in
the <em>incoming</em> folder. I deleted them all and iCal fired up again
without a hitch. W00t!</p>

<p>So if you&#x2019;re having problems with iCal crashing and have a few invitations
in it and you&#x2019;re wondering how to get iCal to ignore them, just quit iCal,
delete all of the files
in <em>/Users/yourusername/Library/Caches/com.apple.iCal/incoming</em>, and
start iCal back up again.</p>

<p>And now I&#x2019;ll be able to find this information again when next I need it.
:-)</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/programming/ruby/time_zone_bug.html">
    <title>Ruby Time Object Time Zone Bug</title>
    <link>http://justatheory.com/computers/programming/ruby/time_zone_bug.html</link>
    <description></description>
    <dc:subject>/computers/programming/ruby</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2007-08-29T19:06-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p><a href="http://rubyforge.org/tracker/?func=detail&amp;atid=1698&amp;aid=6368&amp;group_id=426" title="[ ruby-Bugs-6368 ] Time Changes Zones">This is disappointing</a>.</p>

<p>To summarize, Ruby&#x2019;s <code>Time</code> class has a bug in its <code>zone</code> method. The example is simple:</p>

<pre>
tz = ENV[&#x0027;TZ&#x0027;]
ENV[&#x0027;TZ&#x0027;] = &#x0027;Africa/Luanda&#x0027;
t = Time.now
puts t.zone
ENV[&#x0027;TZ&#x0027;] = &#x0027;Australia/Lord_Howe&#x0027;
puts t.zone
</pre>

<p>This outputs:</p>

<pre>
WAT
WAT
</pre>

<p>So far so good. But look what happens when I add a single line to the program, <code>foo = t.to_s</code>:</p>

<pre>
tz = ENV[&#x0027;TZ&#x0027;]
ENV[&#x0027;TZ&#x0027;] = &#x0027;Africa/Luanda&#x0027;
t = Time.now
puts t.zone
ENV[&#x0027;TZ&#x0027;] = &#x0027;Australia/Lord_Howe&#x0027;
foo = t.to_s
puts t.zone
</pre>

<p>The result changes:</p>

<pre>
WAT
LHST
</pre>

<p>This is clearly wrong. Changing the <code>$TZ</code> environment variable and stringifying the object should not change the underlying value of any of the object&#x2019;s attributes. The <code>Time</code> object should remember the value of the time zone when it is initialized, and should never change.</p>

<p>Unfortunately, The Ruby core developers (or at least the owner of the bug report) feel that, since <code>Time</code> relies on the system C API, and since time zones are a PITA, it&#x2019;s not worth it to change this behavior. The downside, however, is that you cannot rely on <code>Time</code> zones to ever be correct unless you&#x2019;re very, very careful.</p>

<p>Personally, in my subclass of <code>Time</code>, I took care of stashing the time zone at object instantiation as a workaround for this bug. It seemed reasonable to me, and I was just surprised that the idea was rejected by the Ruby developers.</p>

<p>What do you think?</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/internet/weblogs/new_comment_policy.html">
    <title>New Just a Theory Blog Policy: Limited Comment Period</title>
    <link>http://justatheory.com/computers/internet/weblogs/new_comment_policy.html</link>
    <description></description>
    <dc:subject>/computers/internet/weblogs</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2007-07-21T22:12-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>I&#x2019;ve had an open policy on comments on this blog since it started. A couple years ago, I added a timeout on trackback pings, so that you can&#x2019;t trackback ping a posting more than two weeks after I wrote it. But I left manual comments in, along with the simple math bit, since comments and spam have been low volume.</p>

<p>Curiously, though, although this is not a popular blog, and I&#x2019;ve posted to it all of twice in the last six months, I&#x2019;ve been getting a lot more comment spam in the last few weeks. I&#x2019;ve been having to manually delete upwards of 100 spam comments a day. Well, I&#x2019;m bored with that. So I hereby announce a new comment policy: You can comment on a blog post for up to two weeks after I post it. After that, the comment period will be over. I&#x2019;m sorry to have to do this, and maybe it will change if I ever switch to Word Press or something, but for now, I think it will do.</p>

<p>The vast majority of non-spam comments I get on any particular post after two weeks or so is a request for support. So I don&#x2019;t think that the new policy will hamper anyone much, and for those looking for support, well, this is not the appropriate forum. But if you <em>do</em> feel compelled to comment on something after the comment period, just email your comment to me and I&#x2019;ll add it in as I deem appropriate.</p>

<p>Thanks for understanding. I really appreciate getting this time back every day. And, of course, if you&#x2019;d like to respond to this new policy in any way, well, you have two weeks to leave a comment on this post. ;-)</p>
]]></content:encoded>
  </item>

  <item rdf:about="http://justatheory.com/computers/programming/ruby/array_to_hash_one_liner.html">
    <title>Array to Hash One-Liner</title>
    <link>http://justatheory.com/computers/programming/ruby/array_to_hash_one_liner.html</link>
    <description></description>
    <dc:subject>/computers/programming/ruby</dc:subject>
    <dc:creator>David Wheeler</dc:creator>
    <dc:date>2007-07-19T18:08-08:00</dc:date>
    
    <cc:license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0" />
    <content:encoded><![CDATA[<p>Programming in Ruby, I&#x2019;ve badly missed Perl&#x2019;s list syntax, which, among other things, makes converting between arrays and hashes really easy. In Ruby I have forever been converting an array to a hash like this:</p>

<pre>
a = [ 1, 2, 3, 4, 5 ]
h = {}
a.each { |v| h[v] = v }
</pre>

<p>Of course, this is anything bug concise. In Perl, I can just do this:</p>

<pre>
my @a = (1, 2, 3, 4, 5, 6);
my %h = map { $_ => $_ } @a;
</pre>

<p>Easy, huh? Well, I finally got fed up with the nasty hack in Ruby, did a little Googling, and figured out a way to do it in a single line:</p>

<pre>
a = [ 1, 2, 3, 4, 5, 6 ]
h = Hash[ *a.collect { |v| [ v, v ] }.flatten ]
</pre>

<p>Not quite as consice as the Perl version, and I have to construct a bunch of arrays that I then throw away with the call to <code>flatten</code>, but at least it&#x2019;s concise and, I think, clearer what it&#x2019;s doing. So I think I&#x2019;ll go with that.</p>
]]></content:encoded>
  </item>

  <cc:License rdf:about="http://creativecommons.org/licenses/by-nc/2.0/">
    <cc:permits rdf:resource="http://web.resource.org/cc/Reproduction" />
    <cc:permits rdf:resource="http://web.resource.org/cc/Distribution" />
    <cc:requires rdf:resource="http://web.resource.org/cc/Notice" />
    <cc:requires rdf:resource="http://web.resource.org/cc/Attribution" />
    <cc:prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
    <cc:permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" />
  </cc:License>
</rdf:RDF>
