Moving Towards Bricolage 2.0

Today I’ve finished just over two and a half weeks of hacking on Bricolage. It has been a couple of years since I gave it much attention, but other people have contributed so much good stuff that, since I had a little time, it seemed worth giving it some love. So here’s a quick list of all that I’ve done in that time:

  • Fixed all reported issues with Bricolage 1.10. Scott Lanning kindly released 1.10.5 yesterday with all of those fixes.

  • I integrated the element occurrence branch that Christian Muise had worked on as his 2006 Google Summer of Code project. Christian’s project added support for maximum and minimum specifications for subelements in Bricolage, which allows administrators to define how many fields and elements can occur in a story or media document. All I had to do was add a few UI tweaks to support the new fields and their specification in the story profile, and all was ready to go. Oh, and I did have to go back and make the SOAP interface work with the feature, but the only reason it never did was lazy hacking of the SOAP interface (way before Christian’s time). Nice work, Christian, and thank you for your contribution!

  • I fixed a few bugs with Arsu Andrei’s port of Bricolage to MySQL, which was his 2006 Google Summer of Code project. Arsu did a terrific job with the port, with only a few minor things missed that he likely could not have caught anyway. This work had already been merged into the trunk. Thanks Arsu!

  • I fixed a bunch of bugs from Marshall Roch’s AJAXification of Bricolage, carried out during his 2006 Google Summer of Code project. Marshall actually did a lot more stuff than he’d planned, as it all went quite smoothly. I found only a few minor oversights that I was able to easily address. This work represents the single most visible change to how users use Bricolage since we launched the project back in 2001. Editing stories, in particular, is now a lot cleaner, with far fewer page loads. Thanks a million, Marshall!

  • I completed the work started by Chris Heiland of the University of Washington, Bothell, and Scott Lanning of the World Health Organization to port Bricolage to Apache 2. They really did most of the hard work, and I just spent several days integrating everything, making sure all the features work, and updating the installer to handle differences in configuration. I thought this would take me a day or two, but it actually took the better part of a week! So much has changed, but in truth Bricolage is now better for running on mod_perl 2. Expect to see Apache 2 be the recommended platform for Bricolage in a release in the near future.

  • I integrated a number of patches from Brian Smith of Gossamer Threads to allow the installer to be run as a non-root user. The key questions are whether the installer has to become the database super user, which is required for ident authentication, and of course whether files are to be installed somewhere on the system that requires super user access. This work is not done yet, as make upgrade and make uninstall are not quite there. But we’re getting there, and it should be all done in time for 2.0, thanks to Brian.

  • I added support for a whole slew of environment variables to the installer. Now you can set environment variables to override default settings for installation parameters, such as choice of RDBMS, Apache, location of an SSL cert and key, whether to support SSL, and lots of other stuff besides. This is all documented in the Quick Installation Instructions section of Bric::Admin/INSTALL.

  • I fully tested and fixed a lot of bugs left over from making the installer database- and Apache-neutral. Now all of these commands should work perfectly:

    • make
    • make cpan
    • make test
    • make install
    • make devtest
    • make clone
    • make uninstall
  • I improved the DHTML functionality of the Add More widget, which is used to add contact information to users and contributors, rules to alert types, and extensions to media types. I think it’s pretty slick, now! This was built on Marshall’s AJAX work.

All of these changes have been integrated into the Bricolage trunk and I’ve pushed out a developer release today. Please do check out all the goodness on a test box and send feedback or file bug reports! There are only a couple of other features waiting to go into Bricolage before we start the release candidate process. And, oh yeah, the title of this blog post? It’s not a lie. The next production release of Bricolage, based on all this work, will be Bricolage 2.0. Enough of the features we’d planned for Bricolage lo these many years ago are in the trunk that the new version number is warranted. I for one will be thrilled to see 2.0 ship in time for OSCON.

And in case it isn’t already clear, many thanks to the Google Summer of Code and participating students for the great contributions! This release would not have been possible without them.

Also in the news today, the Bricolage server has been replaced! The new server, which hosts the Web site, the wiki, and the instance of Bricolage used to manage the site itself, is graciously provided by the kind folks at Gossamer Threads. The server is Gossamer Threads’s way of giving back to the Bricolage community as they prepare to launch a hosted Bricolage solution. Thanks, GT!

The old Bricolage server was provided by pair Networks for the last five years. I’d just like to thank pair for the generous five-year loan of that box, which helped provide infrastructure for both Bricolage and Kineticode. Thank you, pair!

And with that, I’m going heads-down on some other projects. I’ll pop back up to make sure that Bricolage 2.0 is released in a few months, but otherwise, I’m on to other things again for a while. Watch this space for details!

How to Globally Change a Subversion Username

I successfully migrated the Kineticode Subversion repository to a new server yesterday. Everything works great. But after my first commit, I realized that, while my username on the old server was theory, on the new server it’s david. Subversion works fine, of course, and I was able to start committing from old checkouts using the new username, but I realized that sites like Ohloh would pick up the two usernames as separate usernames. So I wanted to update all of the 3630 existing revisions that were mine to use the new username.

Unfortunately, I couldn’t find much on how to do this in a quick Googling. But I quickly figured out that what I needed to do was to svnadmin dump my repository, modify the dump, and then load it again. The Subversion dump format has all these fields for tracking the content lengths of various records and properties, so doing the update was a bit tricky. But I wrote the script here to track things, and it worked great for me. So here it is for others to reference and use.

#!/usr/bin/perl -w

use strict;
use warnings;

while (<>) {
    print;
    next unless /^Revision-number:\s+\d+$/;

    # Grab the content lengths. Examples:
    # Prop-content-length: 139
    # Content-length: 139
    my $plen_line = <>;
    my $clen_line = <>;

    unless ( $plen_line =~ /^Prop-content-length:\s+\d+$/ ) {
        # Nothing we want to change.
        print $plen_line, $clen_line;
        next;
    }

    my @lines;
    while ( <> ) {
        if ( /^PROPS-END$/ ) {
            # finish.
            print $plen_line, $clen_line, @lines, $_;
            last;
        }

        push @lines, $_;

        if ( /^svn:author$/ ) {
            # Grab the author content length. Example:
            # V 6
            my $alen_line = <>;

            # Grab the author name.
            my $auth = <>;

            if ( $auth =~ s/^theory$/david/ ) {
                # Adjust the content lengths.
                for my $line ( $plen_line, $clen_line, $alen_line ) {
                    $line =~ s/(\d+)$/$1 - 1/e;
                }
            }
            print $plen_line, $clen_line, @lines, $alen_line, $auth;
            last;
        }
    }
}

To use it, save it to a file, say svn_author, then change line 40 to your old and new usernames. Then, on line 43, change the $1 - 1 bit to be correct for the difference between the usernames you’re changing. For example, if you’re changing your username from, say, shane to chromatic, the new name is four characters longer (nine characters versus five), so you’d make it $1 + 4.

Now, run it like so:

svnadmin dump /path/to/svnroot > svndump.out
perl svn_author svndump.out > svndump.in
svnadmin create /path/to/new/svnroot
svnadmin load /path/to/new/svnroot < svndump.in

And that’s it! Feel free to take this code and do with it what you like, including fix any bugs, add command-line options, support changing multiple authors at once, or whatever. Share and enjoy.
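One small improvement along those lines is to compute the length adjustment from the two usernames rather than editing it by hand. What follows is only a sketch, not part of the script above: the adjust_length helper and the example usernames are made up for illustration, and it assumes plain ASCII usernames, since the lengths in a dump file are byte counts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical example values; substitute your own usernames.
my ($old, $new) = ('shane', 'chromatic');

# Dump-file lengths count bytes, so this assumes plain ASCII usernames.
my $delta = length($new) - length($old);

# Add $delta to the number at the end of a length line such as
# "Prop-content-length: 139\n", "Content-length: 139\n", or "V 5\n".
# The trailing newline is left in place.
sub adjust_length {
    my ($line, $delta) = @_;
    $line =~ s/(\d+)$/$1 + $delta/e;
    return $line;
}

print adjust_length("Prop-content-length: 139\n", $delta);
print adjust_length("V 5\n", $delta);
```

Run as-is, this prints Prop-content-length: 143 and V 9. In the script above, the same idea would replace the hard-coded $1 - 1 with $1 + $delta.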

SVN::Notify 2.70: Output Filtering and Character Encoding

I’m very pleased to announce the release of SVN::Notify 2.70. You can see an example of its colordiff output here. This is a major release that I’ve spent the last several weeks polishing and tweaking to get just right. There are quite a few changes, but the two most important are improved character encoding support and output filtering.

Improved Character Encoding Support

I’ve had a number of bug reports regarding issues with character encodings. Particularly for folks working in Europe and Asia, but really for anyone using multibyte characters in their source code and log messages (and we all do nowadays, don’t we?), it has been difficult to find the proper incantation to get SVN::Notify to convert data from and to their proper encodings. Using a patch from Toshikazu Kinkoh as a starting-point, and with a lot of reading and experimentation, as well as regular and patient tests on Toshikazu’s and Martin Lindhe’s production systems, I think I’ve finally got it nailed down.

Now you can use the --encoding (formerly --charset), --svn-encoding, and --diff-encoding options—as well as --language—to get SVN::Notify to do the right thing. As long as your Subversion server’s OS supports an appropriate locale, you should be golden (mine is old, with no UTF-8 locales :\). And if all else fails, you can still set the $LANG environment variable before executing svnnotify.

There is actually a fair bit to know about encodings to get it to work properly, but if you use UTF-8 throughout and your OS supports UTF-8 locales, you shouldn’t have to do anything. You might have to set --language in order to get it to use the proper locale. See the new documentation of the encoding support for all the details. And if you still have problems, please do let me know.

Output Filtering

Much sexier is the addition of output filtering in SVN::Notify 2.70. I got pretty tired of getting feature requests for what are essentially formatting modifications, such as this one requesting KDE-style keyword support. I myself was using Trac wiki syntax in commit messages on a recent project and wanted to see them converted to HTML for messages output by SVN::Notify::HTML::ColorDiff.

So I finally sat down and gave some thought to how to implement a simple plugin architecture for SVN::Notify. When I realized that it was generally just formatting that people wanted, it became simpler: I just needed a way to allow folks to write simple output filters. The solution I came up with was to just use Perl. Output filters are simply subroutines named for the kind of output they filter. They live in Perl packages. That’s it.

For example, say that your developers write their commit log messages in Textile, and rather than receive them stuck inside <pre> tags, you’d like them converted to HTML. It’s simple. Just put this code in a Perl module file:

package SVN::Notify::Filter::Textile;
use strict;
use warnings;
use Text::Textile ();

# Convert Textile log messages to HTML, but only for HTML output.
sub log_message {
    my ($notifier, $lines) = @_;
    return $lines unless $notifier->content_type eq 'text/html';
    return [ Text::Textile->new->process( join $/, @$lines ) ];
}

1;

Put the file, SVN/Notify/Filter/Textile.pm, somewhere in a Perl library directory. Then use the new --filter option to svnnotify to put it to work:

svnnotify -p "$1" -r "$2" --handler HTML::ColorDiff --filter Textile

Yep, that’s it! SVN::Notify will find the filter module, load it, register its filtering subroutine, and then call it at the appropriate time. Of course, there are a lot of things you can filter; consult the complete documentation for all of the details. But hopefully this gives you a flavor for how easy it is to write new filters for SVN::Notify. I’m hoping that all those folks who want features can now stop bugging me and start writing their own filters to do the job, and uploading them to CPAN for all to share!
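For a second illustration that needs no extra modules, here is a minimal sketch of a filter that tidies up log messages. The package name SVN::Notify::Filter::TrimLog is hypothetical. The sketch assumes, as the Textile example does, that a log_message filter receives the notifier object and an array reference of lines and returns an array reference.

```perl
package SVN::Notify::Filter::TrimLog;   # hypothetical name, for illustration
use strict;
use warnings;

# Strip trailing whitespace from each log message line, then drop
# any blank lines left at the end of the message.
sub log_message {
    my ($notifier, $lines) = @_;
    my @clean = map { my $l = $_; $l =~ s/\s+$//; $l } @$lines;
    pop @clean while @clean && $clean[-1] eq '';
    return \@clean;
}

1;
```

Saved as SVN/Notify/Filter/TrimLog.pm in a Perl library directory, it would be enabled with --filter TrimLog, just like the Textile filter.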

To get things started, I scratched my own itch, writing a Trac filter myself. The filter is almost as simple as the Textile example above, but I also spent quite a bit of time tweaking the CSS so that most of the Trac-generated HTML looks good. You can see an example right here. Thanks to a number of bug fixes in Text::Trac, as well as Trac-specific CSS added via a filter on CSS output, it works beautifully. If I’m feeling motivated in the next week or so, I’ll create a separate CPAN distribution with just a Markdown filter and upload it. That will create a nice distribution example for folks to copy to create their own. Or maybe someone on the Lazy Web will do it for me! Maybe you?

I wish I’d thought to do this from the beginning; it would have saved me from having to add so many features/cruft to SVN::Notify over the years. Here’s a quick list of the features that likely could have been implemented via filters instead of added to the core:

  • --user-domain: Combine the SVN username with a domain for the From header.
  • --add-header: Add a header to the message.
  • --reply-to: Add a specific header to the message.
  • SVN::Notify::HTML::ColorDiff: Frankly, looking back on it, I don’t know why I didn’t just put this support right into SVN::Notify::HTML. But even if I hadn’t, it could have been implemented via filters.
  • --subject-prefix: Modify the message subject.
  • --subject-cx: Add the commit context to the subject.
  • --strip-cx-regex: More subject context modification.
  • --no-first-line: Another subject filter.
  • --max-sub-length: Yet another!
  • --max-diff-length: A filter could truncate the diff, although this might be tricky with the HTML formatting.
  • --author-url: Modify the metadata section to add a link to the author URL.
  • --revision-url: Ditto for the revision URL.
  • --ticket-map: Filter the log message for various ticketing system strings to convert to URLs. This also encompasses the old --rt-url, --bugzilla-url, --gnats-url, and --jira-url options.
  • --header: Filter the beginning of the message.
  • --footer: Filter the end of the message.
  • --linkize: Filter the log message to convert URLs to links for HTML messages.
  • --css-url: Filter the CSS to modify it, or filter the start of the HTML to add a link to an external CSS URL.
  • --wrap-log: Reformat the log message for HTML.

Yes, really! That’s about half the functionality right there. I’m glad that I won’t have to add any more like that; filters are a much better way to go.

So download it, install it, write some filters, get your multibyte characters output properly, and enjoy! And as usual, send me your bug reports, but implement your own improvements using filters!

Mac OS X CD-ROM File Systems WTF?

Didn’t it use to be the case that when you used the Mac OS X Finder to burn a CD-ROM, you could then mount that CD-ROM on a Windows box? In the last few months, I’m suddenly finding that this is no longer the case. So now I have to use hdiutil to convert a .dmg file to the Joliet and ISO9660 file systems:

hdiutil makehybrid -o image.iso -joliet -iso image.dmg

And then I can burn a CD readable on Windows. What the fuck? I burned three CDs that were then useless to me before I finally dug up this hint. And I had this problem with CDs burned by Tiger, too, last summer, so it’s not just Leopard. It seems to me that Mac OS X should always default to building a hybrid CD that’s then readable by Windows, Linux, and everything else. Why doesn’t it?

Need Suggestions for IMAP Solution and Migration

For the last several years, I’ve run a Courier-IMAP mail server for all of the mail for this site, Kineticode, Strongrrl and other domains. We mainly used Mail.app on Mac OS X to communicate with the server, and it worked really well. Today, Julie has over 3 GB of mail data, and I have around 1.5 GB, all managed via IMAP.

Recently, I decided it was time to move the mail elsewhere. I’ve been meaning to do it for a while, primarily because the server I was using is now used for the Bricolage project, and because I never set up any spam filtering. Julie was suddenly getting 100s of spam messages in her inbox. (It really didn’t help that she was still using Panther.) So on the advice of a good friend who had been evaluating various mail services (and who for now shall go nameless and therefore blameless), I moved all of our mail to FuseMail.

At first this seemed like a pretty good solution. Our spam rates went way down, I could set up unlimited mail lists, aliases, and forwards, and there was a migration tool that automated moving all of our existing mail from the old IMAP server to the new one. There were some glitches with the migration tool, but in the end all of our mail was moved and intact.

But that’s when I started to notice the issues. To summarize:

  • Mail put into the Sent Items folder by Mail.app was marked as unread. This didn’t happen on the old server, and apparently has something to do with how FuseMail names the sent folder: Sent Items rather than Sent Messages.
  • Mail.app is syncing constantly. Even once it had successfully synced all of our email in all of our IMAP folders (which took days), it is syncing all the time, to the extent that I am sometimes waiting for up to a minute to read a message when I double-click it, because there are all these other threads doing stuff and taking up all the resources. It can take several minutes for mail I’m sending to be sent (though that might be a delay in Mail.app copying the message to the Sent Items folder rather than the actual sending).
  • Deleting mail takes forever! This is probably the same issue as the syncing problem, but when I delete 1000s of messages from my Junk mail folder, it runs forever, and all other activities are delayed even further. It turns out to be much more efficient to empty the Junk and Deleted Items folders using the webmail interface. And even then, Mail.app can take a while to delete locally-cached items from the folder when it syncs.
  • Suddenly, Julie is getting a lot less spam. She went from several hundred messages showing up in her Junk mailbox a few days ago to just five on Friday and two yesterday (one of which was a false positive). As she had been expecting a message from someone that she never got, this naturally made her very suspicious. Where is all the spam? Is she getting all of her mail?
  • Since FuseMail uses a mailbox named Sent Items instead of the traditional Sent Messages for all sent mail, I asked if they could move the 1.8 GB of messages from Julie’s Sent Messages to their Sent Items, since Mail.app would just choke on such a task. Though my request was escalated to the FuseMail developers, the answer came back no. Which I guess means that they’re not using Maildir, because in that case it would be a cinch, n’est-ce pas?
  • Backups are not really feasible. Of course FuseMail has its own backup regimen, but if I ever want to move elsewhere or deal with some sort of catastrophic failure, I want my own backups. There is no rsync service available for this (remember: no Maildir), so I have to use the IMAP interface. I’ve been trying for the past two weeks to get OfflineIMAP to back up all of Julie’s and my mail, but it keeps choking. It gets a little further every time I run it; eventually it will get it all. But this only allows me to back up those accounts for which I happen to have a password. I have accounts set up for a few other users, but don’t have access to their passwords, so I can’t back them up. This does not make for very good support for corporate backup and retention policies.
  • Mail forwarded by FuseMail has its Return-Path header modified. This made RT break until I hacked it to ignore that header (which is, by default, its preferred header for identifying senders).

So I’m pretty fed up. It took me a week to get all of our mail on FuseMail, and now I’m looking at moving it off again (once OfflineIMAP finishes a full sync). Grr. I’m considering finding a virtual host somewhere and setting up my own IMAP server again, but then I have the spam problem again. So then I could use a forwarding service like Pobox, or I can set up my own spam filtering (something I had hoped never to get into managing myself). My old IMAP server required very little maintenance, which was nice, but then the spam filtering stuff always seemed daunting. Don’t you have to update things all the time?

But before I go off and do something else, and unlike before I moved to FuseMail, I wanted to get an idea of what other folks are doing. Do you use IMAP? Do you use it to manage a shitload (read: gigabytes) of mail? Do you get very little spam and still get all of your valid mail? Are IMAP folder maintenance actions fast for you (in Mail.app in particular)? Are you paying a not-unreasonable amount of money for your setup? If you answered yes to all of these questions, please, for the love of all that is good in this world, tell me how you do it. I’m looking for something that I don’t have to work very hard to maintain (hence my original attempt to have some company that specializes in this stuff do it), but I’ll do what I have to to make this thing right. So how do you make it right? And if I have to run my own server, where should I host it that won’t cost me an arm and a leg?

Thanks for your help!
