Just a Theory

By David E. Wheeler

Howard Fineman's “Analysis”

My father-in-law, Steven, sent me this link to a Newsweek column by Howard Fineman. Like Steven, I thought it very interesting that a conservative columnist would be basically saying that the election is all but over for Bush, given the past week’s news. But the funny thing is, I didn’t know that Fineman was conservative until I read that column. What gave it away?

It was this snippet:

On one level, Kerry’s “position” is a contradictory bundle of confusion. He says the war was a mistake, but he’s the guy calling for a gung-ho strategy in Fallujah to root out terrorist nests. As the president has pointed out, Kerry is claiming he can win the support of allies even as he dismisses the contributions of existing ones and calls the entire war a diversion–and even as France and Germany already have said that they aren’t going to rally to our side if Kerry wins. But if the situation in Iraq continues to deteriorate, Kerry’s “vision”–or lack of it–matters less.

This seems typical of conservative commentary–it’s a very selective description of Kerry’s position. Yes, Kerry says that the war was a mistake, but now that we’re in it, we need to do it right, including getting tough on rooting out the terrorists (who, by the way, only came into the country after the war started). Kerry has not dismissed the contributions of existing allies, but has pointed out that, unlike Desert Storm, this coalition is far from evenly divided. As Edwards repeatedly said during the Veep debate, the US bears 90% of the cost among the coalition members, both in terms of dollars and in terms of lives. There is no contradiction in these statements. The contradiction only comes up if they’re used selectively and outside of appropriate contexts.

I am so sick of this hypocrisy! I keep telling people, I can’t wait to be disappointed in Kerry’s presidency, as I was with Clinton’s. I’ll take disappointment over being offended by the President and his apologists any day!

Looking for the comments? Try the old layout.

Bush Uses Radio Receiver During Debate?

HOLY SHIT!

According to a story in Salon.com, it appears that George W. Bush may well have been wearing a radio receiver during the first debate. This would be so that he could get prompts from someone more knowledgeable (Dick Cheney?).

Current Electoral Vote Predictor (which currently shows Kerry leading 280 to 239!) has confirmed the presence of “the bulge” with this image, using Red Hawk image intensification software.

My favorite phrase from the Salon.com article: the “Milli Vanilli president.”

Compiling libreadline on Mac OS X

I just realized that I never posted my recipe for configuring and installing libreadline on Mac OS X. I need it for use with PostgreSQL, and don’t fully understand why Apple has not yet included it with Mac OS X. Maybe it’ll be in Tiger?

In the meantime, it turns out to be pretty easy to configure it and build it yourself, assuming you have the developer tools (“Xcode”) installed. The only thing that’s different from any other Unix is that the support/shobj-conf script must be modified to be able to find other libraries installed on Mac OS X. Here’s a shell script I whipped up that can do the whole thing for you, soup to nuts.

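#!/bin/sh
# Download, patch, build, and install GNU readline on Mac OS X.
set -e  # Stop at the first command that fails.
export VERSION=4.3
curl -O ftp://ftp.gnu.org/pub/gnu/readline/readline-$VERSION.tar.gz
tar zxvf readline-$VERSION.tar.gz
cd readline-$VERSION
# Point the shared-library build at the libraries Mac OS X provides.
perl -i.bak -p -e \
  "s/SHLIB_LIBS=.*/SHLIB_LIBS='-lSystem -lncurses -lcc_dynamic'/g" \
  support/shobj-conf
./configure
make
sudo make install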

Hope that this helps others!

More about…

SVN::Notify 2.10 Generalizes Behavior

It’s all Autrijus’ fault.

As I mentioned last week when I released SVN::Notify 2.0, Autrijus has suggested using SVN::Notify as the base class for modules that do other things, such as send instant messages or update a checkout for backup purposes. Instantly seeing the value in this, I further realized that I could greatly simplify the support for HTML notification emails by moving the HTML-specific code to a subclass and then just let polymorphism do the work.

The result is SVN::Notify 2.10. To simplify the move to a subclass for the HTML notifications, I broke up the old send() method into a large number of other methods that affect various parts of the composition of the email, such as headers, starting the message, outputting the log message, the file list, and outputting or attaching the diff. Then I just overrode the few methods that need different behavior in the subclass, and it all worked!

As I worked on it, I also realized that I was following the same principles that Ovid has written about with regard to the use of if. I was able to remove quite a few if statements by moving HTML to a subclass. Of course, there are still some to enable diffs to be either included in an email or attached, but I didn’t want to split things up too much, or I’d have a geometric explosion of subclasses!
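To make the pattern concrete, here is a minimal sketch of it. The package names Sketch::Notify and Sketch::Notify::HTML and the methods start_body() and output_log_message() are made up for illustration; they are not meant to mirror SVN::Notify's actual method names.

```perl
use strict;
use warnings;

# Hypothetical base class: execute() just calls small, overridable steps.
package Sketch::Notify;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Each step returns one piece of the message.
sub start_body { return "" }

sub output_log_message {
    my $self = shift;
    return "Log Message:\n$self->{message}\n";
}

sub execute {
    my $self = shift;
    return join "", $self->start_body, $self->output_log_message;
}

# Hypothetical subclass: override only the steps that differ for HTML.
# (A real subclass would also escape the log message.)
package Sketch::Notify::HTML;
our @ISA = ('Sketch::Notify');

sub start_body { return "<body>\n" }

sub output_log_message {
    my $self = shift;
    return "<h3>Log Message</h3>\n<pre>$self->{message}</pre>\n";
}

package main;

my $plain = Sketch::Notify->new(message => "Fixed a typo")->execute;
my $html  = Sketch::Notify::HTML->new(message => "Fixed a typo")->execute;
print $plain, $html;
```

Note that execute() contains no if statement that asks about the format; the class of the object decides which steps run.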

The svnnotify script, in the meantime, remains largely unmodified. The only change is the deprecation of the --format option in favor of a new option, --handler. Use this option to specify what subclass of SVN::Notify should handle the notification. So far, there’s just one, --handler HTML, but I’m sure that Autrijus will soon add --handler Jabber, and I’d like to add --handler HTML::ColorDiff, myself. I might have to move the processing of command-line arguments out of svnnotify and into SVN::Notify, instead, so that subclasses can add new options. We’ll see what comes up.
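One way an option like this could pick a subclass is to append its value to the base class name. The sketch below assumes that approach and uses stand-in packages, Sketch::Handler and Sketch::Handler::HTML, defined inline; it is not a copy of how svnnotify actually resolves the option.

```perl
use strict;
use warnings;

package Sketch::Handler;            # hypothetical base class
sub new  { my ($class, %p) = @_; return bless { %p }, $class }
sub name { return "plain text" }

package Sketch::Handler::HTML;      # hypothetical subclass
our @ISA = ('Sketch::Handler');
sub name { return "HTML" }

package main;

# Turn an optional handler name into a class name and instantiate it.
sub notifier_for {
    my ($handler, %params) = @_;
    my $class = 'Sketch::Handler';
    $class .= "::$handler" if defined $handler && length $handler;
    # A real implementation would load the class from disk (for example
    # with require) before calling new(); here everything is inline.
    die "Unknown handler: $handler\n" unless $class->can('new');
    return $class->new(%params);
}

my $default = notifier_for(undef);
my $html    = notifier_for('HTML');
print ref($default), " / ", ref($html), "\n";
```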

Other changes to SVN::Notify include:

  • Added code to Build.PL to set the shebang line in the test scripts. Reported by Robert Spier.
  • Changed name of attached diff file to be named for the revision and the committer, rather than the committer and the date. Suggested by Robert Spier.
  • Added Author, Date, and Revision information to the top of each message.
  • The ViewCVS URL is no longer output for each file. A single link for the entire revision number is put at the top of the email, instead. ViewCVS Revision URL syntax pointed out by Peter Valdemar Morch.
  • Changed the send() method to execute() to better reflect its generalized use as the method that executes actions in response to Subversion activity.
  • The tests no longer require HTML::Entities to run. The HTML email tests will be skipped if it is not installed.
  • Added accessor methods for the attributes of SVN::Notify.

Enjoy!

On Making a Better Open-Source CMS

Jeffrey Veen posted some of his thoughts on the (dreary) state of open-source CMSs. So I just thought I’d comment on his thoughts. Keep your salt grains handy.

As a CMS developer, my point of view is quite naturally biased. However, I agree with some of Jeffrey’s points, and disagree with others. Well, not disagree so much as wish to qualify. Many of your points betray a certain perspective that does not (and, naturally, cannot) apply to anyone and everyone who is evaluating content management systems. So let me just try to address each of your points and how they relate to Bricolage.

Make it easy to install. Well, yes, of course, and I’ll be the first to admit that Bricolage is difficult to install. But your requirement that you be able to install it from the browser just isn’t feasible with a CMS that aims to scale to the needs of a large organization. The security implications alone give me the heebie-jeebies. It’s fine if you want to just manage your own personal site, but not if you’re aiming to serve the complex needs of the corporate marketplace.

That said, it might be reasonable to create a simple installer that’s useful for doing a local evaluation of a major CMS, one that doesn’t rely on an RDBMS and an Apache Web server installation. (RT (not a CMS) has been working on a simple executable that uses an embedded database and Web server for those who want to evaluate it.) For Bricolage, we at one time had a KNOPPIX CD that one could use to try it out. But for the rest, the best solution is probably an RPM, BSD Package, Debian package, or the like–something that can integrate the application into the base operating system.

Make it easy to get started. Bricolage is pretty bad about this, mostly because the default templates that come with it suck. That will change eventually, but the bigger issue is that when you have a complex, flexible application, it’s tricky to present a simple getting started configuration without locking the user into just using that configuration (witness all the identical Microsoft Home Page sites on the ‘Net to see what I mean). But that’s no excuse for a system like Bricolage–those who need the more advanced features could take advantage of them when they’re ready. It’s just a matter of finding the tuits (or the volunteer) to make it happen.

Write task-based documentation first. In my experience, most complex open-source applications that have task-based documentation got it when someone wrote a book about them. Yes, most of these systems have grown organically, and documentation gets written as volunteers make the time. But the best documentation I’ve found for open-source software has tended to be in published books. Though I think that trend is gradually changing.

Separate the administration of the CMS from the editing and managing of content. In Bricolage, you do not have to switch accounts to have access to the administrative tools. And although the administrative tools are part of the same UI, they have an entirely different section and set of navigation menus. Users who don’t have permission to use those tools don’t notice them.

Users of a public web site should never–never–be presented with a way to log into the CMS. Amen, brotha.

Stop it with the jargon already. Finding good terminology is hard, hard work. Bricolage is broken in this respect in a few ways, but I’m thinking of replacing “jobs” with “fembots” in 2.0. What do you think?

But seriously, we try to match the terms to what is commonly used, such as “document”, “site”, “category”, “keyword”, “template”, etc. Other terms, such as “element” (the parts of a document are its elements) are well-integrated into the system, so that users pick up on them very quickly.

Why do you insist Web sites have columns? I’m in complete agreement with you here. In Bricolage, you can write templates to output any kind of content you want, in any format you want. If you want columns, fine, generate them from Bricolage. If you want a standards-compliant layout, generate it from Bricolage. You can have 1998-era tables with Flash and you can have RSS feeds. Do what you want, make it flexible or make it complex.

But note that the flexibility comes at the price of complexity. And try as we might to make Bricolage “The Macintosh of Content Management Systems,” as long as the definition of the term “Content Management System” is all over the map, commonalities of metaphors, interfaces, and, well, philosophies between CMSs will continue to be all over the map.

But that’s just my opinion.

SVN::Notify 2.0 Hitting CPAN

My latest Perl module, SVN::Notify 2.00, has hit CPAN. This is a port of my widely-used activitymail CVS notification script to Subversion. But it underwent quite a few changes over the port, including:

Modularization
The old monolithic activitymail script is gone. It has been replaced with a Perl class, SVN::Notify, that does most of the work. The new script, svnnotify, is essentially just a wrapper around the class; all it does is process command-line arguments and then pass the results to SVN::Notify.
Simplification
Subversion’s system for hooking in to commit transactions is far better thought-out than that of CVS. It’s now easy to capture the results of an entire commit in a single transaction, without having to write out temp files to keep track of where we are and to concatenate diffs. As a result, SVN::Notify has a much simpler architecture and implementation that requires fewer third-party modules to do its work. In addition, the move to a class should make it much easier to build on SVN::Notify in the future than it was with activitymail. Autrijus Tang already suggested a number of ideas on IRC, including SVN::Notify::Jabber or SVN::Notify::Export. Have at it, everyone!
Reduced Resource Usage
I had heard some complaints that, on very large commits, activitymail could end up taking up a huge amount of memory. As best I could figure, this was because it was loading everything into memory, including the diff for the commit! SVN::Notify avoids this problem by using a file handle to read in a diff and print it to sendmail one line at a time. This should keep resource usage by SVN::Notify way below what activitymail used.
Context-Specific Notifications
SVN::Notify has added support for mapping email addresses to regular expressions. Whenever a regular expression matches the name of one or more of the directories affected in a single commit, the corresponding email address will be added to the list of recipients of the notification. This is a great way to get notification messages sent to particular email addresses based on what part of the Subversion tree was affected by a commit. I intend to use this to set it up so that a list of translators only get notification about a commit when it changes a directory related to localization in my projects, so that they can ignore commits to other parts of the application.
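The address-to-regular-expression mapping could be sketched roughly as follows. The addresses, the patterns, and the recipients_for() function are invented for the example and say nothing about how SVN::Notify stores or names these things.

```perl
use strict;
use warnings;

# Hypothetical mapping of recipient addresses to path patterns.
my %to_regex = (
    'translators@example.com' => qr{/lib/I18N/},
    'docs@example.com'        => qr{^trunk/docs/},
);

# Return the sorted list of addresses whose pattern matches at least
# one of the directories touched by a commit.
sub recipients_for {
    my ($map, @dirs) = @_;
    my @to;
    for my $addr (sort keys %$map) {
        my $re = $map->{$addr};
        push @to, $addr if grep { $_ =~ $re } @dirs;
    }
    return @to;
}

my @to = recipients_for(\%to_regex, 'trunk/lib/I18N/de/', 'trunk/bin/');
print join(", ", @to), "\n";
```

Here only the translators’ address is selected, because only the localization directory matches one of the patterns.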

These are the major changes, but SVN::Notify also features a number of smaller improvements over its activitymail ancestor, including character set support, user domain support for the “From” header, explicit specification of a “From” header, properly escaped content when sending HTML-formatted notifications, and a maximum subject length configuration.

So what did it lose? Just a few things:

  • syncmail-like behavior. Did anyone ever use this? If so, feel free to implement SVN::Notify::Syncmail.
  • Arguments to diff. SVN::Notify just uses svnlook diff to generate a diff. Support for other diffs could be added in a future version, if people really need it.
  • New directories and imports can no longer be ignored, because in Subversion they’re really no different from any other commit.
  • Limit on the maximum size of the email. This is because SVN::Notify no longer loads the entire email into memory to measure it.
  • Excluding certain files from the diff. Subversion handles this itself by paying attention to the media type of each file.
  • Windows support. Actually, I’m not sure if activitymail was ever used on Windows, but the new method of using pipes to communicate with other processes isn’t supported by Windows, as near as I can tell. There are comments in the code for those who wish to do the port; it would probably be easy using Win32::Process.

Not too much, eh? Let me know what you think, and send feedback!
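For the curious, the line-at-a-time handling described under Reduced Resource Usage amounts to something like the sketch below. The stream_diff() function is hypothetical, and in-memory file handles stand in for the pipes from svnlook diff and to sendmail so that the example runs on its own.

```perl
use strict;
use warnings;

# Copy a diff from one handle to another a line at a time, so that only
# one line is ever held in memory. Returns the number of lines copied.
sub stream_diff {
    my ($in, $out) = @_;
    my $lines = 0;
    while (my $line = <$in>) {
        print {$out} $line;
        $lines++;
    }
    return $lines;
}

# Stand-ins for the real pipes.
my $diff = "--- a/foo\n+++ b/foo\n\@\@ -1 +1 \@\@\n-old\n+new\n";
my $mail = '';
open my $in,  '<', \$diff or die "Cannot read diff: $!";
open my $out, '>', \$mail or die "Cannot open output: $!";

my $count = stream_diff($in, $out);
close $out;
print "Copied $count lines\n";
```

The trade-off is the one mentioned in the list above: because the whole message never sits in memory, there is nothing to measure for a maximum-size limit.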

Lessons Learned with Perl and UTF-8

I learned quite a lot last week as I was making Bricolage much more Unicode-aware. Bricolage has always managed Unicode content and stored it in a PostgreSQL Unicode-encoded database. And by “Unicode” I of course mean “UTF-8”. By far the biggest nightmare was figuring out the bug with Apache::Util::escape_html(), but ultimately it came down to an interesting lesson.

Why was I making Bricolage Unicode-aware? Well, it all started with a bug report from Kang-min Liu (a.k.a. “Gugod”). I had naïvely thought that if strings were Unicode that Perl would know it and do the right thing. It turns out I was wrong. Perl assumes that everything is binary unless you tell it otherwise. This means that Perl operators such as length and substr will count bytes instead of characters. And in the case of Unicode, where characters can be multiple bytes, this can cause serious problems. Not only were strings improperly concatenated mid-character for Gugod, but PostgreSQL could refuse to accept such strings, since a chopped-up multibyte character isn’t valid Unicode!
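Here is a small sketch of the bytes-versus-characters difference, using the core Encode module. The bytes are written as escapes so that the example does not depend on how this file itself is encoded; the variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode ();

# The UTF-8 bytes for the two characters 專輯.
my $bytes = "\xe5\xb0\x88\xe8\xbc\xaf";

# Undecoded, Perl sees six one-byte characters.
my $byte_len = length($bytes);

# Decoded, Perl sees two characters.
my $chars    = Encode::decode('UTF-8', $bytes);
my $char_len = length($chars);

# substr on the raw bytes cuts the first character in the middle...
my $broken = substr($bytes, 0, 2);
# ...while substr on the decoded string keeps it whole.
my $whole  = substr($chars, 0, 1);

print "bytes: $byte_len, characters: $char_len\n";
```

The value in $broken is exactly the kind of chopped-up multibyte character that a UTF-8 database has every right to reject.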

So I had to make some decisions: Either stop using Perl operators that count bytes, or let Perl know that all the strings that Bricolage deals with are Unicode strings. The former wasn’t really an option, of course, since users can specify that certain content fields be a certain length of characters. So with a lot of testing help from Gugod and his Bricolage install full of multibyte characters, I set about doing so. The result is in the recently released Bricolage 1.8.2 and I’m blogging what I learned for both your reference and mine.

Perl considers its internal representation of strings to be UTF-8 strings, and it knows what variables contain valid UTF-8 strings because they have a special flag set on them, called, strangely enough, utf8. This flag isn’t set by default, but can be set in a number of ways. The ways I’ve found so far are:

  • Using Encode::decode() to decode a string from binary to Perl’s internal representation. The use of the word “decode” here had confused me for a while, because I thought it was a special encoding. But the truth is that it’s not. Strings can have any number of encodings, such as “ISO-8859-1”, “GB2312”, “EUC-KR”, “UTF-8”, and the like. But when you “decode” a string, you’re telling Perl that it’s not any of those encodings, but Perl’s own representation. I was confused because Perl’s internal representation is UTF-8, which is an encoding. But really it’s not UTF-8, it’s “utf8”, which isn’t an encoding, but Perl’s own thing.

  • Cheat: Use Encode::_set_utf8_on(). This private function is nevertheless documented by the Encode module, and therefore usable. What it does is simply turn on the utf8 flag on a variable. You need to be confident that the variable contains only valid UTF-8 characters, but if it does, then you should be pretty safe.

  • Using the three-argument version of open, such as

    open my $fh, "<:utf8", "/foo/bar" or die "Cannot open file: $!\n";

    Now when you read lines from this file, they will automatically be decoded to utf8.

  • Using binmode to set the mode on a file handle:

    binmode $fh, ":utf8";

    As with the three-argument version of open this forces Perl to decode the strings read from the file handle.

  • use utf8;. This Perl pragma tells Perl that the source code within its scope is written in UTF-8, so the string literals there are decoded to utf8.
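A couple of the approaches from this list can be checked with utf8::is_utf8(), which reports whether the flag is set. This is only a sketch: an in-memory file handle stands in for a real file, and it uses the stricter :encoding(UTF-8) layer where the examples above use :utf8.

```perl
use strict;
use warnings;
use Encode ();

my $bytes = "\xe5\xb0\x88\xe8\xbc\xaf\n";   # UTF-8 bytes for 專輯 plus a newline

# 1. Decode explicitly.
my $decoded = Encode::decode('UTF-8', $bytes);

# 2. Let an I/O layer decode on read.
open my $fh, '<:encoding(UTF-8)', \$bytes or die "Cannot open: $!";
my $line = <$fh>;
close $fh;

# Report the state of the internal flag for each string.
printf "raw: %d, decoded: %d, read: %d\n",
    (utf8::is_utf8($bytes)   ? 1 : 0),
    (utf8::is_utf8($decoded) ? 1 : 0),
    (utf8::is_utf8($line)    ? 1 : 0);
```

Both routes end up with the same three-character string (two characters and the newline), while the original bytes stay unflagged.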

So I started applying these approaches in various places. The first thing I did was to set the utf8 flag on data coming from the browser with Encode::_set_utf8_on(). Shitty browsers can of course send shitty data, but I’m deciding, for the moment at least, to trust browsers to send only UTF-8 when I tell them that’s what I want. This solved Gugod’s immediate problem, and I happily closed the bug. But then he started to run into places where strings appeared properly in some places but not in others. We spent an entire day (night for Gugod–I really appreciated the help!) tracking down the problem, and there turned out to be two of them. One was the bug with Apache::Util::escape_html() that I’ve described elsewhere, but the other proved more interesting.

It seems that if you concatenate a UTF-8 string with the utf8 flag turned on with a UTF-8 string without utf8 turned on, the text in the unflagged variable turns to crap! I have no idea why this is, but Gugod noticed that strings pulled into the UI from the Bricolage zh_tw localization library simply didn’t display properly. I had him add use utf8; to the zh_tw module, and the problem went away!

So the lesson learned here is: If you’re going to make Perl strings Unicode-aware, then all of your Perl strings need to be Unicode-aware. It’s an all or nothing kind of thing.
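The mixing problem can be reproduced in a few lines. As far as I can tell from Perl's documented behavior, what happens is that the unflagged bytes get upgraded as if each byte were a Latin-1 character; treat that explanation, and the variable names below, as my working assumption rather than the final word.

```perl
use strict;
use warnings;
use Encode ();

my $bytes   = "\xe5\xb0\x88";                   # UTF-8 bytes for 專
my $decoded = Encode::decode('UTF-8', $bytes);  # one character, utf8 flag on

# Mixing the two: each undecoded byte is treated as its own character,
# so the result has 1 + 3 = 4 characters.
my $mixed = $decoded . $bytes;

# Encoding the mixed string back to UTF-8 does not round-trip: each of
# the three stray bytes becomes a two-byte sequence.
my $out = Encode::encode('UTF-8', $mixed);

# Decoding both sides first gives the expected two characters.
my $clean = $decoded . Encode::decode('UTF-8', $bytes);

printf "mixed: %d chars, %d bytes out; clean: %d chars\n",
    length($mixed), length($out), length($clean);
```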

So while setting the utf8 flag on browser submits and adding use utf8; to the localization modules got us part of the way toward a solution, it turned out to be trickier than I expected to get the utf8 flag set on everything. The places I needed to get it working were in the UI Mason components, in templates, and in strings pulled from the database.

It took a bit of research, but I think I successfully figured out how to make the UI Mason components UTF-8 aware. I just added preamble => "use utf8\n;" to the creation of the Mason interpreter. This gets passed on to its compiler, and now that string is added to the beginning of every template. This made things behave better in the UI. I applied the same approach to the interpreter created for Mason templates with equal success.

I’m less confident that I pulled it off for the HTML::Template and Template Toolkit templating architectures. In a discussion on the templates mailing list, Andy Wardley suggested that it wasn’t currently possible. But I wasn’t so sure. It seemed to me that, since Bricolage reads in the templates and asks TT to execute them within a certain scope, that I could just set the mode to utf8 on the file handle and then execute the template within the scope of a use utf8; statement. So that’s what I did. Feedback on whether it works or not would be warmly welcomed.

I tried a similar approach with the HTML::Template burner. Again, the burner reads the templates from files and passes them to HTML::Template for execution (as near as I could tell, anyway; I’m not an HTML::Template template user). Hopefully it’ll just work.

So that just left the database. Since the database is Unicode-only, all I needed to do was to turn on the utf8 flag for all content pulled from the database. Amazingly, this hasn’t come up as an issue for people very much, because DBI doesn’t do anything about Unicode. I picked up an older discussion started by Matt Sergeant on the dbi-dev mail list, but it looks like it might be a while before DBI has fast, integrated support for turning utf8 on and off for various database handles and columns. I look forward to it, though, because it’s likely to be very efficient. I greatly look forward to seeing the results of Tim’s work in the next release of DBI. I opened another bug report to remind myself to take advantage of the new feature when it’s ready.

So in the meantime, I needed to find another solution. Fortunately, my fellow PostgreSQL users had run into it before, and added what I needed to DBD::Pg back in version 1.22. The pg_enable_utf8 database handle parameter forces the utf8 flag to be turned on for all string data returned from the database. I added this parameter to Bricolage, and now all data pulled from the database is utf8. And so are the UI components, templates, localization libraries, and data submitted from browsers. I think that nailed everything, but I know that Unicode issues are a slippery slope. I can’t wait until I have to deal with them again!

Not.

Bricolage 1.8.2 Released

The Bricolage development team is pleased to announce the release of Bricolage 1.8.2. This maintenance release addresses quite a large number of issues in Bricolage 1.8.1. The most important changes were to enhance Unicode support in Bricolage. Bricolage now internally handles all text content as UTF-8 strings, thus enabling templates to better control the manipulation of multibyte characters. Other changes include better performance for searches using the ANY() operators and more intelligent transaction handling for distribution jobs. Here are the other highlights of this release:

Improvements

  • Bricolage now runs under a DSO mod_perl as long as it uses a Perl compiled with -Uusemymalloc or -Ubincompat5005. See The mod_perl FAQ for details.
  • Alerts triggered to be sent to users who don’t have the appropriate contact information will now be logged for those users so that they can see them and acknowledge them under “My Alerts”.
  • Added bric_media_dump script to contrib/.
  • The category association interface used in the story profile when the ENABLE_CATEGORY_BROWSER bricolage.conf directive is enabled now uses radio buttons instead of a link to select the primary category. Suggested by Scott Lanning.
  • Existing jobs are now executed within their own transactions, as opposed to no transaction specification. This means that each job must succeed or fail independent of any other jobs. New jobs are executed before being inserted into the database so as to keep them atomic within their surrounding transaction (generally a UI request). All this means that transactionality is much more intelligent for jobs and will hopefully eliminate job table deadlocks.
  • All templates now execute with UTF-8 character strings enabled. This means that any templates that convert content to other character sets might need to change the way they do so. For example, templates that had used <%filter> blocks to convert content to another encoding using something like Encode::from_to($_, 'utf-8', $encoding) must now use something like $_ = Encode::encode($encoding, $_), instead. Bric::Util::CharTrans should continue to do the right thing.
  • Added encoding attribute to Bric::Util::Burner so that, if templates are outputting something other than Perl utf8 decoded data, they can specify what they’re outputting, and the file opened for output from the templates will be set to the proper mode. Applies to Perl 5.8.0 and later only.
  • Added SFTP_HOME bricolage.conf directive to specify the home directory and location of SSH keys when SSH is enabled.
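The template change described above, from Encode::from_to() to Encode::encode(), can be illustrated outside of Bricolage with a short sketch. The target encoding and the sample text are arbitrary choices for the example.

```perl
use strict;
use warnings;
use Encode ();

my $encoding = 'ISO-8859-1';   # example target encoding

# Old style: the content is UTF-8 *bytes*, converted in place.
my $old = "caf\xc3\xa9";                        # "café" as UTF-8 bytes
Encode::from_to($old, 'utf-8', $encoding);

# New style: the content is a decoded character string, so encode it.
my $chars = Encode::decode('UTF-8', "caf\xc3\xa9");
my $new   = Encode::encode($encoding, $chars);

# Both end up as the same four Latin-1 bytes.
print "same result\n" if $old eq $new;
```

Calling from_to() on an already decoded string would treat characters as if they were bytes, which is why the older idiom no longer fits once templates receive utf8 strings.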

Bug Fixes

  • make clone once again properly copies the lib/Makefile.PL and bin/Makefile.PL files from the source directory.
  • Added missing language-specifying HTML attributes so as to properly localize story titles and the like.
  • The list of output channels to add to an element in the element profile now contains the name of the site that each is associated with, since different sites can have output channels with the same names.
  • The “Advanced Search” interface once again works for searching for related story and media documents.
  • Bricolage no longer attempts to email alerts to an empty list of recipients. This will make your SMTP server happier.
  • The version numbering issues of Bricolage modules have all been worked out after the confusion in 1.8.1. This incidentally allows the HTML::Template and Template Toolkit burners to be available again.
  • Misspelling the name of a key name tag or including a non-repeatable field more than once in Super Bulk Edit no longer causes all of the changes in that screen to be lost.
  • When a user overrides the global “Date/Time Format” and “Time Zone” preferences, the effects of the overrides are now properly reflected in the UI.
  • Publishing a story or media document along with its related story or media documents from a publish desk again correctly publishes the original asset as well as the relateds.
  • Deleted output channels no longer show up in the select list for story type and media type elements.
  • Deleting a workflow from the workflow manager now properly updates the workflow cache so that the deleted workflow is removed from the left navigation without a restart.
  • When Bricolage notices that a document or template is not in workflow or on a desk when it should be, it is now more intelligent in trying to select the correct workflow and/or desk to put it on, based on current workflow context and user permissions.
  • Content submitted to Bricolage in the UTF-8 character set now always has the utf8 flag set on the Perl strings that store it. This allows fields that have a maximum length to be truncated to that length in characters instead of bytes.
  • Elements with autopopulated fields (e.g., for image documents) can now be created via the SOAP interface.
  • Fixed a number of the parameters to the list() method of the Story, Media, and Template classes to properly handle an argument using the ANY operator. These include the keyword and category_uri parameters. Passing an ANY argument to these parameters before this release could cause a well-populated database to lock up with an impossible query for hours at a time.
  • Template sandboxes now work for the Template Toolkit burner.

For a complete list of the changes, see the changes. For the complete history of ongoing changes in Bricolage, see Bric::Changes.

Download Bricolage 1.8.2 now from the Bricolage Website Downloads page, from the SourceForge download page, and from the Kineticode download page.

About Bricolage

Bricolage is a full-featured, enterprise-class content management and publishing system. It offers a browser-based interface for ease of use, a full-fledged templating system with complete HTML::Mason, HTML::Template, and Template Toolkit support for flexibility, and many other features. It operates in an Apache/mod_perl environment and uses the PostgreSQL RDBMS for its repository. A comprehensive, actively-developed open source CMS, Bricolage was hailed as “quite possibly the most capable enterprise-class open-source application available” by eWEEK.

Apache::Util::escape_html() Doesn't Like Perl UTF-8 Strings

I got bit by a bug with Apache::Util’s escape_html() function in mod_perl 1. It seems that it doesn’t like Perl’s Unicode encoded strings! This patch demonstrates the issue (be sure that your editor understands utf-8):

--- modperl/t/net/perl/util.pl.~1.18.~  Sun May 25 03:54:08 2003
+++ modperl/t/net/perl/util.pl  Thu Sep  9 19:38:40 2004
@@ -74,6 +74,25 @@
 
 #print $esc_2;
 test ++$i, $esc eq $esc_2;
+
+# Make sure that escape_html() understands multibyte characters.
+my $utf8 = '<專輯>';
+my $esc_utf8 = '&lt;專輯&gt;';
+my $test_esc_utf8 = Apache::Util::escape_html($utf8);
+test ++$i, $test_esc_utf8 eq $esc_utf8;
+#print STDERR "Compare '$test_esc_utf8'\n     to '$esc_utf8'\n";
+
+eval { require Encode };
+unless ($@) {
+    # Make sure escape_html() properly handles strings with Perl's
+    # Unicode encoding.
+    $utf8 = Encode::decode_utf8($utf8);
+    $esc_utf8 = Encode::decode_utf8($esc_utf8);
+    $test_esc_utf8 = Apache::Util::escape_html($utf8);
+    test ++$i, $test_esc_utf8 eq $esc_utf8;
+    #print STDERR "Compare '$test_esc_utf8'\n     to '$esc_utf8'\n";
+}
+
 use Benchmark;
 
 =pod

If I enable the print statements and look at the log, I see this:

Compare '&lt;專輯&gt;'
     to '&lt;專輯&gt;'
Compare '&lt;å°è¼¯&gt;'
     to '&lt;專輯&gt;'

The first escape appears to work correctly, but when I decode the string to Perl’s Unicode representation, you can see how badly escape_html() munges the text!

Curiously, both tests fail, although the first conversion appears to be correct. This could be due to the behavior of eq, though I’m not sure why. But it’s the second test that’s the more interesting, since it really screws things up.
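While the bug stands, one possible workaround is to do the escaping in Perl. The sketch below is a hypothetical stand-in, escape_html_sketch(), and not the implementation of Apache::Util::escape_html(); it only touches a handful of ASCII characters, so it should treat raw bytes and decoded strings alike.

```perl
use strict;
use warnings;
use Encode ();

# Minimal HTML escaping with a regular expression.
my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');

sub escape_html_sketch {
    my $text = shift;
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

my $bytes   = "<\xe5\xb0\x88\xe8\xbc\xaf>";   # '<專輯>' as UTF-8 bytes
my $decoded = Encode::decode('UTF-8', $bytes);

my $esc_bytes   = escape_html_sketch($bytes);
my $esc_decoded = escape_html_sketch($decoded);

# The decoded result, encoded back to UTF-8, matches the byte result.
print "consistent\n" if Encode::encode('UTF-8', $esc_decoded) eq $esc_bytes;
```

It will be slower than the C function in mod_perl, so it is a stopgap at best.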

Introduction to Bricolage Published by Perl.com

Perl.com has published the first in a series of articles I’ve promised to write, “Content Management with Bricolage.” The article is targeted at organizational decision makers who need to evaluate Bricolage as part of their selection of a content management solution. So I’ve spent some time discussing what content management is and how Bricolage fits in. I mention a lot of other CMSes for comparative purposes, to try to really give people a feel for how Bricolage can meet their needs.

I also go over some of Bricolage’s most important features, such as multisite management, document modeling, Perl templating, meaningful URLs, workflow, and any number of outputs via “output channels.” The article concludes with a list of sites known to be managed by Bricolage, as well as the promise for more.

The next article in the series will cover installation and configuration. Subsequent articles will cover document analysis and modeling, Mason templating, and fun with the SOAP server. Enjoy!

Another Short Visit to Alamosa

Photos: Stellar clouds; Shot over the prop engine; Big, angry clouds; Crop circles; Roads or dry creeks?; Shot under the wing; Clouds over the wing; The Clarion's lobby

I made another brief trip to Alamosa last week. This is the first chance I’ve had to write anything about it! I was doing two days of training for Adams State College, where they’re implementing a campus-wide content management solution built on Bricolage. But the real reason I’m writing is because of the neat photos I took. Most of them I took from the plane on the way in to Alamosa. But the photo of the lobby of the Clarion hotel I stayed in is, um, interesting as well. Enjoy!

Nicholas Clark

Nick Clark goes Wild!

I just had to share this lovely picture of Nick Clark, taken on the Friday night of OSCon 2004 at Matt Sergeant’s party. I honestly have no idea what Nick was doing, but it was worth it for the photo, don’t you think?

Always use the C Locale with PostgreSQL

I ran into the weirdest bug with Bricolage today. We use the LIKE operator to do string comparisons throughout Bricolage. In one usage, the code checks to see if there’s a record in the “keyword” table before creating it. This is because keyword names are unique. So it looks for a keyword record like this:

SELECT name, screen_name, sort_name, active
  FROM   keyword
 WHERE  LOWER(name) LIKE ?

If it finds a keyword, it creates a relationship between it and a story document. If it doesn’t find it, it creates a new keyword record and then associates the new keyword with a story document.
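
For illustration, that find-or-create flow can be sketched roughly as follows. This is not Bricolage's code: the simplified table, the find_or_create_keyword() helper, and the in-memory SQLite database (via DBD::SQLite, standing in for PostgreSQL) are all assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;
use DBI;

# Assumption: an in-memory SQLite database stands in for PostgreSQL here.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 });

# Simplified, hypothetical version of the keyword table.
$dbh->do('CREATE TABLE keyword (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)');

# Hypothetical helper: return the id of an existing keyword, or create it.
sub find_or_create_keyword {
    my ($dbh, $name) = @_;

    # Step 1: look for an existing record, as in the query above.
    my ($id) = $dbh->selectrow_array(
        'SELECT id FROM keyword WHERE LOWER(name) LIKE ?', undef, lc $name,
    );
    return $id if defined $id;

    # Step 2: nothing found, so insert. The UNIQUE constraint is the
    # second, independent check on whether the name already exists.
    $dbh->do('INSERT INTO keyword (name) VALUES (?)', undef, $name);
    return $dbh->last_insert_id(undef, undef, 'keyword', 'id');
}

my $first  = find_or_create_keyword($dbh, 'Perl');
my $second = find_or_create_keyword($dbh, 'perl');
print $first == $second ? "reused existing keyword\n" : "created a duplicate\n";
```

The thing to notice is that the lookup and the unique constraint are two separate equality checks. As long as they agree, the second call reuses the record; if they ever disagree, the INSERT dies with a constraint error.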

However, one of our customers was getting SQL errors when attempting to add keywords to a story, and it took me a while to figure out what the problem was. This is because I couldn’t replicate the problem until I started trying to create multibyte keywords. Now, Bricolage uses a UTF-8 PostgreSQL database, but something very odd was going on. When I attempted to add the keyword “북한의”, it didn’t find an existing keyword, but then threw an error when the unique index thought it existed already! Running tests in psql, I found that = would find the existing record, but LIKE wouldn’t!

After I posted a query to the pgsql-general list, someone noticed that the record returned by = held a different value than the one I had queried for. I had searched for “북한의”, but the database found “국방비”. It seems that = compares bytes, while LIKE compares characters. The error I was getting meant that the unique index was also using bytes. And because of the locale used when initdb was run, PostgreSQL thought that the two values actually were the same!

The solution to this problem, it turns out, was to dump the database, shut down PostgreSQL, move the old data directory, and create a new one with initdb --locale=C. I then restored the database, and suddenly = and LIKE (and the unique index) were doing the same thing. Hallelujah!

Naturally, I’m not the first to notice this issue. It’s particularly an issue with RedHat Linux installations, since RedHat has lately decided to set a system-wide locale. In my case, it was “en_US.UTF-8.” This apparently can break collations in other languages, and this affects indices, of course. So I was led to wonder if initdb shouldn’t default to a locale of C instead of the system default. What do you think?

You can read the whole thread here.

Portland Kerry Rally

Julie and I just got back from the Kerry rally at the Tom McCall Waterfront Park in Portland, OR. According to the Kerry Blog, there were ca. 60,000 people at the rally. Julie and I waited till the last minute to go, and for a while there thought we wouldn’t get in. But we did, and heard the second half of Kerry’s speech. As we made our way through the city afterward, we overheard some other folks saying they’d arrived at 8:30 and never got in. We felt very fortunate. I think it was just dumb luck to have found the entrance we did.

We were pretty close to the stage, too. We were off to the right out of the frame of this picture, but still only 30m or so from the stage. We could see Kerry quite clearly from there. It was interesting to see him in person; he was quite lively in addressing the crowd, and clearly engaged in what he was doing. He seemed to be having a good time, too. But I couldn’t help wondering if he and the other speakers didn’t occasionally feel silly up there, making the same speech with the same gestures over and over. Especially at the end, when Kerry shakes his fist in the air like a champion boxer and points out various groups of people for him and Teresa to wave to. But then again, maybe I’m just too jaded myself.

Still, it was interesting to be there in person and to see him working in person. It gave me much more of the impression that we’re dealing with a real person here, rather than just a talking head like you might see on TV. Here’s a guy who might soon hold what is arguably the most powerful political office in the world, and really, he’s just a regular guy trying to do some good, out there talking to anyone who will listen about how he wants to make things different than they have been. He’s a guy you could talk to, and talk to about the issues.

I got this impression from a rally with 60,000 people? Yeah, maybe I’m just nuts.

Highlight of the speech (what we heard of it) for Julie and me: Kerry’s plan to invest much more in alternative energy, to make America energy independent by 2020. That’s a plan I can very much get behind! I also appreciated his saying that he would never send US troops into action unless there was no alternative. The Iraq war is such a clusterfuck in so many ways; I really hope that things will change when Kerry is sworn into office.

But even if they don’t change that much, or not for a while, I would love to be able to have complaints about the Presidential administration more like I had about the Clinton White House. I’d rather be worried that my President was too close to the middle and conciliatory than that he was so far to the right as to be, well, radical.

I will do my part to see to it that Kerry gets the chance to disappoint me as a highly preferable alternative to the current state of complete mortification.

eWeek Reviews Bricolage 1.8.1

I can’t believe I haven’t posted this story here yet! I guess I’ve been busy. So here it is:

eWeek has reviewed Bricolage, the Perl-powered, PostgreSQL-backed open-source content management system. The article was published last week. An excerpt:

Bricolage is quite possibly the most capable enterprise-class open-source application available. The Web content management application features excellent administration capabilities, and it is highly extensible and capable of managing even the biggest and most complex Web sites. As an open-source product, Bricolage is free, and companies can now purchase support and development services from Kineticode.

The article is part of the “Content Management Face-Off” in the current issue of eWeek:

Included in this evaluation are the open-source Bricolage 1.8.1, Interwoven Inc.’s TeamSite 6.1, CrownPeak Technology Inc.’s Advantage CMS, Serena Software Inc.’s Collage 4.5, PaperThin Inc.’s CommonSpot Content Server 4.0 and Ektron Inc.’s CMS300 4.5. (The reviews are ordered, roughly, from the high end to the low end of the content management market.)

I’m pretty stoked about this review, as you might imagine. eWeek is now officially my favorite trade magazine!

OSCON 2004 Notes

I’m finally getting round to typing up my thoughts on my OSCon 2004 experience. I would’ve done it sooner, but I spent most of last week on the road and fixing bugs in Bricolage.

OSCon 2004 was, in a word, great! I spent every day of the week there, getting there around 8:30 each morning, and finally leaving the hotel or a party each night somewhere between midnight and 3 am. I was even there late on Sunday night, talking to folks who just came in, and late on Friday night, at a party in Matt Sergeant’s room. It was great to see so many friends there, including Casey, Schwern, Jesse, Nat, Bruce, Josh, David, Elein, Dan, Nicholas, James, Arthur, Robert, Ask and Vani, my brother, Alex, and probably lots of other people I’m forgetting about.

There were more conversations between members of different communities than I can recall seeing at past OSCons, and people were generally excited and engaged. I’m told that they had the highest number of attendees since 2001. The energy at the conference was very positive, and people seemed very interested in things that other people were doing. Some of the highlights for me:

PHP on Parrot

Speakers Sterling Hughes and Thies C. Arntzen talked about how amped they are at the idea of porting PHP to run on Parrot, the virtual machine being developed for Perl 6 and other dynamic languages. The session ended up as a conversation between Sterling and Thies, on the one hand, and Larry Wall and Dan Sugalski, who were sitting in the front row, on the other. Larry assured them that any programming language community’s members would be “first-class citizens” in the Parrot world, and Dan told them that all they need do is ask for things they need and the Parrot developers would help as much as they could. Sterling wrapped up by saying something like, “I guess the real reason we’re so excited about Parrot is because we really love Perl!” That got a good laugh.

PostgreSQL

There was a bigger PostgreSQL presence than ever at OSCon this year, with lots of great discussion. There seemed to be quite a few Perl folks going to the PostgreSQL sessions, too. Dan Sugalski was impressed enough with what’s coming up in PostgreSQL 8.0 (formerly 7.5) that he told me he was moving up his plans for implementing pl/Parrot. A few of the core PostgreSQL folks said that they felt like people were finally being more open and excited about their use of PostgreSQL, rather than keeping quiet about this “strategic advantage.” And the features in 8.0 sound extremely promising, including Win32 support, save points/nested transactions, point-in-time recovery, tablespaces, and pl/Perl. It’s going to be a kick-ass release, no doubt about it. Watch for the beta this week.

SQLite

SQLite is a fast, ACID-compliant, relational database engine in a public-domain C library. It’s great for embedding into an application because it’s not a client-server application, but a simple library that stores databases in files. It’s twice as fast as MySQL or PostgreSQL because it doesn’t have the client/server overhead, and it’s extremely portable. Version 3.0 adds UTF-8 and UTF-16 support, which makes it a real possibility for use in Bricolage 2.0 (for small installations and demo servers, for example).

I was pretty amazed at what this little database can do, and not only is it open-source, but because it is in the public domain, there are no constraints on its use. It’s just one sexy library. Everybody run out and use it now! Perl users get it for free by installing DBD::SQLite from CPAN.
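
To give a feel for how little setup is involved, here is a minimal sketch, assuming DBD::SQLite is installed from CPAN. The table name and data are made up for illustration, and the sqlite_unicode attribute asks the driver to hand back decoded character strings.

```perl
use strict;
use warnings;
use utf8;    # this source file contains a literal UTF-8 string
use DBI;

# No server to start: the whole database lives in memory (or in a file).
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0, sqlite_unicode => 1 });

# Hypothetical table, just for the example.
$dbh->do('CREATE TABLE keyword (name TEXT)');
$dbh->do('INSERT INTO keyword (name) VALUES (?)', undef, '북한의');

# Fetch the value back and confirm it survived as three characters.
my ($name) = $dbh->selectrow_array('SELECT name FROM keyword');

binmode STDOUT, ':encoding(UTF-8)';
print "$name has ", length($name), " characters\n";
```

Swapping :memory: for a file name is all it takes to keep the data around between runs.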

Pie

A year later, Dan lost the bet with Guido, and gave him a case of beer, ten bucks, and the right to put pie in his face. Dan even made two key-lime pies for the occasion! At the Python lightning talks, Guido graciously declined to pie Dan. The Pythoners seemed to think that this was very nice of Guido, but the Perlers in the audience (including yours truly) were shouting, “Get him! Give him the pie! Do it, Guido!”. As Allison commented later, it’s nice how “the Perl community takes care of its own.”

Dan later auctioned off the right for someone else to pie him in the face. Schwern ponied (heh) up the cash, a ca. $500 donation to the Perl Foundation, for the right, but gave it to Ponie developer Nicholas to enjoy. The event came off just ahead of the final keynote. This time Guido decided to go ahead, and he doused Dan in cream pie. Then Nicholas came out and gave Dan the dessert, so to speak. Great fun for all.

The upshot, according to Dan, is that Guido wrote a really evil test suite with seven tests exercising 75% of Python’s ops. Of the seven tests, Dan got 4 working on Parrot, and 3 of those were 2-3 times faster than on Python. Things look very good indeed for Parrot going forward. Look for the tests to be fully working on Parrot (and fast!) in the next few months.

There were parties and conversations every night, lots of great talk, good food, good friends, and, well, I just had a great time. I can’t wait until next year’s OSCon!
