Just a Theory

By David E. Wheeler

Posts about RSS

Atom Sources

I’m working on a project where I aggregate entries from a slew of feeds into a single feed. The output feed will be a valid Atom feed, and of course I want to make sure that I maintain all the appropriate metadata for each entry I collect. The <source> element seems to be exactly what I need:

If an entry is copied from one feed into another feed, then the source feed’s metadata (all child elements of feed other than the entry elements) should be preserved if the source feed contains any of the child elements author, contributor, rights, or category and those child elements are not present in the source entry.

<source>
  <id>http://example.org/</id>
  <title>Fourty-Two</title>
  <updated>2003-12-13T18:30:02Z</updated>
  <rights>© 2005 Example, Inc.</rights>
</source>

That’s perfect: It allows me to keep the title, link, rights, and icon of the originating blog associated with each entry.
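For the curious, here is roughly how the copying step might look. This is only a minimal sketch in Python using the standard library's `xml.etree.ElementTree`, not the code from my aggregator. The function names `build_source` and `entries_with_source`, the sample feed, and the entry ID `http://example.org/entry/1` are all made up for illustration. For simplicity it copies every non-entry child of the feed into a `<source>` element, and it leaves alone any entry that already carries one.

```python
import copy
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ENTRY_TAG = f"{{{ATOM_NS}}}entry"
SOURCE_TAG = f"{{{ATOM_NS}}}source"

# Serialize Atom elements without an "ns0:" prefix.
ET.register_namespace("", ATOM_NS)


def build_source(feed):
    """Return an atom:source element holding copies of the feed's metadata.

    Every child of the feed except atom:entry is copied (hypothetical helper).
    """
    source = ET.Element(SOURCE_TAG)
    for child in feed:
        if child.tag != ENTRY_TAG:
            source.append(copy.deepcopy(child))
    return source


def entries_with_source(feed_xml):
    """Parse a feed and return copies of its entries, each with a source."""
    feed = ET.fromstring(feed_xml)
    entries = []
    for entry in feed.findall(ENTRY_TAG):
        entry = copy.deepcopy(entry)
        # An entry that was already aggregated keeps its original source.
        if entry.find(SOURCE_TAG) is None:
            entry.append(build_source(feed))
        entries.append(entry)
    return entries


# Sample input: the feed metadata from the example above, plus one
# invented entry.
EXAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.org/</id>
  <title>Fourty-Two</title>
  <updated>2003-12-13T18:30:02Z</updated>
  <rights>© 2005 Example, Inc.</rights>
  <entry>
    <id>http://example.org/entry/1</id>
    <title>First post</title>
    <updated>2003-12-13T18:30:02Z</updated>
  </entry>
</feed>"""

if __name__ == "__main__":
    for entry in entries_with_source(EXAMPLE_FEED):
        print(ET.tostring(entry, encoding="unicode"))
```

Note that every entry gets its own full copy of the feed metadata, which is exactly the duplication that bugs me below.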

Except, maybe it’s the database expert in me, but I’d like it to be more normalized. My feed might have 1000 entries in it from 100 sources. Why would I want to dupe that information for every single entry from a given source? Is there no better way to do this, say to include the source data once and have each entry reference only the source ID? That would make for a much smaller feed, I expect, and a lot less duplication.

Is there any way to do this in an Atom feed?

Looking for the comments? Try the old layout.

Teasers Only Atom Feed

Select a feed

I’ve just added a new feed: teasers only. It makes things a lot shorter for those who just want to get a teaser for each blog entry, rather than complete entries, such as Planet Perl and Planet PostgreSQL.

Any questions or problems? Leave a comment. Thanks!



Has Google Forgotten its Own Tagline?

My friend Chad Dickerson, the exiting CTO of InfoWorld, has blogged about a recent move by Google to patent advertising in RSS!

Incorporating targeted ads into information in a syndicated, e.g., RSS, presentation format in an automated manner is described. Syndicated material e.g., corresponding to a news feed, search results or web logs, are combined with the output of an automated ad server. An automated ad server is used to provide keyword or content based targeted ads. The ads are incorporated directly into a syndicated feed, e.g., with individual ads becoming items within a particular channel of the feed.

This despite the fact that InfoWorld was itself sending targeted ads out in its RSS feeds before Google filed for its patent! Is this another one-click debacle in the making? Does it really make any sense to patent delivering targeted ads over HTTP just because they’re in XML instead of HTML?

What do you think?


On Making a Better Open-Source CMS

Jeffrey Veen posted some of his thoughts on the (dreary) state of open-source CMSs. So I just thought I’d comment on his thoughts. Keep your salt grains handy.

As a CMS developer, my point of view is quite naturally biased. However, I agree with some of Jeffrey’s points, and disagree with others. Well, not disagree so much as wish to qualify. Many of your points betray a certain perspective that does not (and, naturally, cannot) apply to anyone and everyone who is evaluating content management systems. So let me just try to address each of your points and how they relate to Bricolage.

Make it easy to install. Well, yes, of course, and I’ll be the first to admit that Bricolage is difficult to install. But your requirement that you be able to install it from the browser just isn’t feasible with a CMS that aims to scale to the needs of a large organization. The security implications alone give me the heebie-jeebies. It’s fine if you want to just manage your own personal site, but not if you’re aiming to serve the complex needs of the corporate marketplace.

That said, it might be reasonable to create a simple installer that’s useful for doing a local evaluation of a major CMS, one that doesn’t rely on an RDBMS and an Apache Web server installation. (RT, which is not a CMS, has been working on a simple executable that uses an embedded database and Web server for those who want to evaluate it. For Bricolage, we at one time had a KNOPPIX CD that one could use to try it out.) But for the rest, the best solution is probably an RPM, BSD Package, Debian package, or the like–something that can integrate the application into the base operating system.

Make it easy to get started. Bricolage is pretty bad about this, mostly because the default templates that come with it suck. That will change eventually, but the bigger issue is that when you have a complex, flexible application, it’s tricky to present a simple getting started configuration without locking the user into just using that configuration (witness all the identical Microsoft Home Page sites on the ‘Net to see what I mean). But that’s no excuse for a system like Bricolage–those who need the more advanced features could take advantage of them when they’re ready. It’s just a matter of finding the tuits (or the volunteer) to make it happen.

Write task-based documentation first. In my experience, most complex open-source applications that have task-based documentation got it when someone wrote a book about them. Yes, most of these systems have grown organically, and documentation gets written as volunteers make the time. But the best documentation I’ve found for open-source software has tended to be in published books, though I think that trend is gradually changing.

Separate the administration of the CMS from the editing and managing of content. In Bricolage, you do not have to switch accounts to have access to the administrative tools. And although the administrative tools are part of the same UI, they have an entirely different section and set of navigation menus. Users who don’t have permission to use those tools don’t notice them.

Users of a public web site should never–never–be presented with a way to log into the CMS. Amen, brotha.

Stop it with the jargon already. Finding good terminology is hard, hard work. Bricolage is broken in this respect in a few ways, but I’m thinking of replacing “jobs” with “fembots” in 2.0. What do you think?

But seriously, we try to match the terms to what is commonly used, such as “document”, “site”, “category”, “keyword”, “template”, etc. Other terms, such as “element” (the parts of a document are its elements) are well-integrated into the system, so that users pick up on it very quickly.

Why do you insist Web sites have columns? I’m in complete agreement with you here. In Bricolage, you can write templates to output any kind of content you want, in any format you want. If you want columns, fine, generate them from Bricolage. If you want a standards-compliant layout, generate it from Bricolage. You can have 1998-era tables with Flash and you can have RSS feeds. Do what you want, make it flexible or make it complex.

But note that the flexibility comes at the price of complexity. And try as we might to make Bricolage “The Macintosh of Content Management Systems,” as long as the definition of the term “Content Management System” is all over the map, commonalities of metaphors, interfaces, and, well, philosophies between CMSs will continue to be all over the map.

But that’s just my opinion.
