Just a Theory

By David E. Wheeler

Posts about Programming

The 411 Since Graduating from College

I recently got back in touch with a friend from college via Facebook. She asked me, “So David give me the 411? Whats been up with you for oh? 15 years?” Facebook’s Wall doesn’t seem to care much for multi-paragraph posts, but it kind of makes sense to post it in my blog anyway.

Julie and I moved to Florida in January, 1994 for a few months, and to Virginia the following summer. I started in the graduate program in the UVa Department of Anthropology in the fall. I also got on the internet that year and started learning how to program. We got married in May, 1995, in Orange, Virginia.

Two years later, I got my MA. Even though I was at UVa doing Near Eastern archaeology, my master’s paper was based on research in the American Southwest. That’s just the way things shook out. The paper was later rejected by an archaeology journal. The peer reviews were really offensive, one in particular; some of the old guard of Southwest archaeology were really threatened by it. It didn’t help that I’d dropped the research part of the article before submitting. I was advised to do so, but it was clearly a mistake. C’est la vie. I mostly found it humorous and typical that academics could be such dicks to a student submitting his first peer-reviewed paper.

I have a PDF of the paper I keep meaning to blog. I should do that one of these days.

I spent a summer on Cyprus excavating a medieval site and, in the summer of ’98, four weeks with my advisor in southeastern Turkey. Kurdistan, really. My focus was supposedly architecture and urbanization, but in truth I enjoyed creating a database app for the project much more than counting pottery sherds. I went into the Turkey trip thinking it would determine whether or not I stuck to archaeology. By this time I’d had a full-time job for about a year doing systems and integration programming for the UVa medical center. It was fun, engaging work, and although I enjoyed the academic side of graduate school (seminars and such), the culture of academia held no interest for me at all.

So I quit the program when I got back from Turkey. In 1999 we moved back to SF. I worked for UCSF for 9 months, then went to work for Salon.com. I was there a year, then went on my own, working on an open-source content management system called Bricolage that I’d developed with my colleagues at Salon. Life was great for us in SF. We moved into a loft in 2002 and really made the best of our time in The City.

In 2003 we were visiting Portland for a weekend just after Christmas and decided to have a real estate agent show us some properties to get a feel for the place. We’d been thinking about moving to Portland since ’96, and were still thinking maybe we’d do it in a couple more years. Julie’s dad had moved to Eugene, 2 hours down the road, so that was also a factor. To our surprise, we found a house we fell in love with. So we bought it, sold the loft, and moved to Portland, arriving in April, 2004. Our daughter, Anna, was born in May 2005.

And the rest is history. I’ve done a bunch of technology-related work over the last 10 years, mostly Perl and PostgreSQL programming. These days, I do PostgreSQL consulting as an associate in PostgreSQL Experts, some Bricolage consulting via my company, Kineticode, and have recently started a new venture with a friend to develop iPad apps.

Portland is a terrific place to live. We love it here. Not gritty like SF, but still with the elements of urban living. We have a house close to downtown and I get around mainly by bike. Anna is doing great; she’s so awesome. She’s in a Montessori school that we’ll likely keep her in through 8th grade.

Julie is doing well, too. At UVa she became Art Director for the University’s Capital Campaign, and started a business, Strongrrl, while in San Francisco, mainly focused on graphic design for universities and non-profits. Business has slowed in the last few years, alas, as print has been dying and budgets have become restricted. She still does a bit of work, but has also started sewing and opened an Etsy store (kind of empty at the moment, will be stocked in the next couple of weeks), and this year she has been doing deep genealogical research. We both work at home, but she does the lion’s share of the domestic and child-rearing duties. After 18 years together our relationship has deepened tremendously. We’re very happy together.

Anyway, life is good. I suppose if I were to write this again tomorrow I’d focus on a bunch of other things. A lot happens in 17 years, as you no doubt know. This is just a thin slice, with more academic stuff than I usually go into, but the context seemed to warrant it.

So what’s your 411?


Fuck Typing LWP

I’m working on a project that fetches various files from the Internet via LWP. I wanted to make sure that I was a polite user, such that my app would pay attention to Last-Modified/If-Modified-Since and ETag/If-None-Match headers. And in most contexts I also want to respect the robots.txt file on the hosts to which I’m sending requests. So I was very interested to read chromatic’s hack for this very issue. I happily implemented two classes for my app, MyApp::UA, which inherits from LWP::UserAgent::WithCache, and MyApp::UA::Robot, which inherits from MyApp::UA but changes LWP::UserAgent::WithCache to inherit from LWP::RobotUA:

@LWP::UserAgent::WithCache::ISA = ('LWP::RobotUA');

So far so good, right? Well, no. What I didn’t think about, stupidly, is that by changing LWP::UserAgent::WithCache’s base class, I was doing so globally. So now both MyApp::UA and MyApp::UA::Robot were getting the LWP::RobotUA behavior. Urk.
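To make the problem concrete, here is a minimal sketch with stand-in classes. None of these package names come from LWP; Plain, Polite, Cache, UA, and UA::Robot are hypothetical, with Cache playing the part of LWP::UserAgent::WithCache. It only illustrates that @ISA belongs to the package, not to whichever subclass happens to change it:

```perl
use 5.014;
use strict;
use warnings;

# All package names here are hypothetical stand-ins for the real classes.
package Plain  { sub new { bless {}, shift } sub kind { 'plain'  } }
package Polite { sub new { bless {}, shift } sub kind { 'polite' } }

package Cache     { our @ISA = ('Plain'); }   # plays LWP::UserAgent::WithCache
package UA        { our @ISA = ('Cache'); }   # plays MyApp::UA
package UA::Robot { our @ISA = ('UA');    }   # plays MyApp::UA::Robot

package main;

say UA->new->kind;          # plain

# The hack: repoint the shared parent class at a different base.
@Cache::ISA = ('Polite');

say UA::Robot->new->kind;   # polite, as intended
say UA->new->kind;          # polite as well, because the change is global
```

Every class that inherits through Cache sees the new base, which is exactly what bit me.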

So my workaround is to use a little fuck typing to ensure that MyApp::UA::Robot has the robot behavior but MyApp::UA does not. Here’s what it looks like (BEWARE: black magic ahead!):

package MyApp::UA::Robot;

use 5.12.0;
use utf8;
use parent 'MyApp::UA';
use LWP::RobotUA;

do {
    # Import the RobotUA interface. This way we get its behavior without
    # having to change LWP::UserAgent::WithCache's inheritance.
    no strict 'refs';
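    for my $k ( keys %{'LWP::RobotUA::'} ) {
        # Look each glob up by name rather than trusting the stash value,
        # which on newer Perls may be a plain code reference.
        my $code = *{"LWP::RobotUA::$k"}{CODE};
        *{$k} = $code if $code && $k ne 'new';
    }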
};

sub new {
    my ($class, $app) = (shift, shift);
    # Force RobotUA configuration.
    local @LWP::UserAgent::WithCache::ISA = ('LWP::RobotUA');
    return $class->SUPER::new(
        $app,
        delay => 1, # be very nice -- max one hit per minute.
    );
}

The do block is where I do the fuck typing. It iterates over all the symbols in LWP::RobotUA and inserts a reference to each subroutine into the current package, except for new, which I implement myself. This is so that I can keep my inheritance from MyApp::UA intact. But in order for it to properly configure the LWP::RobotUA interface, new must temporarily fool Perl into thinking that LWP::UserAgent::WithCache inherits from LWP::RobotUA.
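If you want to poke at the symbol-table trick without installing LWP, here is a self-contained sketch of the same idea. Donor and Recipient are hypothetical packages standing in for LWP::RobotUA and MyApp::UA::Robot, and the methods are made up for illustration:

```perl
use 5.014;
use strict;
use warnings;

# Hypothetical donor class standing in for LWP::RobotUA.
package Donor {
    sub new   { die "Donor::new should never be copied\n" }
    sub greet { 'hello from Donor' }
    sub shout { uc( $_[0]->greet ) }
}

# Hypothetical recipient standing in for MyApp::UA::Robot.
package Recipient {
    sub new { bless {}, shift }

    {
        no strict 'refs';
        for my $name ( keys %{'Donor::'} ) {
            next if $name eq 'new';                      # keep our own constructor
            my $code = *{"Donor::$name"}{CODE} or next;  # skip non-subroutines
            *{"Recipient::$name"} = $code;               # alias the sub into our package
        }
    }
}

package main;

my $obj = Recipient->new;
say $obj->greet;    # hello from Donor
say $obj->shout;    # HELLO FROM DONOR
say Recipient->isa('Donor') ? 'inherits' : 'does not inherit';   # does not inherit
```

Recipient ends up with Donor's methods but not its ancestry, so isa and SUPER know nothing about Donor.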

Pure evil, right? Wait, it gets worse. I’ve also overridden LWP::RobotUA’s host_wait method, because if it’s the second request to a given host, I don’t want it to sleep (the first request is for the robots.txt, and I see no reason to sleep after that). So I had to modify the do block to skip both new and host_wait:

    for my $k ( keys %{'LWP::RobotUA::'} ) {
        # Look each glob up by name; skip the methods implemented here.
        my $code = *{"LWP::RobotUA::$k"}{CODE};
        *{$k} = $code if $code && $k !~ /^(?:new|host_wait)$/;
    }

If I “override” any other LWP::RobotUA methods, I’ll need to remember to add them to that regex. Of course, since I’m not actually inheriting from LWP::RobotUA, in order to dispatch to its host_wait method, I can’t use SUPER, but must dispatch directly:

sub host_wait {
    my ($self, $netloc) = @_;
    # First visit is for robots.txt, so let it be free.
    return if !$netloc || $self->no_visits($netloc) < 2;
    $self->LWP::RobotUA::host_wait($netloc);
}

Ugly, right? Yes, I am an evil bastard. “Fuck typing” is right, yo! At least it’s all encapsulated.

This just reinforces chromatic’s message in my mind. I’d sure love to see LWP reworked to use roles!


Doomed To Reinvent

There’s an old saying, “Whoever doesn’t understand X is doomed to reinvent it.” X can stand for any number of things. The other day, I was pointing out that such is the case for ORM developers. Take ActiveRecord, for example. As I demonstrated in a 2007 presentation, because ActiveRecord doesn’t support simple things like aggregates or querying against functions or changing how objects are identified, you have to fall back on using its find_by_sql() method to actually run the SQL, or using fuck typing to force ActiveRecord to do what you want. There are only two ways to get around this: Abandon the ORM and just use SQL, or keep improving the ORM until it has, in effect, reinvented SQL. Which would you choose?

I was thinking about this as I was hacking on a Drupal installation for a client. The design spec called for the comment form to be styled in a very specific way, with image submit buttons. Drupal has this baroque interface for building forms: essentially an array of arrays. Each element of the array is a form element, unless it’s markup. Or something. I can’t really make heads or tails of it. What’s important is that there are a limited number of form elements you can create, and as of Drupal 5, *image* isn’t fucking one of them!

Now, as a software developer, I can understand this. I sometimes overlook a feature when implementing some code. But the trouble is: why have some bizarre data structure to represent a subset of HTML when you have something that already works? It’s called HTML. Drupal, it seems, is doomed to reinvent HTML.

So just as I have often had to use find_by_sql() as the fallback to get ActiveRecord to fetch the data I want, as opposed to what it thinks I want, I had to fall back on the Drupal form data structure’s ability to accept embedded HTML, like so:

$form['submit_stuff'] = array(
  '#weight' => 20,
  '#type'   => 'markup',
  '#value'  => '<div class="form-submits">'
              . '<label></label><p class="message">(Maximum 3000 characters)</p>'
              . '<div class="btns">'
              . '<input type="image" value="Preview comment" name="op" src="preview.png" />'
              . '<img width="1" height="23" src="divider.png" />'
              . '<input type="image" value="Post comment" name="op" src="post.png" />'
              . '</div></div>',
);

Dear god, why? I understand that you can create images using an array in Drupal 6, but I fail to understand why it was ever a problem. Just give me a templating environment where I can write the fucking HTML myself. Actually, Drupal already has one: it’s called PHP! Please don’t make me deal with this weird hierarchy of arrays; it’s just a bad reimplementation of a subset of HTML.

I expect that there actually is some way to get what I want, even in Drupal 5, as I’m doing some templating for comments and pages and whatnot. But that should be the default IMHO. The weird combining of code and markup into this hydra-headed data structure (and don’t even get me started on the need for the #weight key to get things where I want them) is just so unnecessary.

In short, if it ain’t broke, don’t reinvent it!

</rant>


Fuck Typing

chromatic’s post on Perl Roles reminded me that I’ve wanted for some time to blog about another kind of composition. I call it “fuck typing.” It’s kind of like duck typing, only not really. I would explain, but I think that my good friend, Mr. Vinnie Goombatz, will do a much better job. Although if you’re squeamish or easily offended, you might want to skip it.

How you doin’? Theory aksed me to talk about fuck typing. It’d be my fuckin’ pleasure.

You know how sometimes you’re hacking (I love that word, “hacking”) some piece-a shit code, and you’re using some cacasenno’s module, but it doesn’t quite do what you fuckin’ want it to do?

Here’s what you say, you say, “Oh, you don’t fuckin’ want to gimme a fuckin’ prosciutto method? You got a fuckin’ problem? ‘Cause you’re about to have a fuckin’ problem, know what I’m sayin’?”

I tellya what ya gonna do. You gonna fuckin’ open up that fuckin’ paisano’s module, right there, just fuckin’ cut it right open, and then you gonna fuckin’ shove a prosciutto method right into the module’s fuckin’ guts. “How do you like them apples, you fuckin’ piece of shit?”

And that’s what you do. You fuckin’ show him who’s boss, know what I’m sayin’? If you don’t get the fuckin’ interface you need, you fuck the module up until you get it. Ain’t no big fuckin’ deal. Nice doin’ biznizz wit’chou.

What’s surprising to me is how accepted this sort of bad behavior is in some communities. Oh, well, there are all kinds, I guess.
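For those who prefer code to Vinnie, here is a rough sketch of what he is describing, in Perl. The Deli module and its methods are entirely hypothetical; the point is only that code outside a package can install a new method into it at run time:

```perl
use 5.014;
use strict;
use warnings;

# Hypothetical third-party module that lacks the method we want.
package Deli {
    sub new   { bless { meats => ['salami'] }, shift }
    sub meats { @{ $_[0]{meats} } }
}

package main;

# Reach into Deli's namespace from outside and add a method to it.
{
    no strict 'refs';
    *{'Deli::prosciutto'} = sub {
        my $self = shift;
        push @{ $self->{meats} }, 'prosciutto';
        return $self;
    };
}

my $deli = Deli->new;
say join ', ', $deli->prosciutto->meats;   # salami, prosciutto
```

Every user of Deli in the same process now gets the new method too, whether they asked for it or not.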


My First C: A GTIN Data Type for PostgreSQL

After all of my recent experimentation creating UPC, EAN, and GTIN validation functions, I became interested in trying to create a GTIN PostgreSQL data type in C. The fact that I don’t know C didn’t stop me from learning enough to do some damage. And now I have a first implementation done. Check it out!

So how did I do this? Well, chapter six of the Douglas Book was a great help to get me started. I also learned what I could by reading the source code for the core and contributed PostgreSQL data types, as well as the EnumKit enumerated data type builder (download from here). And the denizens of the #postgresql channel on FreeNode were also extremely helpful. Thank you, guys!

I would be very grateful if the C hackers among you, and especially any PostgreSQL core hackers who happen to read my blog, would download the GTIN source code and have a look at it. This is the first C code I’ve written, so it would not surprise me if there were some gotchas that I missed (memory leaks, anyone?). And yes, I know that the new ISN contributed data types in the forthcoming 8.2 are a far more featureful implementation of bar code data types; I learned about them after I had nearly finished this first release of GTIN. But I did want to learn some C and how to create PostgreSQL data types, and provide the code for others to learn from, as well. It may also end up as the basis for an article. Stay tuned.

In the meantime, share and enjoy.

Update: I forgot to mention that you can check out the source code from the Kineticode Subversion repository.


Software Development Methodology

I feel that it’s important to have a comprehensive approach to software development. It’s not enough to be good at coding, or testing, or writing documentation. It’s far better to excel at managing every step of the development process in order to ensure the quality and consistency of the end-to-end work as well as of the final product. I aim to do just that in my work. Here I briefly outline my methodology for achieving that aim.

First, good software development starts with good planning and research. I strive to attain a thorough understanding of what I’m developing by listening to the people to whom it matters most: the users. By gaining insight into how people in the target market think about the problem space, and by strategizing about how technology can address that space, a picture of the product takes shape. This research coalesces into a set of pragmatic requirements and goals that balance the demands of a realistic development schedule with the needs and desires of the target market.

Once the requirements have been identified, it’s time for prototyping. Task flow diagrams of user interactions model the entire system. Evaluations from the target market refine these schematics, shaping the look and feel of the final product. I cannot emphasize enough the importance of seeking market feedback to build solid and meaningful metaphors into the design. These concepts drive the user experience and make or break the success of the final product. The outcome of this feedback loop will be a UI, terminology, and object design grounded in intuitive concepts, scalable technologies, and a reliable architecture.

Next, a talented development team must be assembled and backed by a dependable, project management-oriented implementation infrastructure. Team-building is crucial for the success of any product, and in software development, a diverse set of engineers and specialists with complementary talents must come together and work as an efficient whole. As a result, I consider it extremely important to create a working culture of which team members want to be a part. Such an environment doesn’t foster a sense of entitlement, but rather of conviviality and excitement. If team members believe in what they’re doing, and they enjoy doing it, then they’re likely to do it well.

And what they’ll do is actually create the software. Each element of the product design must be broken down into its basic parts, fit into a generalizable design, and built back up into meaningful objects. I further require detailed documentation of every interface and implementation, as well as thorough unit testing. In fact, the tests are often written before the interfaces are written, ensuring that they will work as expected throughout the remainder of the development process. All aspects of the application must be implemented according to a scalable, maintainable methodology that emphasizes consistency, quality, and efficiency.

The emphasis on quality naturally continues into the quality assurance phase of the development process. The feature set is locked so that development engineers can work closely with QA engineers to test edge conditions, identify bugs, fix them, and ensure that they remain fixed. I prefer to have QA engineers punish nightly builds with suites of tests while development engineers fix the problems identified by previous days’ tests. QA is considered complete when the product passes all the tests we can dream up.

And finally, once all of the QA issues have been addressed, the final product is delivered. Naturally, the process doesn’t stop there, but starts over – in fact, it likely has already started over. New features must be schematically tested with likely users, and new interfaces designed to implement them. The idea is to end up with a solid product that can grow with the needs of the target market.
