Just a Theory

By David E. Wheeler

Efficient Closest Word Algorithm

“Perl Best Practices” cover

I’ve been reading Perl Best Practices and have been making use of List::Util and List::MoreUtils as a result. I’m amazed that I never knew about these modules before. I mean, I kinda knew there were there, but hadn’t paid much attention before or bothered to find out how useful they are!

Anyway, a problem I’m currently working on is finding a word in a list of words that’s the closest match to another word. Text::Levenshtein appears to be a good method to determine relative closeness, but try as I might, I couldn’t make it work using first or min or apply or any of the utility list methods. I finally settled on this subroutine:

use Text::LevenshteinXS qw(distance);
sub _find_closest_word {
    my ($word, $closest) = (shift, shift);
    my $score = distance($word, $closest);
    for my $try_word (@_) {
        my $new_score = distance($word, $try_word);
        ($closest, $score) = ($try_word, $new_score)
            if $new_score < $score;
    }
    return $closest;
}

Am I missing something, or is this really the most obvious and efficient way to do it?

Looking for the comments? Try the old layout.