Just a Theory

By David E. Wheeler

Posts about grep

Stepped Series of Numbers in Perl

In working on a Perl validation function for GTINs (recipe here), I found a need to generate a series of numbers with a step of two. For example, in the series 1-10, I first want 1, 3, 5, 7, and 9, and then later 2, 4, 6, 8, and 10. Here’s how I went about creating those series in my GTIN function to create array slices:

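use List::Util qw(sum);

sub isa_gtin {
    # Reverse the digits so that index 0 is always the check digit.
    my @nums = reverse split q{}, shift;
    (
        # Digits at odd indexes are weighted by 3...
        sum( @nums[ grep {   $_ % 2  } 0..$#nums ] ) * 3
        # ...and digits at even indexes by 1.
        + sum( @nums[ grep { !($_ % 2) } 0..$#nums ] )
    ) % 10 == 0;
}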

But it seems wasteful to generate the series of numbers twice and to calculate whether they’re odd or even twice. Surely there’s a more efficient way to do this in Perl, perhaps even more expressive? Python seems to have a useful syntax for creating array slices that step. In Python, I’d do something like this:

sum( nums[1::2] ) * 3 + sum( nums[0::2] )

But barring such a slice feature in Perl, is there some cleaner way than the ugly grep approach I created to generate a stepped series?
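
One possible answer, offered as a minimal sketch rather than a definitive one: compute each stepped series directly with map instead of filtering the full range with grep. The stepped helper below is a hypothetical name, not a core function, and the sketch swaps sum for List::Util's sum0 so that an empty slice yields 0 instead of undef.

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# Hypothetical helper: returns $start, $start + $step, ... without
# going past $end. Assumes $step is a positive integer.
sub stepped {
    my ($start, $end, $step) = @_;
    return () if $start > $end;
    return map { $start + $_ * $step } 0 .. int( ($end - $start) / $step );
}

sub isa_gtin {
    # Reverse the digits so that index 0 is always the check digit.
    my @nums = reverse split q{}, shift;
    (
        sum0( @nums[ stepped(1, $#nums, 2) ] ) * 3
        + sum0( @nums[ stepped(0, $#nums, 2) ] )
    ) % 10 == 0;
}
```

With this in place, stepped(1, 10, 2) produces 1, 3, 5, 7, 9 and stepped(2, 10, 2) produces 2, 4, 6, 8, 10. No modulo test runs per element, though each series is still generated separately.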


Efficient Closest Word Algorithm

I’ve been reading Perl Best Practices and have been making use of List::Util and List::MoreUtils as a result. I’m amazed that I never knew about these modules before. I mean, I kinda knew they were there, but hadn’t paid much attention or bothered to find out how useful they are!

Anyway, a problem I’m currently working on is finding the word in a list of words that’s the closest match to another word. Text::Levenshtein appears to be a good way to determine relative closeness, but try as I might, I couldn’t make it work using first or min or apply or any of the other list utility functions. I finally settled on this subroutine:

use Text::LevenshteinXS qw(distance);
sub _find_closest_word {
    # The first candidate seeds the best match and its score.
    my ($word, $closest) = (shift, shift);
    my $score = distance($word, $closest);
    for my $try_word (@_) {
        my $new_score = distance($word, $try_word);
        # Replace the best match only on a strictly lower distance.
        ($closest, $score) = ($try_word, $new_score)
            if $new_score < $score;
    }
    return $closest;
}

Am I missing something, or is this really the most obvious and efficient way to do it?
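
For comparison, here is a sketch of the same search written with List::Util's reduce: pair each candidate with its score once, then keep whichever pair scores lower. To keep the example free of non-core dependencies, it assumes a pure-Perl stand-in for distance in place of Text::LevenshteinXS, and find_closest_word is a hypothetical variant that takes only the target word followed by the candidates.

```perl
use strict;
use warnings;
use List::Util qw(reduce min);

# Stand-in for Text::LevenshteinXS::distance (an assumption made so the
# sketch runs on core Perl): dynamic-programming edit distance.
sub distance {
    my ($s, $t) = @_;
    my @prev = 0 .. length $t;
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,          # deletion
                $cur[$j - 1] + 1,       # insertion
                $prev[$j - 1] + $cost,  # substitution
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Score each candidate once, then reduce to the lowest-scoring pair.
# On a tie the earlier candidate wins, as in the loop version.
sub find_closest_word {
    my $word = shift;
    my $best = reduce { $b->[1] < $a->[1] ? $b : $a }
               map    { [ $_, distance($word, $_) ] } @_;
    return $best ? $best->[0] : undef;
}
```

This still calls distance once per candidate, so it is no faster than the loop; whether it reads more clearly is a matter of taste. It also returns undef for an empty candidate list, which the loop version does not handle.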
