Posts about Perl

Accelerate Perl Github Workflows with Caching

I’ve spent quite a few evenings and weekends recently building out a comprehensive suite of GitHub Actions for Sqitch. They cover a dozen versions of Perl, nearly 70 database versions across nine database engines, plus a coverage test and a release workflow. A pull request can expect over 100 actions to run. Each build requires over 100 direct dependencies, plus all their dependencies. Installing them from scratch for every build would make any given run untenably slow.

Happily, GitHub Actions include a caching feature, and thanks to a recent improvement to shogo82148/actions-setup-perl, it’s quite easy to use in a version-independent way. Here’s an example:

name: Test
on: [push, pull_request]
jobs:
  OS:
    strategy:
      matrix:
        os: [ ubuntu, macos, windows ]
        perl: [ 'latest', '5.34', '5.32', '5.30', '5.28' ]
    name: Perl ${{ matrix.perl }} on ${{ matrix.os }}
    runs-on: ${{ matrix.os }}-latest
    steps:
      - name: Checkout Source
        uses: actions/checkout@v3
      - name: Setup Perl
        id: perl
        uses: shogo82148/actions-setup-perl@v1
        with: { perl-version: "${{ matrix.perl }}" }
      - name: Cache CPAN Modules
        uses: actions/cache@v3
        with:
          path: local
          key: perl-${{ steps.perl.outputs.perl-hash }}
      - name: Install Dependencies
        run: cpm install --verbose --show-build-log-on-failure --no-test --cpanfile cpanfile
      - name: Run Tests
        env: { PERL5LIB: "${{ github.workspace }}/local/lib/perl5" }
        run: prove -lrj4

This workflow tests every permutation of OS and Perl version specified in jobs.OS.strategy.matrix, resulting in 15 jobs. The runs-on value determines the OS, while the steps section defines steps for each permutation. Let’s take each step in turn:

  • “Checkout Source” checks the project out of GitHub. Pretty much required for any project.
  • “Setup Perl” sets up the version of Perl using the value from the matrix. Note the id key set to perl, used in the next step.
  • “Cache CPAN Modules” uses the cache action to cache the directory named local with the key perl-${{ steps.perl.outputs.perl-hash }}. The key lets us keep different versions of the local directory based on a unique key. Here we’ve used the perl-hash output from the perl step defined above. The actions-setup-perl action outputs this value, which contains a hash of the output of perl -V, so we’re tying the cache to a very specific version and build of Perl. This is important since compiled modules are not compatible across major versions of Perl.
  • “Install Dependencies” uses cpm to quickly install the Perl dependencies declared in the cpanfile (a sample follows this list). By default, it puts them into the local subdirectory of the current directory — just where we configured the cache. On the first run for a given OS and Perl version, it will install all the dependencies. But on subsequent runs it will find the dependencies already present, thanks to the cache, and quickly exit, reporting “All requirements are satisfied.” In this Sqitch job, it takes less than a second.
  • “Run Tests” runs the tests that require the dependencies. It requires the PERL5LIB environment variable to point to the location of our cached dependencies.
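
For reference, the cpanfile that cpm reads is itself just Perl: a list of dependency declarations. A minimal, purely illustrative sketch (the modules and versions here are hypothetical, not Sqitch’s actual requirements):

# cpanfile: declare direct dependencies for cpm to install into local/.
requires 'perl', '5.12.0';
requires 'DBI';
requires 'Moo', '2.000000';

on test => sub {
    requires 'Test::More', '0.94';
};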

That’s the whole deal. The first run will be the slowest, depending on the number of dependencies, but subsequent runs will be much faster, for as long as the cache persists (GitHub evicts caches that go unused for seven days). For a complex project like Sqitch, which uses the same OS and Perl version for most of its actions, this results in a tremendous build time savings. CI configurations we’ve used in the past often took an hour or more to run. Today, most builds take only a few minutes to test, with longer times determined not by dependency installation but by container and database latency.

Testing Perl Projects on Travis Windows

A few months ago, Travis CI announced early access for a Windows build environment. In the last couple of weeks, I spent some time figuring out how to test Perl projects there by installing Strawberry Perl from Chocolatey.

The result is the sample project winperl-travis. It demonstrates three .travis.yml configurations to test Perl projects on Windows:

  1. Use Windows instead of Linux to test multiple versions of Perl. This is the simplest configuration, but useful only for projects that never expect to run on a Unix-style OS.
  2. Add a Windows build stage that runs the tests against the latest version of Strawberry Perl. This pattern is ideal for projects that already test against multiple versions of Perl on Linux, and just want to make sure things work on Windows.
  3. Add a build stage that tests against multiple versions of Strawberry Perl in separate jobs.

See the results of each of the three approaches in the CI build. A peek:

winperl-travis CI build results

The default “Test” stage runs tests on two versions of Perl on Windows. The “Windows” stage tests a single version of Windows Perl, independent of the “Test” stage. And the “Strawberry” stage tests multiple versions of Windows Perl, also independent of the “Test” stage.

If, like me, you just want to validate that your Perl project builds and its tests pass on Windows (option 2), consider the formula I adopted in the text-markup project. The complete .travis.yml:

language: perl
perl:
  - "5.28"
  - "5.26"
  - "5.24"
  - "5.22"
  - "5.20"
  - "5.18"
  - "5.16"
  - "5.14"
  - "5.12"
  - "5.10"
  - "5.8"

before_install:
  - sudo pip install docutils
  - sudo apt-get install asciidoc
  - eval $(curl https://travis-perl.github.io/init) --auto

jobs:
  include:
    - stage: Windows
      os: windows
      language: shell
      before_install:
        - cinst -y strawberryperl
        - export "PATH=/c/Strawberry/perl/site/bin:/c/Strawberry/perl/bin:/c/Strawberry/c/bin:$PATH"
      install:
        - cpanm --notest --installdeps .
      script:
        - cpanm -v --test-only .

The file starts with the typical Travis Perl configuration: select the language (Perl) and the versions to test. The before_install block installs a couple of dependencies and executes the travis-perl helper for more flexible Perl testing. This pattern practically serves as boilerplate for new Perl projects.

The new bit is the jobs.include section, which declares a new build stage named “Windows”. This stage runs independently of the default phase, which runs on Linux, and declares os: windows to run on Windows.

The before_install step uses the pre-installed Chocolatey package manager to install the latest version of Strawberry Perl, then updates the $PATH environment variable to include the paths to Perl and the build tools. Note that the Travis CI Windows environment runs inside the Git Bash shell environment; hence the Unix-style path configuration.

The install phase installs all dependencies for the project via cpanminus, then the script phase runs the tests, again using cpanminus.

And with the stage set, the text-markup build has a nice new stage that ensures all tests pass on Windows.

The use of cpanminus, which ships with Strawberry Perl, keeps things simple, and is essential for installing dependencies. But projects can also perform the usual gmake test¹ or perl Build.PL && ./Build test dance. Install Dist::Zilla via cpanminus to manage dzil-based projects. Sadly, prove currently does not work under Git Bash.²

Perhaps Travis will add full Perl support and things will become even easier. In the meantime, I’m pleased that I no longer have to guess about Windows compatibility. The new Travis Windows environment enables a welcome increase in cross-platform confidence.


  1. Although versions of Strawberry Perl prior to 5.26 have trouble installing Makefile.PL-based modules, including dependencies. I spent a fair bit of time trying to work out how to make it work, but ran out of steam. See issue #1 for details. ↩︎

  2. I worked around this issue for Sqitch by simply adding a copy of prove to the repository. ↩︎

Adopt My Modules

Dear Perl Community,

Over the last 17 years, I’ve created, released, updated, and/or maintained a slew of Perl modules on CPAN. Recently my work has changed significantly, and I no longer have the time to properly care for them all. A few, like Pod::Simple and Plack::Middleware::MethodOverride, have co-maintainers, but most don’t. They deserve more love than I can currently provide. All, therefore, are up for adoption.

If you regularly use my modules, use a service that depends on them, or just like to contribute to the community, consider becoming a maintainer! Have a look at the list, and if you’d like to rescue an orphan module, hit me up via Twitter or email me at david at this domain.

Wanted: New SVN::Notify Maintainer

I’ve used Subversion very occasionally since 2009, and SVN::Notify not at all. Over the years, I’ve fixed minor issues with it now and then, and made a couple of releases to address issues fixed by others. But it’s past the point where I feel qualified to maintain it. Hell, the repository for SVN::Notify has been hosted on GitHub ever since 2011. I don’t have an instance of Subversion against which to test it; nor do I have any SMTP servers to throw test messages at.

In short, it’s past time I relinquished maintenance of this module to someone with a vested interest in its continued use. Is that you? Do you need to keep SVN::Notify running for your projects, and have a few TUITs to fix the occasional bug or security issue? If so, drop me a line (david @ this domain). I’d be happy to transfer the repository.

Please Test Pod::Simple 3.29_3

Pod Book

I’ve just pushed Pod-Simple 3.29_3 to CPAN. Karl Williamson did a lot of hacking on this release, finally adding support for EBCDIC. But as part of that work, and in coordination with Pod::Simple’s original author, Sean Burke, as well as pod-people, we have switched the default encoding from Latin-1 to CP-1252.

On the surface, that might sound like a big change, but in truth, it’s pretty straight-forward. CP-1252 is effectively a superset of Latin-1, repurposing 30 or so unused control characters from Latin-1. Those characters are pretty common on Windows (the home of the CP family of encodings), especially in pastes from Word. It’s nice to be able to pick those up essentially for free.
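
To see the difference concretely, here is a quick illustration (mine, not from the release notes) of bytes that are unassigned control characters in Latin-1 but curly quotes in CP-1252:

use v5.12;
use Encode qw(decode);
binmode STDOUT, ':encoding(UTF-8)';

# 0x93 and 0x94 are unused control characters in Latin-1, but CP-1252
# maps them to left and right curly quotation marks.
my $bytes = "\x93pasted from Word\x94";
say decode('cp1252', $bytes);   # “pasted from Word”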

Still, Karl’s done more than that. He also updated the encoding detection to do a better job at detecting UTF-8. This is the real default. Pod::Simple only falls back on CP1252 if there are no obvious UTF-8 byte sequences in your Pod.

Overall these changes should be a great improvement. Better encoding support is always a good idea. But it is a pretty significant change, including a change to the Pod spec. Hence the test release. Please make sure it works well with your code by installing it today:

cpan D/DW/DWHEELER/Pod-Simple-3.29_3.tar.gz
cpanm DWHEELER/Pod-Simple-3.29_3.tar.gz
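
Once installed, a quick way to exercise the new release against your own documentation is to run one of the bundled formatters over it. A minimal sketch, assuming a module with Pod at lib/My/Module.pm:

use v5.12;
use Pod::Simple::Text;

# Render the Pod to plain text; encoding detection happens during parsing.
my $parser = Pod::Simple::Text->new;
$parser->output_string(\my $text);
$parser->parse_file('lib/My/Module.pm');

binmode STDOUT, ':encoding(UTF-8)';
print $text;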

Oh, and one last thing: If Pod::Simple fails to properly recognize the encoding in your Pod file, you can always use the =encoding command early in your Pod file to make it explicit:

=encoding CP1254

Build Modern Perl RPMs with rpmcpan

iovation + Perl = Love

We’ve been using the CentOS Perl RPMs at iovation to run all of our Perl applications. This has been somewhat painful, because the version of Perl, 5.10.1, is quite old — it shipped in August 2009. In fact, it consists mostly of bug fixes against Perl 5.10.0, which shipped in December 2007! Many of the modules provided by CentOS core and EPEL are quite old, as well, and we had built up quite the collection of customized module RPMs managed by a massive spaghetti-coded Jenkins job. When we recently ran into a Unicode issue that would best have been addressed by running a more modern Perl — rather than a hinky workaround — I finally sat down and knocked out a way to get a solid set of Modern Perl and related CPAN RPMs.

I gave it the rather boring name rpmcpan, and now you can use it, too. Turns out, DevOps doesn’t myopically insist on using core RPMs in the name of some abstract idea about stability. Rather, we just need a way to easily deploy our stuff as RPMs. If the same applies to your organization, you can get Modern Perl RPMs, too.

Here’s how we do it. We have a new Jenkins job that runs both nightly and whenever the rpmcpan Git repository updates. It uses the MetaCPAN API to build the latest versions of everything we need. Here’s how to get it to build the latest version of Perl, 5.20.1:

./bin/rpmcpan --version 5.20.1

That will get you a nice, modern Perl RPM, named perl520, completely encapsulated in /usr/local/perl520. Want 5.18 instead? Just change the version:

./bin/rpmcpan --version 5.18.2

That will give you perl518. But that’s not all. You want to build CPAN distributions against that version. Easy. Just edit the dists.json file. Its contents are a JSON object where the keys name CPAN distributions (not modules), and the values are objects that customize how our RPMs get built. Most of the time, the objects can be empty:

{
    "Try-Tiny": {}
}

This results in an RPM named perl520-Try-Tiny (or perl518-Try-Tiny, etc.). Sometimes you might need additional information to customize the RPM spec file generated to build the distribution. For example, since this is Linux, we need to exclude a Win32 dependency in the Encode-Locale distribution:

{
    "Encode-Locale": { "exclude_requires": ["Win32::Console"] }
}

Other distributions might require additional RPMs or environment variables, like DBD-Pg, which requires the PostgreSQL RPMs:

{
    "DBD-Pg": {
        "build_requires": ["postgresql93-devel", "postgresql93"],
        "environment": { "POSTGRES_HOME": "/usr/pgsql-9.3" }
    }
}

See the README for a complete list of customization options. Or just get started with our dists.json file, which so far builds the bare minimum we need for one of our Perl apps. Add new distributions? Send a pull request! We’ll be doing so as we integrate more of our Perl apps with Modern Perl and leave the sad RPM past behind.

Localize Your Perl Apps with this One Weird Trick

Nota Bene: This is a republication of a post that originally appeared in the 2013 Perl Advent Calendar.


These days, gettext is far and away the most widely-used localization (l10n) and internationalization (i18n) library for open-source software. So far, it has not been widely used in the Perl community, even though it’s the most flexible, capable, and easy-to-use solution, thanks to Locale::TextDomain.¹ How easy? Let’s get started!

Module Internationale

First, just use Locale::TextDomain. Say you’re creating an awesome new module, Awesome::Module. Its CPAN distribution will be named Awesome-Module, so that’s the “domain” to use for its localizations. Just let Locale::TextDomain know:

use Locale::TextDomain 'Awesome-Module';

Locale::TextDomain will later use this string to look for the appropriate translation catalogs. But don’t worry about that just yet. Instead, start using it to translate user-visible strings in your code. With the assistance of Locale::TextDomain’s comprehensive documentation, you’ll find it second nature to internationalize your modules in no time. For example, simple strings are denoted with __:

say __ 'Greetings puny human!';

If you need to specify variables, use __x:

say __x(
   'Thank you {sir}, may I have another?',
   sir => $username,
);

Need to manage plurals? Use __n:

say __n(
    'I will not buy this record, it is scratched.',
    'I will not buy these records, they are scratched.',
    $num_records
);

If $num_records is 1, the first phrase will be used. Otherwise the second.

Sometimes you gotta do both, mix variables and plurals. __nx has got you covered there:

say __nx(
    'One item has been grokked.',
    '{count} items have been grokked.',
    $num_items,
    count => $num_items,
);

Congratulations! Your module is now internationalized. Wasn’t that easy? Make a habit of using these functions in all the modules in your distribution, always with the Awesome-Module domain, and you’ll be set.

Encode da Code

Locale::TextDomain is great, but it dates from a time when Perl character encoding was, shall we say, sub-optimal. It therefore took it upon itself to try to do the right thing, which is to detect the locale from the runtime environment and automatically encode as appropriate. Which might work okay if all you ever do is print localized messages — and never anything else.

If, on the other hand, you will be manipulating localized strings in your code, or emitting unlocalized text (such as that provided by the user or read from a database), then it’s probably best to coerce Locale::TextDomain to return Perl strings, rather than encoded bytes. There’s no formal interface for this in Locale::TextDomain, so we have to hack it a bit: set the $OUTPUT_CHARSET environment variable to “UTF-8” and then bind a filter. Don’t know what that means? Me neither. Just put this code somewhere in your distribution where it will always run early, before anything gets localized:

use Locale::Messages qw(bind_textdomain_filter);
use Encode;
BEGIN {
    $ENV{OUTPUT_CHARSET} = 'UTF-8';
    bind_textdomain_filter 'Awesome-Module' => \&Encode::decode_utf8, Encode::FB_DEFAULT;
}

You only have to do this once per domain. So even if you use Locale::TextDomain with the Awesome-Module domain in a bunch of your modules, the presence of this code in a single early-loading module ensures that strings will always be returned as Perl strings by the localization functions.

Environmental Safety

So what about output? There’s one more bit of boilerplate you’ll need to throw in. Or rather, put this into the main package that uses your modules to begin with, such as the command-line script the user invokes to run an application.

First, on the shebang line, follow Tom Christiansen’s advice and put -CAS in it (or set the $PERL_UNICODE environment variable to AS). Then use the POSIX setlocale function to set the appropriate locale for the runtime environment. How? Like this:

#!/usr/bin/perl -CAS

use v5.12;
use warnings;
use utf8;
use POSIX qw(setlocale);
BEGIN {
    if ($^O eq 'MSWin32') {
        require Win32::Locale;
        setlocale POSIX::LC_ALL, Win32::Locale::get_locale();
    } else {
        setlocale POSIX::LC_ALL, '';
    }
}

use Awesome::Module;

Locale::TextDomain will notice the locale and select the appropriate translation catalog at runtime.

Is that All There Is?

Now what? Well, you could do nothing. Ship your code and those internationalized phrases will be handled just like any other string in your code.

But what’s the point of that? The real goal is to get these things translated. There are two parts to that process:

  1. Parsing the internationalized strings from your modules and creating language-specific translation catalogs, or “PO files”, for translators to edit (a sample entry appears after this list). These catalogs should be maintained in your source code repository.

  2. Compiling the PO files into binary files, or “MO files”, and distributing them with your modules. These files should not be maintained in your source code repository.
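
Each entry in a PO file pairs an extracted string with its translation. An illustrative sketch (the source reference and the French rendering here are hypothetical):

#: lib/Awesome/Module.pm:42
msgid "Greetings puny human!"
msgstr "Salutations, chétif humain !"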

Until a year ago, there was no Perl-native way to manage these processes. Locale::TextDomain ships with a sample Makefile demonstrating the appropriate use of the GNU gettext command-line tools, but that seemed a steep price for a Perl hacker to pay.

A better fit for the Perl hacker’s brain, I thought, is Dist::Zilla. So I wrote Dist::Zilla::LocaleTextDomain to encapsulate the use of the gettext utilities. Here’s how it works.

First, configure Dist::Zilla to compile localization catalogs for distribution by adding these lines to your dist.ini file:

[ShareDir]
[LocaleTextDomain]

There are configuration attributes for the LocaleTextDomain plugin, such as where to find the PO files and where to put the compiled MO files. In case you didn’t use your distribution name as your localization domain in your modules, for example:

use Locale::TextDomain 'com.example.perl-libawesome';

Then you’d set the textdomain attribute so that the LocaleTextDomain plugin can find the translation catalogs:

[LocaleTextDomain]
textdomain = com.example.perl-libawesome

Check out the configuration docs for details on all available attributes.

At this point, the plugin doesn’t do much, because there are no translation catalogs yet. You might see this line from dzil build, though:

[LocaleTextDomain] Skipping language compilation: directory po does not exist

Let’s give it something to do!

Locale Motion

To add a French translation file, use the msg-init command:²

% dzil msg-init fr
Created po/fr.po.

The msg-init command uses the GNU gettext utilities to scan your Perl source code and initialize the French catalog, po/fr.po. This file is now ready for translation! Commit it into your source code repository so your agile-minded French-speaking friends can find it. Use msg-init to create as many language files as you like:

% dzil msg-init de ja.JIS en_US.UTF-8 en_UK.UTF-8
Created po/de.po.
Created po/ja.po.
Created po/en_US.po.
Created po/en_UK.po.

Each language has its own PO file. You can even have region-specific catalogs, such as the en_US and en_UK variants here. Each time a catalog is updated, the changes should be committed to the repository, like code. This allows the latest translations to always be available for compilation and distribution. The output from dzil build now looks something like:

po/de.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/fr.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/ja.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/en_US.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/en_UK.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.

The resulting MO files will be in the shared directory of your distribution:

% find Awesome-Module-0.01/share -type f
Awesome-Module-0.01/share/LocaleData/de/LC_MESSAGES/Awesome-Module.mo
Awesome-Module-0.01/share/LocaleData/en_UK/LC_MESSAGES/Awesome-Module.mo
Awesome-Module-0.01/share/LocaleData/en_US/LC_MESSAGES/Awesome-Module.mo
Awesome-Module-0.01/share/LocaleData/fr/LC_MESSAGES/Awesome-Module.mo
Awesome-Module-0.01/share/LocaleData/ja/LC_MESSAGES/Awesome-Module.mo

From here Module::Build or ExtUtils::MakeMaker will install these MO files with the rest of your distribution, right where Locale::TextDomain can find them at runtime. The PO files, on the other hand, won’t be used at all, so you might as well exclude them from the distribution. Add this line to your MANIFEST.SKIP to prevent the po directory and its contents from being included in the distribution:

^po/

Mergers and Acquisitions

Of course no code base is static. In all likelihood, you’ll change your code — and end up adding, editing, and removing localizable strings as a result. You’ll need to periodically merge these changes into all of your translation catalogs so that your translators can make the corresponding updates. That’s what the msg-merge command is for:

% dzil msg-merge
extracting gettext strings
Merging gettext strings into po/de.po
Merging gettext strings into po/en_UK.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

This command re-scans your Perl code and updates all of the language files. Old messages will be commented-out and new ones added. Commit the changes and give your translators a holler so they can keep the awesome going.

Template Scan

The msg-init and msg-merge commands don’t actually scan your source code. Sort of lied about that. Sorry. What they actually do is merge a template file into the appropriate catalog files. If this template file does not already exist, a temporary one will be created and discarded when the initialization or merging is done.

But projects commonly maintain a permanent template file, stored in the source code repository along with the translation catalogs. For this purpose, we have the msg-scan command. Use it to create or update the template, or POT file:

% dzil msg-scan
extracting gettext strings into po/Awesome-Module.pot

From here on in, the resulting .pot file will be used by msg-init and msg-merge instead of scanning your code all over again. But keep in mind that, if you do maintain a POT file, future merges will be a two-step process: First run msg-scan to update the POT file, then msg-merge to merge its changes into the PO files:

% dzil msg-scan
extracting gettext strings into po/Awesome-Module.pot
% dzil msg-merge
Merging gettext strings into po/de.po
Merging gettext strings into po/en_UK.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

Lost in Translation

One more thing, a note for translators. They can, of course, also use msg-scan and msg-merge to update the catalogs they’re working on. But how do they test their translations? Easy: use the msg-compile command to compile a single catalog:

% dzil msg-compile po/fr.po
[LocaleTextDomain] po/fr.po: 195 translated messages.

The resulting compiled catalog will be saved to the LocaleData subdirectory of the current directory, so it’s easily available to your app for testing. Just be sure to tell Perl to include the current directory in the search path, and set the $LC_ALL environment variable for your language. For example, here’s how I test the Sqitch French catalog:

% dzil msg-compile po/fr.po              
[LocaleTextDomain] po/fr.po: 148 translated messages, 36 fuzzy translations, 27 untranslated messages.
% LC_ALL=fr perl -Ilib -CAS -I. bin/sqitch foo
"foo" n'est pas une commande valide

Just be sure to delete the LocaleData directory when you’re done — or at least don’t commit it to the repository.

TL;DR

This may seem like a lot of steps, and it is. But once you have the basics in place — configuring the Dist::Zilla::LocaleTextDomain plugin, setting up the “textdomain filter”, and setting the locale in the application — there are just a few habits to get into:

  • Use the functions __, __x, __n, and __nx to internationalize user-visible strings
  • Run msg-scan and msg-merge to keep the catalogs up-to-date
  • Keep your translators in the loop.

The Dist::Zilla::LocaleTextDomain plugin will do the rest.

Update 2019-01-31: Added required malformed data configuration argument to the call to bind_textdomain_filter required by this change in Encode v2.99.


  1. What about Locale::Maketext, you ask? It has not, alas, withstood the test of time. For details, see Nikolai Prokoschenko’s epic 2009 polemic, “On the state of i18n in Perl.” See also Steffen Winkler’s presentation, Internationalisierungs-Framework auswählen (and the English translation by Aristotle Pagaltzis), from German Perl Workshop 2010. ↩︎

  2. The msg-init function — like all of the dzil msg-* commands — uses the GNU gettext utilities under the hood. You’ll need a reasonably modern version in your path, or else it won’t work. ↩︎

Lexical Subroutines

Ricardo Signes:

One of the big new experimental features in Perl 5.18.0 is lexical subroutines. In other words, you can write this:

my sub quickly { … }
my @sorted = sort quickly @list;

my sub greppy (&@) { … }
my @grepped = greppy { … } @input;

These two examples show cases where lexical references to anonymous subroutines would not have worked. The first argument to sort must be a block or a subroutine name, which leads to awful code like this:

sort { $subref->($a, $b) } @list

With our greppy, above, we get to benefit from the parser-affecting behaviors of subroutine prototypes.

My favorite tidbit about this feature? Because lexical subs are lexical, and method-dispatch is package-based, lexical subs are not subject to method lookup and dispatch! This just might alleviate the confusion of methods and subs, as chromatic complained about just yesterday. Probably doesn’t solve the problem for imported subs, though.
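
Here is a minimal sketch of that dispatch behavior (my example, not Ricardo’s), with the experimental feature enabled as Perl 5.18 requires:

use v5.18;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

package Greeter {
    my sub hello { 'hello from a lexical sub' }
    sub new   { bless {}, shift }
    sub greet { hello() }    # a plain sub call resolves to the lexical sub
}

my $g = Greeter->new;
say $g->greet;                                    # hello from a lexical sub
say $g->can('hello') ? 'found' : 'not a method';  # not a method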

TPF To Revamp Grants

Alberto Simões:

Nevertheless, this lack of “lower than $3000” grant proposals, and the fact that lot of people have been discussing (and complaining) about this value being too low, the Grants Committee is starting a discussion on rewriting and reorganizing the way it works. Namely, in my personal blog I opened a discussion about the Grants Committee some time ago, and had plenty of feedback, that will be helpful for our internal discussion.

This is great news. I would love to see more and more ambitious grant proposals, as well as awards people can subsist on. I look forward to seeing the new rules.

Mopping the Moose

Stevan Little:

I spent much of last week on vacation with the family so very little actual coding got done on the p5-mop, but instead I did a lot of thinking. My next major goal for the p5-mop is to port a module written in Moose, in particular, one that uses many different Moose features. The module I have chosen to port is Bread::Board and I chose it for two reasons; first, it was the first real module that I wrote using Moose and second, it makes heavy use of a lot of Moose’s features.

I’m so happy to see Stevan making progress on the Perl 5 MOP again.

A Perl Blog

I have been unsatisfied with Just a Theory for some time. I started that blog in 2004 more or less for fun, thinking it would be my permanent home on the internet. And it has been. But the design, while okay in 2004, is just awful by today’s standards. A redesign is something I have planned to do for quite some time.

I had also been thinking about my audience. Or rather, audiences. I’ve blogged about many things, but while a few dear family members might want to read everything I ever post, most folks, I think, are interested in only a subset of topics. Readers of Just a Theory came for posts about Perl, or PostgreSQL, or culture, travel, or politics. But few came for all those topics, in my estimation.

More recently, a whole bunch of top-level domains have opened up, often with the opportunity for anyone to register them. I was lucky enough to snag theory.pm and theory.pl, thinking that perhaps I would create a site just for blogging about Perl. I also nabbed theory.so, which I might dedicate to database-related blogging, and theory.me, which would be my personal blog (travel, photography, cultural essays, etc.).

And then there is Octopress. A blogging engine for hackers. Perfect for me. Hard to imagine something more appropriate (unless it was written in Perl). It seemed like a good opportunity to partition my online blogging.

So here we are with my first partition. theory.pm is a Perl blog. Seemed like the perfect name. I fiddled with it off and on for a few months, often following Matt Gemmell’s Advice, and I’m really happy with it. The open-source fonts Source Sans Pro and Source Code Pro, from Adobe, look great. The source code examples are beautifully marked up and displayed using the Solarized color scheme (though presentation varies in feed readers). Better still, it’s equally attractive and readable on computers, tablets and phones, thanks to the foundation laid by Aron Cedercrantz’s BlogTheme.

I expect to fork this code to create a database blog soon, and then perhaps put together a personal blog. Maybe the personal blog will provide link posts for posts on the other sites, so that if anyone really wants to read everything, they can. I haven’t decided yet.

In the meantime, now that I have a dedicated Perl blog, I guess I’ll have to start writing more Perl-related stuff. I’m starting with some posts about the state of exception handling in Perl 5, the first of which is already up. Stay tuned for more.

Update 2018-05-23: I decided that a separate Perl blog wasn’t a great idea after all, and relaunched Just a Theory.

Trying Times

Exception handling is a bit of a pain in Perl. Traditionally, we use eval {}:

eval {
    foo();
};
if (my $err = $@) {
    # Inspect $err…
}

The use of the if block is a bit unfortunate; worse is the use of the global $@ variable, which has inflicted unwarranted pain on developers over the years.¹ Many Perl hackers put Try::Tiny to work to circumvent these shortcomings:

try {
    foo();
} catch {
    # Inspect $_…
};

Alas, Try::Tiny introduces its own idiosyncrasies, particularly its use of subroutine references rather than blocks. While a necessity of a pure-Perl implementation, it prevents returning from the calling context. One must work around this deficiency by checking return values:

my $rv = try {
   f();
} catch {
   # …
};

if (!$rv) {
   return;
}

I can’t tell you how often this quirk burns me.

Sadly, there is a deeper problem than syntax: Just what, exactly, is an exception? How does one determine the exceptional condition, and what can be done about it? It might be a string. The string might be localized. It might be an Exception::Class object, or a Throwable object, or a simple array reference. Or any other value a Perl scalar can hold. This lack of specificity requires careful handling of exceptions:

if (my $err = $@) {
    if (ref $err) {
        if (eval { $err->isa('Exception::Class::Base') }) {
            if ( $err->isa('SomeException') ) {
                # …
            } elsif ( $err->isa('SomeOtherException') ) {
                # …
            } else {
                # …
            }
        } elsif (eval { $err->DOES('Throwable') }) {
            # …
        } elsif ( ref $err eq 'ARRAY') {
            # …
        }
    } else {
        if ( $err =~ /DBI/ ) {
            # …
        } elsif ( $err =~ /cannot open '([^']+)'/ ) {
            # …
        }
    }
}

Not every exception handler requires so many conditions, but I have certainly exercised all these approaches. Usually my exception handlers accrete conditions as users report new, unexpected errors.

That’s not all. My code frequently requires parsing information out of a string error. Here’s an example from PGXN::Manager:

try {
    $self->distmeta(decode_json scalar $member->contents );
} catch {
    my $f = quotemeta __FILE__;
    (my $err = $_) =~ s/\s+at\s+$f.+//ms;
    $self->error([
        'Cannot parse JSON from “[_1]”: [_2]',
        $member->fileName,
        $err
    ]);
    return;
} or return;

return $self;

When JSON throws an exception on invalid JSON, the code must catch that exception to show the user. The user cares not at all what file threw the exception, nor the line number. The code must strip that stuff out before passing the original message off to a localizing error method.
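
For instance, the raw exception from the JSON parser looks something like this (an illustrative example; the exact wording varies by parser version, and the path is hypothetical):

malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "not json...") at lib/PGXN/Manager/Distribution.pm line 112.

The substitution above strips everything from “ at <file>” onward, leaving just the parser’s complaint for display.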

Gross.

It’s time to end this. A forthcoming post will propose a plan for adding proper exception handling to the core Perl language, including exception objects and an official try/catch syntax.


  1. In fairness, much of the $@ pain has been addressed in Perl 5.14. ↩︎

Sqitch on Windows (and Linux, Solaris, and OS X)

Thanks to the hard-working hamsters at the ActiveState PPM Index, Sqitch is available for installation on Windows. According to the Sqitch PPM Build Status, the latest version is now available for installation. All you have to do is:

  1. Download and install ActivePerl
  2. Open the Command Prompt
  3. Type ppm install App-Sqitch

As of this writing, only PostgreSQL is supported, so you will need to install PostgreSQL.

But otherwise, that’s it. In fact, this incantation works for any OS that ActivePerl supports. Here’s where you can find the sqitch executable on each:

  • Windows: C:\perl\site\bin\sqitch.bat
  • Mac OS X: ~/Library/ActivePerl-5.16/site/bin/sqitch (Or /usr/local/ActivePerl-5.16/site/bin if you run sudo ppm)
  • Linux: /opt/ActivePerl-5.16/site/bin/sqitch
  • Solaris/SPARC (Business edition-only): /opt/ActivePerl-5.16/site/bin/sqitch

This makes it easy to get started with Sqitch on any of those platforms without having to become a Perl expert. So go for it, and then get started with the tutorial!


Dist::Zilla::LocaleTextDomain for Translators

Here’s a followup on my post about localizing Perl modules with Locale::TextDomain. Dist::Zilla::LocaleTextDomain was great for developers, less so for translators. A Sqitch translator asked how to test the translation file he was working on. My only reply was to compile the whole module, then install it and test it. Ugh.

Today, I released Dist::Zilla::LocaleTextDomain v0.85 with a new command, msg-compile. This command allows translators to easily compile just the file they’re working on and nothing else. For pure Perl modules in particular, it’s then pretty easy to test the results. By default, the compiled catalog goes into ./LocaleData, where convincing the module to find it is simple. For example, I updated the test sqitch app to take advantage of this. Now, to test, say, the French translation file, all the translator has to do is:

> dzil msg-compile po/fr.po
[LocaleTextDomain] po/fr.po: 155 translated messages, 24 fuzzy translations, 16 untranslated messages.

> LANGUAGE=fr ./t/sqitch foo
"foo" n'est pas une commande valide

I hope this simplifies things for translators. See the notes for translators for a few more words on the subject.


Localize Your Perl modules with Locale::TextDomain and Dist::Zilla

I’ve just released Dist::Zilla::LocaleTextDomain v0.80 to the CPAN. This module adds support for managing Locale::TextDomain-based localization and internationalization in your CPAN libraries. I wanted to make it as simple as possible for CPAN developers to do localization and to support translators in their projects, and Dist::Zilla seemed like the perfect place to do it, since it has hooks to generate the necessary binary files for distribution.

Starting out with Locale::TextDomain was decidedly non-intuitive for me, as a Perl hacker, likely because of its gettext underpinnings. Now that I’ve got a grip on it and created the Dist::Zilla support, I think it’s pretty straight-forward. To demonstrate, I wrote the following brief tutorial, which constitutes the main documentation for the Dist::Zilla::LocaleTextDomain distribution. I hope it makes it easier for you to get started localizing your Perl libraries.

Localize Your Perl modules with Locale::TextDomain and Dist::Zilla

Locale::TextDomain provides a nice interface for localizing your Perl applications. The tools for managing translations, however, are a bit arcane. Fortunately, you can just use this plugin and get all the tools you need to scan your Perl libraries for localizable strings, create a language template, initialize translation files, and keep them up-to-date. All this assumes that your system has the gettext utilities installed.

The Details

I put off learning how to use Locale::TextDomain for quite a while because, while the gettext tools are great for translators, the tools for the developer were a little more opaque, especially for Perlers used to Locale::Maketext. But I put in the effort while hacking Sqitch. As I had hoped, using it in my code was easy. Using it for my distribution was harder, so I decided to write Dist::Zilla::LocaleTextDomain to make life simpler for developers who manage their distributions with Dist::Zilla.

What follows is a quick tutorial on using Locale::TextDomain in your code and managing it with Dist::Zilla::LocaleTextDomain.

This is my domain

First thing to do is to start using Locale::TextDomain in your code. Load it into each module with the name of your distribution, as set by the name attribute in your dist.ini file. For example, if your dist.ini looks something like this:

name    = My-GreatApp
author  = Homer Simpson <homer@example.com>
license = Perl_5

Then, in your Perl libraries, load Locale::TextDomain like this:

use Locale::TextDomain qw(My-GreatApp);

Locale::TextDomain uses this value to find localization catalogs, so naturally Dist::Zilla::LocaleTextDomain will use it to put those catalogs in the right place.

Okay, so it’s loaded, how do you use it? The documentation of the Locale::TextDomain exported functions is quite comprehensive, and I think you’ll find it pretty simple once you get used to it. For example, simple strings are denoted with __:

say __ 'Hello';

If you need to specify variables, use __x:

say __x(
    'You selected the color {color}',
    color => $color
);

Need to deal with plurals? Use __n:

say __n(
    'One file has been deleted',
    'All files have been deleted',
    $num_files,
);

And then you can mix variables with plurals with __nx:

say __nx(
    'One file has been deleted.',
    '{count} files have been deleted.',
    $num_files,
    count => $num_files,
);

Pretty simple, right? Get to know these functions, and just make it a habit to use them in user-visible messages in your code. Even if you never expect to translate those messages, just by doing this you make it easier for someone else to come along and start translating for you.

The setup

Now you’re localizing your code. Great! What’s next? Officially, nothing. If you never do anything else, your code will always emit the messages as written. You can ship it and things will work just as if you had never done any localization.

But what’s the fun in that? Let’s set things up so that translation catalogs will be built and distributed once they’re written. Add these lines to your dist.ini:

[ShareDir]
[LocaleTextDomain]

There are actually quite a few attributes you can set here to tell the plugin where to find language files and where to put them. For example, if you used a domain different from your distribution name, e.g.,

use Locale::TextDomain 'com.example.My-GreatApp';

Then you would need to set the textdomain attribute so that the LocaleTextDomain plugin does the right thing with the language files:

[LocaleTextDomain]
textdomain = com.example.My-GreatApp

Consult the LocaleTextDomain configuration docs for details on all available attributes.

(Special note until this Locale::TextDomain patch is merged: set the share_dir attribute to lib instead of the default value, share. If you use Module::Build, you will also need a subclass to do the right thing with the catalog files; see “Installation” in Dist::Zilla::Plugin::LocaleTextDomain for details.)

What does including the plugin do? Mostly nothing. You might see this line from dzil build, though:

[LocaleTextDomain] Skipping language compilation: directory po does not exist

Now at least you know it was looking for something to compile for distribution. Let’s give it something to find.

Initialize languages

To add translation files, use the msg-init command:

> dzil msg-init de
Created po/de.po.

At this point, the gettext utilities will need to be installed and visible in your path, or else you’ll get errors.

This command scans all of the Perl modules gathered by Dist::Zilla and initializes a German translation file, named po/de.po. This file is now ready for your German-speaking public to start translating. Check it into your source code repository so they can find it. Create as many language files as you like:

> dzil msg-init fr ja.JIS en_US.UTF-8
Created po/fr.po.
Created po/ja.po.
Created po/en_US.po.

As you can see, each language results in the generation of the appropriate file in the po directory, sans encoding (i.e., no .UTF-8 in the en_US file name).

Now let your translators go wild with all the languages they speak, as well as the regional dialects. (Don’t forget to colour your code with en_UK translations!)

Once you have translations and they’re committed to your repository, when you build your distribution, the language files will automatically be compiled into binary catalogs. You’ll see this line output from dzil build:

[LocaleTextDomain] Compiling language files in po
po/de.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/fr.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/ja.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/en_US.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.

You’ll then find the catalogs in the shared directory of your distribution:

> find My-GreatApp-0.01/share -type f
My-GreatApp-0.01/share/LocaleData/de/LC_MESSAGES/My-GreatApp.mo
My-GreatApp-0.01/share/LocaleData/en_US/LC_MESSAGES/My-GreatApp.mo
My-GreatApp-0.01/share/LocaleData/fr/LC_MESSAGES/My-GreatApp.mo
My-GreatApp-0.01/share/LocaleData/ja/LC_MESSAGES/My-GreatApp.mo

These binary catalogs will be installed as part of the distribution just where Locale::TextDomain can find them.

Here’s an optional tweak: add this line to your MANIFEST.SKIP:

^po/

This prevents the po directory and its contents from being included in the distribution. Sure, you can include them if you like, but they’re not required for the running of your app; the generated binary catalog files are all you need. Might as well leave out the translation files.

Mergers and acquisitions

You’ve got translation files, and helpful translators have given them a workover. What happens when you change your code, add new messages, or modify existing ones? The translation files need to be updated periodically with those changes, so that your translators can deal with them. We’ve got you covered with the msg-merge command:

> dzil msg-merge
extracting gettext strings
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

This will scan your module files again and update all of the translation files with any changes. Old messages will be commented-out and new ones added. Just commit the changes to your repository and notify the translation army that they’ve got more work to do.

If for some reason you need to update only a subset of language files, you can simply list them on the command-line:

> dzil msg-merge po/de.po po/en_US.po
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po

What’s the scan, man

Both the msg-init and msg-merge commands depend on a translation template file to create and merge language files. Thus far, this has been invisible: they will create a temporary template file to do their work, and then delete it when they’re done.

However, it’s common to also store the template file in your repository and to manage it directly, rather than implicitly. If you’d like to do this, the msg-scan command will scan the Perl module files gathered by Dist::Zilla and make it for you:

> dzil msg-scan
extracting gettext strings into po/My-GreatApp.pot

The resulting .pot file will then be used by msg-init and msg-merge rather than scanning your code all over again. This makes msg-merge a two-step process: you need to update the template before merging. Updating the template is done by exactly the same command, msg-scan:

> dzil msg-scan
extracting gettext strings into po/My-GreatApp.pot
> dzil msg-merge
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

Ship It!

And that’s all there is to it. Go forth and localize and internationalize your Perl apps!

Acknowledgements

My thanks to Ricardo Signes for invaluable help plugging in to Dist::Zilla, to Guido Flohr for providing feedback on this tutorial and being open to my pull requests, to David Golden for I/O capturing help, and to Jérôme Quelin for his patience as I wrote code to do the same thing as Dist::Zilla::Plugin::LocaleMsgfmt without ever noticing that it already existed.


Use of DBI in Sqitch

Sqitch uses the native database client applications (psql, sqlite3, mysql, etc.). So for tracking metadata about the state of deployments, I have been trying to stick to using them. I’m first targeting PostgreSQL, and as a result need to open a connection to psql, start a transaction, and be able to read and write stuff to it as migrations go along. The IPC is a huge PITA. Furthermore, getting things properly quoted is also pretty annoying — and it will be worse for SQLite and MySQL, I expect (psql’s --set support is pretty slick).

If, on the other hand, I used the DBI, all this would be very easy. There is no IPC, just a direct connection to the database. It would save me a ton of time during development, and be more robust and safer to use (e.g., exception handling rather than platform-dependent signal handling (or not, in the case of Windows)). I am quite tempted to just do that.
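
For comparison, here is roughly what the DBI route looks like. A sketch only, with a hypothetical events table rather than Sqitch’s actual metadata schema:

use v5.12;
use DBI;

# A direct connection: no client subprocess to manage, no IPC.
my $dbh = DBI->connect('dbi:Pg:dbname=myapp', undef, undef, {
    RaiseError => 1,
    AutoCommit => 0,
});

# Record a deployment event in the same transaction as the change itself.
$dbh->do(
    'INSERT INTO deploy_events (change_name, deployed_at) VALUES (?, NOW())',
    undef, 'add_users_table',
);
$dbh->commit;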

However, I have been trying to be sensitive to dependencies. I had planned to make Sqitch simple to install on any system, and if you had the command-line client for your preferred database, it would just work. If I used the DBI instead, then Sqitch would not work at all unless you installed the appropriate DBI driver for your database of choice. This is no big deal for Perl people, of course, but I don’t want this to be a Perl people tool. I want it to be dead simple for anyone to use for any database. Ideally, there will be RPMs and Ubuntu packages, so one can just install it and go, and not have to worry about figuring out what additional Perl DBD to install for your database of choice. It should be transparent.

That is still my goal, but at this point the IPC requirements for controlling the clients is driving me a little crazy. Should I just give up and use the DBI (at least for now)? Or persevere with the IPC stuff and get it to work? Opinions wanted!
