I’ve used Subversion very occasionally since 2009, and SVN::Notify not at
all. Over the years, I’ve fixed minor issues with it now and then, and made
a couple of releases to address issues fixed by others. But it’s past the
point where I feel qualified to maintain it. Hell, the repository for
SVN::Notify has been hosted on GitHub ever since 2011. I don’t have an
instance of Subversion against which to test it; nor do I have any SMTP
servers to throw test messages at.
In short, it’s past time I relinquished maintenance of this module to someone
with a vested interest in its continued use. Is that you? Do you need to keep
SVN::Notify running for your projects, and have a few TUITs to fix the
occasional bug or security issue? If so, drop me a line (david @ this domain).
I’d be happy to transfer the repository.
I’ve kept my various Perl modules in a Subversion server run by my Bricolage
support company, Kineticode, for many years. However, I’m having to shut down
the server I’ve used for all my services, including Subversion, so I’ve moved
them all to GitHub. As such, I no longer use Subversion in my day-to-day work.
It no longer seems appropriate that I maintain SVN::Notify. This has probably
been my most popular module, and I know that it’s used a lot. It’s also
relatively stable, with few bug reports or complaints. Nevertheless, there
certainly could be some things that folks want to add, like TLS support,
I18N, and inline CSS.
Therefore, SVN::Notify is formally up for adoption. If you’re a Subversion
user, it’s a great tool. Just look at this sample output. If you’d like to
take over maintenance and make it even better, please get in touch. Leave a comment
on this post, or @theory me on Twitter, or send an email.
PS: Would love it if someone also could take over activitymail, the CVS
notification script from which SVN::Notify was derived — and which I have even
less right to maintain, given that I haven’t used CVS in years.
Just a quick followup on the completion of the Bricolage Git migration last
week: today I completed writing up a set of GitHub wiki documents explaining
to my fellow Bricoleurs how to start hacking. The most important bits are:
Working with Git, explaining how to get set up with a forked Bricolage
repository
Contributing a Bug Fix, an intro to the Git way of doing things (as far as
I understand it)
Creating a Release, in which the fine art of branching, tagging, and
releasing is covered
If you’re familiar with the “Git way,” I would greatly appreciate your feedback
on these documents; corrections and comments are welcome.
I also just wanted to say that the process of reconstructing the merge history
from CVS and Subversion was quite an eye-opener for me. Not because it was
difficult (it was) and required a number of hacks (it did), but because it
highlighted just how much better a fit Git is for the way in which we do Open
Source software development. Hell, probably closed-source, too, for that matter.
I no longer will have to think about what revisions to include in a merge, or
create a branch just to “tag” a merge. Hell, I’ll probably be doing merges a
hell of a lot more often, just because it’s so easy, the history remains intact,
and everything just stays more up-to-date and closely integrated.
But I also really appreciate the project-based emphasis of Git. A Subversion
repository, I now realize, is really very much like a versioned file system.
That means where things go is completely ad-hoc, or convention-driven at best.
And god forbid if you decide to change the convention and move stuff around!
It’s just so much more sane to get a project repository, with all of the
history, branches, tags, merges, and everything else, all in one package. It’s
more portable, it’s a hell of a lot faster (ever tried to check out a Subversion
repository with 80 tags?), and just tighter. It encourages modularization,
which can only be good. I’ll tell you, I expect to have some frustrations and
challenges as I learn more about using Git, but I’m already very much happier
with the overall philosophy.
Enough evangelizing. As a last statement on this, I’ve uploaded the Perl scripts
I wrote to do this migration, just in case someone else finds them useful:
bric_to_git migrated Subversion from r5517 to Git.
stitch stitched the CVS-migrated Git repository into the
Subversion-migrated Git repository for a final product.
It turned out that there were a few files lost in the conversion, which I
didn’t notice until after all was said and done, but overall I’m very happy. My
thanks again to Ask and the denizens of #git for all the help.
Now that I’ve successfully migrated the old Bricolage SourceForge CVS
repository to Git, and also migrated Subversion to Git, it’s time to stitch
the two repositories together into one with all history intact. I’m glad to say
that figuring out how to do so took substantially less time than the first two
steps, thanks in large part to the great help from “doener,” “Ilari,” and
“Fissure” on the Freenode #git channel.
Actually, they helped me with a bit more tweaking of my CVS and Subversion
conversions. One thing I realized after writing yesterday’s post was that, after running git filter-branch, I had twice as many
commits as I should have had. It turns out that git filter-branch rewrites all
commits, but keeps the old ones around in case you mess something up. doener
also pointed out that I wasn’t having all grafts properly applied, because
git filter-branch only applies to the currently checked-out branch. To get all
of the branches, he suggested that I read the git-filter-branch documentation,
where I’ll find that git filter-branch --tag-name-filter cat -- --all would
hit all branches. Actually, such was not clear to me from the documentation, but
I took his word for it. Once I did that, to get rid of the dupes, all I had to
do was git clone the repository to a new repository. And that was that.
This worked great for my CVS migration, but I realized that I also wanted to
clean out metadata from the Subversion migration. Of course, git clone throws
out most of the metadata, but git svn also stores some metadata at the end of
every commit log message, like this:
This had been very handy as I looked through commits in GitX to find parents to
set up for grafts, but with that done and everything grafted, I no longer needed
it. Ilari helped me to figure out how to properly use git filter-branch to get
rid of those. To do it, all I had to do was add a filter for commit messages,
like so:
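As a rough sketch of what such a message filter has to do (the helper name and the sample message are illustrative assumptions, not the filter actually used): git filter-branch hands each commit message to the --msg-filter command on standard input and takes the rewritten message from standard output, so the filter only needs to drop the git-svn-id line.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the "git-svn-id:" line that git svn appends to
# each commit message, then tidy up the trailing blank lines it leaves behind.
sub strip_svn_id {
    my $msg = shift;
    $msg =~ s/^git-svn-id:[^\n]*\n?//mg;
    $msg =~ s/\s+\z/\n/;
    return $msg;
}

# Illustrative message; the URL, revision, and UUID are made up.
my $msg = "Fixed a typo.\n\n"
        . "git-svn-id: file:///path/to/svn/trunk\@8581 00000000-0000-0000-0000-000000000000\n";
print strip_svn_id($msg);
```

Wrapped in a small script that reads the message from STDIN, something like this could be passed to git filter-branch via --msg-filter, alongside the --tag-name-filter cat -- --all arguments mentioned earlier.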
This properly strips out that ugly bit of metadata and finalizes the grafts all
at the same time. Very nice.
Now it was time to combine these two repositories for a single unified
history. I wasn’t able to find a good tutorial for this on the web, other than
one that used a third-party Debian utility and only hooked up the master
branch, using a bogus intermediary commit to do it. On the other hand, simply
copying the pack files, as mentioned in the Git Wiki–and demonstrated by the
scripts linked from there–also appeared to be suboptimal: The new commits were
not showing up in GitX! And besides, Ilari said, “just copying packs might not
suffice. There can also be loose objects.” Well, we can’t have that, can we?
Ilari suggested git-fetch, the documentation for which says that it will
“download objects and refs from another repository.” Perfect! I wanted to copy
the objects from my CVS migration to the Subversion migration.
My first attempt failed: some commits showed up, but not others. Ilari pointed
out that it wouldn’t copy remote branches unless you asked it to do so, via
“refspecs.” Since I’d cloned the repositories to get rid of the duplicate
commits created by git filter-branch, all of my lovingly recreated local
branches were now remote branches. Actually, this is what I want for the final
repository, so I just had to figure out how to copy them. What I came up with
was this:
It took me a while to figure out the proper incantation for referencing and
creating remote branches. Once I got the refs/remotes part figured out, I
found that the master, rev_1_6, and rev_1_8 branches from CVS were
overwriting the Subversion branches with the same names. What I really needed
was to have the CVS branches grafted as parents to the Subversion branches. The
#git channel again came to my rescue, where Fissure suggested that I rename
those branches when importing them, do the grafts, and then drop the renamed
branches. Hence the line above that adds “-cvs” to the names of those branches.
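To illustrate the renaming trick, here is a minimal sketch of how the refspecs for such a fetch could be built. It assumes that both clones keep their branches under refs/remotes/origin; the helper name, the extra rev_1_4 branch, and the clone path are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: build one refspec per CVS branch. Branches whose names
# also exist in the Subversion repository get "-cvs" appended on the way in,
# so they can't overwrite their namesakes.
sub cvs_refspecs {
    my ($branches, $conflicts) = @_;
    my %conflict = map { $_ => 1 } @{ $conflicts };
    return map {
        my $dest = $conflict{$_} ? "$_-cvs" : $_;
        "refs/remotes/origin/$_:refs/remotes/origin/$dest";
    } @{ $branches };
}

my @refspecs = cvs_refspecs(
    [qw(master rev_1_6 rev_1_8 rev_1_4)],   # rev_1_4 is a made-up example
    [qw(master rev_1_6 rev_1_8)],           # names that clash with Subversion
);
print "$_\n" for @refspecs;

# The list would then be handed to git fetch, run from the Subversion clone:
#   system qw(git fetch), '/path/to/cvs-clone', @refspecs;
```

Each refspec has the form source:destination, so only the destination side carries the “-cvs” suffix.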
Once the branches were imported, I simply looked for the earliest commits to
those branches in Subversion and mapped them to the latest commits to the same
branches in CVS, then wrote their SHA1 IDs to .git/info/grafts, like so:
open my $fh, '>', ".git/info/grafts" or die "Cannot open grafts: $!\n";
print $fh
    '77a35487f18d68b96d294facc1f1a41745ad914c '
    => "835ff47ee1e3d1bf228b8d0976fbebe3c7f02ae6\n", # rev_1_6
    '97ef646f5c2a7c6f47c2046c8d289c1dfc30a73d '
    => "2b9f3c5979d062614ef54afd0a01631f746fa3cb\n", # rev_1_8
    'b3b2e7f53d789bea962fe8047e119148e28865c0 '
    => "8414b64a6a434b2117294c0568c1012a17bc863b\n", # master
;
close $fh;
With the branches all imported and the grafts created, I simply had to run
git filter-branch to make them permanent and drop the temporary CVS branches:
Now I had a complete repository, but with duplicate commits left over by
git-filter-branch. To get rid of those, I need to clone the repository. But
before I clone, I need the remote branches to be local branches, so that the
clone will see them as remotes. For this, I wrote the following function:
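A rough sketch of such a function, under assumptions: the remote is named origin, the input is the text printed by git branch -r, and the helper name is invented for the example. It only works out which local branches to create; the git branch call itself is shown in a comment.

```perl
use strict;
use warnings;

# Hypothetical helper: given the lines printed by `git branch -r`, return
# [local name, remote ref] pairs for the branches that need local copies.
sub branches_to_localize {
    my @pairs;
    for my $line (@_) {
        (my $ref = $line) =~ s/^\s+|\s+$//g;   # trim whitespace and newline
        next if $ref =~ /\s->\s/;              # "origin/HEAD -> origin/master"
        (my $name = $ref) =~ s{^origin/}{};
        next if $name eq 'master' || $name eq 'HEAD';
        push @pairs, [ $name, $ref ];
    }
    return @pairs;
}

# Sample output; in real use this would be: my @remote = `git branch -r`;
my @remote = (
    "  origin/HEAD -> origin/master\n",
    "  origin/master\n",
    "  origin/rev_1_8\n",
);
for my $pair (branches_to_localize(@remote)) {
    my ($name, $ref) = @{ $pair };
    print "git branch $name $ref\n";   # or: system qw(git branch), $name, $ref;
}
```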
It’s important to skip the master and HEAD branches, as they’ll automatically be
created by git clone. So then I call the function and run git gc to take
out the trash, and then clone:
It’s important to use the file:/// URL to clone so as to get a real clone;
just pointing to the directory instead makes hard links.
Now that I had the final repository with all history intact, I was ready to
push it to GitHub! Well, almost ready. First I needed to make the branches local
again, and then see if I could get the repository size down a bit:
And that’s it! My new Bricolage Git repository is complete, and I’ve now pushed
it up to its new home on GitHub. I pushed it like this:
git push origin --all
git push origin --tags
Damn I’m glad that’s done! I’ll be getting the Subversion repository set to
read-only next, and then writing some documentation for my fellow Bricoleurs on
how to work with Git. For those of you who already know, fork and enjoy!
Following up on last week’s post on migrating the old Bricolage SourceForge
CVS repository to Git, here are my notes on migrating the current Bricolage
Subversion repository to Git.
It turns out that migrating from Subversion is much more of a pain than
migrating from CVS. Why? Because CVS has real tags, while Subversion does not.
So while git-svn tries to identify all of your tags and branches, it’s really
relying on your Subversion repository using standard directories for all of your
branches and tags. And while we’ve used a standard directory for branches, our
tags setup is a bit more complicated.
The problem was that we used tags every time we merged between branches. This
meant that we ended up with a lot of tags with names like
“merge_rev_1_10_5665” to indicate a merge from the “rev_1_10” branch into
trunk at r5665. Plus we had tags for releases. So Marshall took it upon
himself to reorganize the tags in the Subversion tree so that all release tags
went into the “releases” subdirectory, and merges went into subdirectories named
for the branch from which the merge derived. Those subdirectories went into the
“merges” subdirectory. We ended up with a directory structure organized like
this:
This was useful for keeping things organized in Subversion, so that we could
easily find a tag for a previous merge in order to determine the revisions to
specify for a new merge. But because older tags were moved from previous
locations, and because newer tags were in subdirectories of the “tags”
directory, git-svn did not identify them as tags. Well, that’s not really
fair. It did identify earlier tags, before they were moved, but all the other
tags were not found. Instead I ended up with tags in Git named tags/releases
and tags/merges, which was useless. But even if all of our tags had been
identified as tags, none had parent commit IDs, so there was no place to see
where they actually came from.
So to rebuild the commit, release, and merge history from Subversion, I first
created a local copy of the Subversion repository using svnsync. Then I cloned
it to Git like so:
By starting with r5517, which was the first real commit to Subversion, I avoided
the git-svn error I reported last week. In truth, though, I ended up running
this clone many, many times. The first few times, I ran it with
--no-metadata, as recommended in various HOWTOs. But then I kept getting
errors such as:
git svn log
fatal: bad default revision 'refs/remotes/git-svn'
----------------------------------------------------
This was more than a little annoying, and it took me a day or so to realize that
this was because I had been using --no-metadata. Once I killed off that
option, things worked much better.
Furthermore, by starting at r5517 and passing the --no-follow-parent option,
git-svn ran much more quickly. Rather than taking 30 hours to get all
revisions including stuff that had been moved around (and then failing), it now
took around 90 minutes to do the export. Much more manageable, although I also
started making backup copies and restoring from them as I experimented with
fixing branches and tags. Ultimately, I ended up also passing the
--ignore-paths option, to exclude various branches that were never really used
or that I had already fetched in their entirety from CVS:
The call to svn2git converts remote branches to local tags and branches. Now I
had a reasonably clean copy of the repository (aside from the 120 or so commits
from when Marshall did the tags reorganization) for me to work with. I opened it
up with GitX and started scripting out merges.
To assist in this, I took a hint from Ask Bjørn Hansen, sent in email in
response to a Tweet, and tagged every single commit with its corresponding
Subversion revision number, like so (in Perl):
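A rough sketch of the idea follows, not the original listing. It assumes the clone was made without --no-metadata, so that every commit message still ends with a git-svn-id line; the helper name and the sample message are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the Subversion revision number out of the
# "git-svn-id: <url>@<rev> <uuid>" line in a commit message.
sub svn_rev_for {
    my $message = shift;
    return $message =~ /^git-svn-id:\s+\S+\@(\d+)\s+\S+\s*$/m ? $1 : undef;
}

# Illustrative commit message; the URL and UUID are made up.
my $message = "Merged from rev_1_8.\n\n"
            . "git-svn-id: file:///path/to/svn/trunk\@5524 00000000-0000-0000-0000-000000000000\n";
my $rev = svn_rev_for($message);
print "would tag this commit as $rev\n" if defined $rev;

# For each commit SHA1 and its message, the tagging itself might then be:
#   system qw(git tag), $rev, $sha1;
```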
The nice thing about this is that it made it easy for me to scan through the
commits in GitX and see where things were. It also meant that I could reference
these tags when I wrote the code to manage the merges. So what I did was sort
the commits in reverse chronological order, and then search for those with the
word “merge” in their subjects. When one was clearly for a merge (as opposed to
simply using the word “merge”), I would disable the search, scroll through the
commits until I found the selected commit, and then look for a likely prior
commit that it merged from.
This was a bit of a pain in the ass, because, unfortunately, GitX doesn’t keep the
selected commit record in the middle of the screen when you cancel the search.
Mail.app does this right: If I do a search, select a message, then cancel the
search, the selected message is still in the middle of the screen. But with
GitX, as I said, I have to scroll to find it. This wasn’t going to scale very
well. So what I did instead was search for “merge”, then I took a screenshot of
the results and cancelled the search. Then I just opened the screenshot in
Preview, looked at the records there, then found them in GitX. This made things
go quite a bit faster.
As a result, I added a migration function to properly tag merges. It looked like
this:
By referencing revision tags explicitly, I was able to just use git rev-parse
to look up SHA1 hash IDs to put into .git/info/grafts. This saved me the
headache of dealing with very long IDs, but also allowed me to easily keep track
of revision numbers and branches (the branch information is actually superfluous
here, but I kept it for my sanity). So, basically, for
[qw( trunk@5524 rev_1_8@5523 )], it ends up writing the SHA1 hashes for r5524,
the existing parent commit for r5524 (that’s the $commit^ bit), and for the
new parent, r5523. I ended up with 73 merges that needed to be properly
recorded.
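To make the bookkeeping concrete, here is a minimal sketch of how one merge spec could be turned into a line for .git/info/grafts. The function name, the resolver callback, and the stand-in SHA1s are assumptions for illustration; a real resolver would shell out to git rev-parse as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: build the graft line for one merge. $resolve is a code
# reference that turns a revision expression into a SHA1. The spec has the
# form [ 'to_branch@rev', 'from_branch@rev' ].
sub merge_graft {
    my ($resolve, $spec) = @_;
    my $to_rev   = (split /[@]/, $spec->[0])[1];
    my $from_rev = (split /[@]/, $spec->[1])[1];
    return join(' ',
        $resolve->($to_rev),        # the merge commit itself
        $resolve->("$to_rev^"),     # its existing parent
        $resolve->($from_rev),      # the new, second parent
    ) . "\n";
}

# Stand-in resolver so the sketch runs outside a repository. In real use:
#   my $resolve = sub { my $sha = `git rev-parse $_[0]`; chomp $sha; $sha };
my %fake = ('5524' => 'a' x 40, '5524^' => 'b' x 40, '5523' => 'c' x 40);
print merge_graft(sub { $fake{ $_[0] } }, [qw( trunk@5524 rev_1_8@5523 )]);
```

The branch names in the spec are ignored by the lookup, which matches the remark that they are only there for readability.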
With the merges done, I next dove into branches. For some reason, git-svn
failed to identify a parent commit for any branch. Maybe because I started
with r5517? I have no idea. So I had to search through the commits to see when
branches were started. I mainly did this by looking at the branches in ViewVC.
By clicking each one, I was able to see the earliest commit, which usually had a
name like “Created a branch for my SoC project.” I would then look up that
commit in ViewVC, such as r7423, which started the “dev_ajax” branch, just to
make sure that it was copied from trunk. Then I simply went into GitX, found
r7423, then looked back to the last commit to trunk before r7423. That was the
parent of the branch. With such data, I was able to write a function like this:
Here I only needed to look up the revision and its parent and write it to
.git/info/grafts. Then all of my branches had parents. Or nearly all of them;
those that were also in the old CVS repository will have to wait until the two
are stitched together to find their parents.
Next I needed to get releases properly tagged. This was not unlike the merge tag
work: I just had to find the proper revision and tag it. This time, I looked
through the commits in GitX for those with “tag for” in their subjects because,
conveniently, I nearly always used this phrase in a release tag, as in “Tag for
the 1.8.11 release of Bricolage.” Then I just looked back from the tag commit to
find the commit copied to the tag, and that commit would be tagged with the
release tag. The function to create the tags looked like this:
sub tag_releases {
    print "Tagging releases\n";
    for my $spec (
        [ 'rev_1_8@5726' => 'v1.8.1' ],
        [ 'rev_1_8@5922' => 'v1.8.2' ],
        [ 'rev_1_8@6073' => 'v1.8.3' ],
    ) {
        my ($where, $tag) = @{ $spec };
        my ($branch, $rev) = split /[@]/, $where;
        my $tag_date = `git show --pretty=format:%cd -s $rev`;
        chomp $tag_date;
        local $ENV{GIT_COMMITTER_DATE} = $tag_date;
        system qw(git tag -fa), $tag, '-m', "Tag for $tag release of Bricolage.", $rev;
    }
}
I am again indebted to Ask for the code here, especially to
set the date for the tag.
Since I had created new release tags and recreated the merge history in Git, I
no longer needed the old tags from Subversion, so next I rewrote the
--ignore-paths option to exclude all of the tags directories, as well as some
branches that were never used:
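The --ignore-paths option takes a Perl regular expression, so one way to keep it manageable is to build it from a list of directories. The sketch below is only an illustration: the helper name and directory names are invented, and exactly how paths look to git svn depends on the URL being cloned, so treat the anchoring as an assumption to verify.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regular expression for git svn's --ignore-paths
# option from a list of directories to skip.
sub ignore_paths_regex {
    my @dirs = map { quotemeta } @_;
    return '^(?:' . join('|', @dirs) . ')(?:/|$)';
}

# Made-up directory names, assuming a bricolage/{tags,branches} layout.
my $regex = ignore_paths_regex(qw(
    bricolage/tags
    bricolage/branches/never_used
));
print "--ignore-paths=$regex\n";

# Sanity check against a couple of sample paths.
for my $path (qw(bricolage/tags/releases/1.8.1/README bricolage/trunk/README)) {
    printf "%-40s %s\n", $path, $path =~ /$regex/ ? 'ignored' : 'kept';
}
```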
With this in hand, I killed off the call to svn2git, opting to convert trunk
and the remote branches myself (easily done by copying-and-pasting the relevant
Perl code). Then all I needed to do was clean up the extant tags and run
git-filter-branch to make the grafts permanent:
sub finish {
    print "Deleting old tags\n";
    my @tags = grep m{^tags/}, map { s/^\s+//; s/\s+$//; $_ } `git branch -a`;
    system qw(git branch -r -D), $_ for @tags;

    print "Deleting revision tags\n";
    my @tags_to_delete = grep { /^\d+$/ } map { s/^\s+//; s/\s+$//; $_ } `git tag`;
    system qw(git tag -d), $_ for @tags_to_delete;

    print "Grafting...\n";
    system qw(git filter-branch);
    system qw(git gc);
}
And now I have a nicely organized Git repository based on the Bricolage
Subversion repository, with all (or most) merges in their proper places, release
tags, and branch tracking. Now all I have to do is stitch it together with the
repository based on CVS and I’ll be ready to put this sucker on GitHub!
More on that in my next post.
Following a discussion on the Bricolage developers mail list, I started down
the path last week of migrating the Bricolage Subversion repository to Git. This
turned out to be much more work than I expected, but to the benefit of the
project, I think. Since I had a lot of questions about how to do certain things
and how Git thinks about certain things, I wanted to record what I worked out
here over the course of a few entries. Maybe it will help you manage your
migration to Git.
The first thing I tried to do was use git-svn to migrate Bricolage to Git. I
pointed it to the root directory and let it rip. I immediately saw that it
noticed that the root was originally at the root of the repository, rather than
the “bricolage” subdirectory, and so followed that path and started pulling
stuff down. In a separate terminal window, I was watching the branches build up,
and there were a lot of them, many named like:
David
David@5248
David@584
tags/Release_1_2_1
tags/Release_1_2_1@5249
tags/Release_1_2_1@577
Although many of those branches and tags hadn’t been used since the beginning of
time, and certainly not since Bricolage was moved to Subversion from its
original home in SourceForge CVS, because Subversion has no real concept of
branches or tags, git-svn was duly copying them all, including the separate
histories for each. Yow.
I could have dealt with that, renaming things, deleting others, and grafting
where appropriate (more on grafting in a minute), but then I got this error from
git-svn:
bricolage/branches/rev_1_8/lib/Bric/App/ApacheConfig.pm was not
found in commit e5145931069a511e98a087d4cb1a8bb75f43f899 (r5256)
This was annoying, especially since the file clearly does exist in that
commit:
svn list -r5256 http://svn.bricolage.cc/bricolage/branches/rev_1_8/lib/Bric/App/ApacheConfig.pm
ApacheConfig.pm
I posted to the Git mail list about this issue, but unfortunately got no
reply. Given that it was taking around 30 hours(!) to get to that point (and
about 18 hours once I started using a local copy of the Subversion repository,
thanks to a suggestion from Ask Bjørn Hansen), I started thinking about how to
simplify things a bit.
Since most of the moving stuff around happened immediately after the move to
Subversion, and before we started committing working code to the repository, it
occurred to me that I could probably go back to the original Bricolage CVS
Repository on SourceForge, migrate that to Git, and then just
migrate from Subversion starting from the first real commit there. Then I could
just stitch the two repositories together.
From CVS to Git
Thanks to advice from IRC, I used cvs2git to build a repository from a dump
from CVS. Apparently, git cvsimport makes a lot of mistakes, while cvs2git
does a decent job keeping branches and tags where they should be. It’s also
pretty fast; once I set up its configuration and ran it, it took only around 5
minutes for it to build import files for git fast-import. It also has some
nice features to rename symbols (tags), ignore tags, assign authors, etc. I’m
aware of no tool to migrate Subversion to Git that does the same thing.
Once I had my dump, I started writing a script to import it into Git. The basic
import looks like this:
I used svn2git to convert remote branches to local tags and branches. The
--no-clone option is what keeps it from doing the Subversion stuff; everything
else is the same for a new conversion from CVS. I also had to run
git reset --hard to throw out uncommitted local changes. What changes? I’m not
sure where they came from, but after the last commit is imported from CVS, all
of the local files in the master branch are deleted, but that change is not
committed. Strange, but by doing a hard reset, I reverted that change with no
harm done.
Next, I started looking at the repository in GitX, which provides a decent
graphical interface for browsing around a Git repository on Mac OS X. There I
discovered that a major benefit to importing from CVS rather than Subversion is
that, because CVS has real tags, those tags are properly migrated to Git. What
this means is that, because the Bricolage project (nearly) always tagged merges
between branches and included the name of the appropriate tag name in a merge
commit message, I was able to reconstruct the merge history in Git.
For example, there were a lot of tags named like so:
% git tag
rev_1_8_merge-2004-05-04
rev_1_6_merge-2004-05-02
rev_1_6_merge-2004-04-10
rev_1_6_merge-2004-04-09
rev_1_6_merge-2004-03-16
So if I wanted to find the merge commit that corresponded to that first tag, all
I had to do was sort the commits in GitX by date and look near 2004-05-04 for a
commit message that said something like:
Merge from rev_1_8. Will tag that branch "rev_1_8_merge-2004-05-04".
That commit’s SHA key is “b786ad1c0eeb9df827d658a81dc2d32ec6108e92”. Its
parent’s SHA key is “11dbbd49644aaa607bd83f8d542d37fcfbd5e63b”. So then all I
had to do was to tell git that there is a second parent for that commit. Looking
in GitX for the commit tagged “rev_1_8_merge-2004-05-04”, I found that its
SHA key is “4fadb117a71a49add69950eccc14b77a04c8ec68”. So to assign that as a
second parent, I write a line to the file .git/info/grafts that describes its
parentage:
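Each line of a grafts file names a commit followed by the parents it should have, all as full SHA1s separated by spaces. A minimal sketch using the three IDs above (the helper name is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: format one line for .git/info/grafts. The format is the
# commit's SHA1 followed by the SHA1 of each parent, separated by spaces.
sub graft_line {
    my ($commit, @parents) = @_;
    for my $sha ($commit, @parents) {
        die "Not a full SHA1: $sha\n" unless $sha =~ /\A[0-9a-f]{40}\z/;
    }
    return join(' ', $commit, @parents) . "\n";
}

print graft_line(
    'b786ad1c0eeb9df827d658a81dc2d32ec6108e92', # the merge commit
    '11dbbd49644aaa607bd83f8d542d37fcfbd5e63b', # its existing parent
    '4fadb117a71a49add69950eccc14b77a04c8ec68', # the tagged commit merged in
);
```

The printed line is what would be appended to .git/info/grafts for this merge.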
Once I had all the grafts written, I just ran git filter-branch and they were
permanently rewritten to the new hierarchy.
And that’s it! The parentage is now correct. It was a lot of busy work to create
the mapping between tags and merges, but it’s nice to have it all done and
properly mapped out historically in Git. I even found a bunch of merges with no
corresponding tags and figured out the proper commit to link them up to (though
I stopped when I got back to 2002 and things get really confusing). And now,
because the merge relationships are now properly recorded in Git, I can drop
those old merge tags: as workarounds for a lack of merge tracking in CVS, they
are no longer necessary in Git.
Next up, how I completed the merge from Subversion. I’ll write that once I’ve
finally got it nailed down. Unfortunately, it takes an hour or two to export
from Subversion to Git, and I’m having to do it over and over again as I figure
stuff out. But it will be done, and you’ll hear more about it here.
In preparation for migrating a large Subversion repository to GitHub, I needed
to get a list of all of the Subversion committers throughout history, so that I
could create a file mapping them to Git users. Here’s how I did it:
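One way to do this, sketched here under assumptions (the helper name, the sample log lines, and the example.com addresses are all made up), is to pull the author field out of svn log --quiet output and print a skeleton authors file in the "login = Full Name <email>" form that git svn reads.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the unique committer names from the output of
# `svn log --quiet`, whose revision lines look like
#   r5517 | david | 2004-08-01 12:00:00 -0700 (Sun, 01 Aug 2004)
sub committers {
    my %seen;
    for my $line (@_) {
        next unless $line =~ /^r\d+\s+\|\s+(.+?)\s+\|/;
        $seen{$1}++;
    }
    return sort keys %seen;
}

# Sample lines; in real use: my @log = `svn log --quiet $repo_url`;
my @log = (
    "------------------------------------------------------------------------\n",
    "r5518 | theory | 2004-08-02 09:30:00 -0700 (Mon, 02 Aug 2004)\n",
    "------------------------------------------------------------------------\n",
    "r5517 | david | 2004-08-01 12:00:00 -0700 (Sun, 01 Aug 2004)\n",
    "------------------------------------------------------------------------\n",
    "r5516 | theory | 2004-07-31 08:15:00 -0700 (Sat, 31 Jul 2004)\n",
);

# Print a skeleton authors file, to be filled in by hand.
print "$_ = Full Name <$_\@example.com>\n" for committers(@log);
```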
This week, I imported pgTAP into GitHub. It took me a day or so to wrap my
brain around how it’s all supposed to work, with generous help from Tekkub.
But I’m starting to get the hang of it, and I like it. By the end of the day, I
had sent pull requests to Test::More and Blosxom Plugins. I’m well on my way
to being hooked.
One of the things I want, however, is SVN::Notify-type commit emails. I know
that there are feeds, but they don’t have diffs, and for however much I like
using NetNewsWire to feed my political news addiction, it never worked for me
for commit activity. And besides, why download the whole damn thing again, diffs
and all (assuming that ever happens), for every refresh? Seems like a hell of a
lot of unnecessary network activity, not to mention actual CPU cycles.
So I would need a decent notification application. I happen to have one. I
originally wrote SVN::Notify after I had already written activitymail, which
sends notices for CVS commits. SVN::Notify has changed a lot over the years,
and now it’s looking a bit daunting to consider porting it to Git.
However, just to start thinking about it, SVN::Notify really does several
different things:
Fetches relevant information about a Subversion event.
Parses that information for a number of different outputs.
Writes the event information into one or more outputs (currently plain text
or XHTML).
Constructs an email message from the outputs.
Sends the email message via a specified method (sendmail or SMTP).
For the initial implementation of SVN::Notify, this made a lot of sense, because
it was doing something fairly simple. It was designed to be extensible by
subclassing (successfully done by SVN::Notify::Config and
SVN::Notify::Mirror), and, later, by output filters, and that was about it.
But as I think about moving stuff to Git, and consider the weaknesses of
extensibility by subclassing (it’s just not pretty), I’m naturally rethinking
this architecture. I wouldn’t want to have to do it all over again should some
new SCM system come along in the future. So, following from a private
exchange with Martijn Van Beers, I have some preliminary thoughts on how a
hypothetical SCM::Notify (VCS::Notify?) module might be constructed:
A single interface for fetching SCM activity information. There could be any
number of implementations, just as long as they all provided the same
interface. There would be a class for fetching information from Subversion,
one for Git, one for CVS, etc.
A single interface for writing a report for a given transaction. Again,
there could be any number of implementations, but all would have the same
interface: taking an SCM module and writing output to a file handle.
A single interface for doing something with one or more outputs. Again, they
can do things as varied as simply writing files to disk, appending to a
feed, inserting into a database, or, of course, sending an email.
The core module would process command-line arguments to determine what SCM
is being used and any necessary contextual information, and just pass it on to
the appropriate classes.
In pseudo-code, what I’m thinking is something like this:
package SCM::Notify;

sub run {
    my $args = shift->getopt;
    my $scm  = SCM::Interface->new(
        scm      => $args->{scm},     # e.g., "SVN" or "Git", etc.
        revision => $args->{revision},
        context  => $args->{context}, # Might include repository path for SVN.
    );
    my $report = SCM::Report->new(
        method => $args->{method}, # e.g., SMTP, sendmail, Atom, etc.
        scm    => $scm,
        format => $args->{output}, # text, html, both, etc.
        params => $args->{params}, # to, from, subject, etc.
    );
    $report->send;
}
Then a report class just has to create a report in the specified format or formats
and do something with them. For example, a Sendmail report would put together a
report as a multipart message with each format in a single part, and then
deliver it via /sbin/sendmail, something like this:
package SCM::Report::Sendmail;

sub send {
    my $self = shift;
    my $fh   = $self->fh;
    for my $format ( $self->formats ) {
        print $fh SCM::Format->new(
            format => $format,
            scm    => $self->scm,
        );
    }
    $self->deliver;
}
So those are my rather preliminary thoughts. I think it’d actually be pretty
easy to port the logic of this stuff over from SVN::Notify; what needs some more
thought is what the command-line interface might look like and how options are
passed to the various classes, since the Sendmail report class will require
different parameters than the SMTP report class or the Atom report class. But
once that’s worked out in a way that can be handled neutrally, we’ll have a much
more extensible implementation that will be easy to add on to going forward.
Any suggestions for passing different parameters to different classes in a
single interface? Everything needs to be able to be handled via command-line
options and not be ugly or difficult to use.
I successfully migrated the Kineticode Subversion repository to a new server
yesterday. Everything works great. But after my first commit, I realized that,
while my username on the old server was “theory,” on the new server it’s
“david”. Subversion works fine, of course, and I was able to start committing
from old checkouts using the new username, but I realized that sites like
Ohloh would pick up the two usernames as separate committers. So I wanted to
update all of the 3630 existing revisions that were mine to use the new
username.
Unfortunately, I couldn’t find much on how to do this in a quick Googling. But I
quickly figured out that what I needed to do was to svnadmin dump my repository,
modify the dump, and then load it again. The Subversion dump format has all
these fields for tracking the content lengths of various pieces of data, so doing
the update was a bit tricky. But I wrote the script here to track things, and it
worked great for me. So here it is for others to reference and use.
#!/usr/bin/perl -w

use strict;
use warnings;

while (<>) {
    print;
    next unless /^Revision-number:\s+\d+$/;

    # Grab the content lengths. Examples:
    # Prop-content-length: 139
    # Content-length: 139
    my $plen_line = <>;
    my $clen_line = <>;

    unless ($plen_line =~ /^Prop-content-length:\s+\d+$/) {
        # Nothing we want to change.
        print $plen_line, $clen_line;
        next;
    }

    my @lines;
    while (<>) {
        if (/^PROPS-END$/) {
            # finish.
            print $plen_line, $clen_line, @lines, $_;
            last;
        }

        push @lines, $_;

        if (/^svn:author$/) {
            # Grab the author content length. Example:
            # V 6
            my $alen_line = <>;

            # Grab the author name.
            my $auth = <>;

            if ($auth =~ s/^theory$/david/) {
                # Adjust the content lengths.
                for my $line ($plen_line, $clen_line, $alen_line) {
                    $line =~ s/(\d+)$/$1 - 1/e;
                }
            }

            print $plen_line, $clen_line, @lines, $alen_line, $auth;
            last;
        }
    }
}
To use it, save it to a file, say svn_author, then change line 40 to your old
and new usernames. Then, on line 43, change the $1 - 1 bit to be correct for
the difference between the usernames you’re changing. For example, if you’re
changing your username from, say, “shane” to “chromatic,” the new name is four
characters longer, so you’d make it $1 + 4.
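Rather than working out the difference by hand, the adjustment can be computed from the two names. This is only a minimal sketch, not part of the script above: the adjust_length() helper is a hypothetical name, and it assumes plain ASCII usernames, since the dump format counts bytes rather than characters.

```perl
use strict;
use warnings;

# Hypothetical helper: return $line with its trailing number shifted by $delta.
sub adjust_length {
    my ($line, $delta) = @_;
    # "$" matches just before the trailing newline, so the line ending survives.
    $line =~ s/(\d+)$/$1 + $delta/e;
    return $line;
}

my ($old, $new) = ('shane', 'chromatic');

# Dump lengths are byte counts; for ASCII names bytes and characters coincide.
my $delta = length($new) - length($old);

print adjust_length("Prop-content-length: 139\n", $delta);
```

A negative difference works the same way, so the same helper covers the “theory” to “david” case.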
And that’s it! Feel free to take this code and do with it what you like,
including fix any bugs, add command-line options, support changing multiple
authors at once, or whatever. Share and enjoy.
I’m very pleased to announce the release of SVN::Notify 2.70. You can see an
example of its colordiff output here. This is a major release that I’ve spent
the last several weeks polishing and tweaking to get just right. There are quite
a few changes, but the two most important are improved character encoding
support and output filtering.
Improved Character Encoding Support
I’ve had a number of bug reports regarding issues with character encodings.
Particularly for folks working in Europe and Asia, but really for anyone using
multibyte characters in their source code and log messages (and we all do
nowadays, don’t we?), it has been difficult to find the proper incantation to
get SVN::Notify to convert data from and to their proper encodings. Using a
patch from Toshikazu Kinkoh as a starting-point, and with a lot of reading and
experimentation, as well as regular and patient tests on Toshikazu’s and Martin
Lindhe’s production systems, I think I’ve finally got it nailed down.
Now you can use the --encoding (formerly --charset), --svn-encoding, and
--diff-encoding options—as well as --language—to get SVN::Notify to do the
right thing. As long as your Subversion server’s OS supports an appropriate
locale, you should be golden (mine is old, with no UTF-8 locales :\). And if
all else fails, you can still set the $LANG environment variable before
executing svnnotify.
There is actually a fair bit to know about encodings to get it to work properly,
but if you use UTF-8 throughout and your OS supports UTF-8 locales, you
shouldn’t have to do anything. You might have to set --language in order to
get it to use the proper locale. See the new documentation of the encoding
support for all the details. And if you still have problems, please do let me
know.
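To make the “from and to their proper encodings” idea concrete, here is a small sketch using the core Encode module. It is not how SVN::Notify does the job internally (it relies on locales and IO layers); the transcode() helper is a hypothetical name, used only to show that converting means decoding bytes into characters and then encoding them again.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: convert raw bytes from one encoding to another.
sub transcode {
    my ($bytes, $from, $to) = @_;
    my $chars = decode($from, $bytes);   # bytes in, Perl characters out
    return encode($to, $chars);          # characters in, bytes out
}

# "café" as UTF-8 bytes, converted to ISO-8859-1 bytes.
my $latin1 = transcode("caf\xc3\xa9", 'UTF-8', 'ISO-8859-1');
```

If the source and destination encodings are the same, as in the all-UTF-8 case, the round trip leaves the bytes unchanged.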
Output Filtering
Much sexier is the addition of output filtering in SVN::Notify 2.70. I got
pretty tired of getting feature requests for what are essentially formatting
modifications, such as this one requesting KDE-style keyword
support. I myself was using Trac wiki syntax in commit messages on a recent
project and wanted to see them converted to HTML for messages output by
SVN::Notify::HTML::ColorDiff.
So I finally sat down and gave some thought to how to implement a simple plugin
architecture for SVN::Notify. When I realized that it was generally just
formatting that people wanted, it became simpler: I just needed a way to allow
folks to write simple output filters. The solution I came up with was to just
use Perl. Output filters are simply subroutines named for the kind of output
they filter. They live in Perl packages. That’s it.
For example, say that your developers write their commit log messages in
Textile, and rather than receive them stuck inside <pre> tags, you’d like
them converted to HTML. It’s simple. Just put this code in a Perl module file:
package SVN::Notify::Filter::Textile;
use Text::Textile ();

sub log_message {
    my ($notifier, $lines) = @_;
    # Only convert for HTML output; pass plain text through untouched.
    return $lines unless $notifier->content_type eq 'text/html';
    return [ Text::Textile->new->process( join $/, @$lines ) ];
}

1;
Put the file, SVN/Notify/Filter/Textile.pm, somewhere in a Perl library
directory. Then use the new --filter option to svnnotify to put it to work, for
example by adding --filter Textile to your usual svnnotify --handler HTML command.
Yep, that’s it! SVN::Notify will find the filter module, load it, register its
filtering subroutine, and then call it at the appropriate time. Of course, there
are a lot of things you can filter; consult the complete documentation for all
of the details. But hopefully this gives you a flavor for how easy it is to
write new filters for SVN::Notify. I’m hoping that all those folks who want
features can now stop bugging me and start writing their own filters to do the
job, and upload them to CPAN for all to share!
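For the curious, the “find the module, register its subroutine, call it at the right time” idea can be sketched in a few lines. This is emphatically not SVN::Notify’s actual implementation; My::Filter::Shout, register_filters(), and run_filters() are all hypothetical names, and the only thing taken from the example above is the filter signature (a notifier object and an array reference of lines).

```perl
use strict;
use warnings;

package My::Filter::Shout;    # hypothetical filter package

# A filter is just a subroutine named for the output it filters.
sub log_message {
    my ($notifier, $lines) = @_;
    return [ map { uc } @$lines ];
}

package main;

my %filters_for;    # output kind => list of filter subroutines

# Register whichever of the named filters the package actually defines.
sub register_filters {
    my ($package, @kinds) = @_;
    for my $kind (@kinds) {
        my $code = $package->can($kind) or next;
        push @{ $filters_for{$kind} }, $code;
    }
}

# Pass the data through each registered filter in turn.
sub run_filters {
    my ($kind, $notifier, $data) = @_;
    $data = $_->($notifier, $data) for @{ $filters_for{$kind} || [] };
    return $data;
}

register_filters('My::Filter::Shout', qw(log_message diff));
my $filtered = run_filters('log_message', undef, ['Fixed the bug.']);
print "$_\n" for @$filtered;
```

Because each filter receives the previous filter’s return value, several filter packages can be chained without knowing about one another.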
To get things started, I scratched my own itch, writing a Trac filter myself.
The filter is almost as simple as the Textile example above, but I also spent
quite a bit of time tweaking the CSS so that most of the Trac-generated HTML
looks good. You can see an example right here. Thanks to a number of bug fixes
in Text::Trac, as well as Trac-specific CSS added via a filter on CSS output,
it works beautifully. If I’m feeling motivated in the next week or so, I’ll
create a separate CPAN distribution with just a Markdown filter and upload it.
That will create a nice distribution example for folks to copy to create their
own. Or maybe someone on the Lazy Web will do it for me! Maybe you?
I wish I’d thought to do this from the beginning; it would have saved me from
having to add so many features/cruft to SVN::Notify over the years. Here’s a
quick list of the features that likely could have been implemented via filters
instead of added to the core:
--user-domain: Combine the SVN username with a domain for the “From”
header.
--add-header: Add a header to the message.
--reply-to: Add a specific header to the message.
SVN::Notify::HTML::ColorDiff: Frankly, looking back on it, I don’t know why
I didn’t just put this support right into SVN::Notify::HTML. But even if I
hadn’t, it could have been implemented via filters.
--subject-prefix: Modify the message subject.
--subject-cx: Add the commit context to the subject.
--strip-cx-regex: More subject context modification.
--no-first-line: Another subject filter.
--max-sub-length: Yet another!
--max-diff-length: A filter could truncate the diff, although this might
be tricky with the HTML formatting.
--author-url: Modify the metadata section to add a link to the author URL.
--revision-url: Ditto for the revision URL.
--ticket-map: Filter the log message for various ticketing system strings
to convert to URLs. This also encompasses the old --rt-url,
--bugzilla-url, --gnats-url, and --jira-url options.
--header: Filter the beginning of the message.
--footer: Filter the end of the message.
--linkize: Filter the log message to convert URLs to links for HTML
messages.
--css-url: Filter the CSS to modify it, or filter the start of the HTML to
add a link to an external CSS URL.
--wrap-log: Reformat the log message for HTML.
Yes, really! That’s about half the functionality right there. I’m glad that I
won’t have to add any more like that; filters are a much better way to go.
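As a rough illustration of how one of these could look as a filter, here is a sketch of a --linkize-style log message filter. The package name is hypothetical, the URL pattern is deliberately naive (nothing like the Friedl expressions the real option uses), and it assumes the filter sees the log message lines before any surrounding markup is added.

```perl
package SVN::Notify::Filter::LinkizeSketch;    # hypothetical package name
use strict;
use warnings;

sub log_message {
    my ($notifier, $lines) = @_;
    # Links only make sense in HTML messages.
    return $lines unless $notifier->content_type eq 'text/html';
    return [
        map {
            # Copy the line, then wrap anything URL-shaped in an anchor.
            (my $line = $_) =~ s{(https?://[^\s<>"]+)}{<a href="$1">$1</a>}g;
            $line;
        } @$lines
    ];
}

1;
```

It follows the same shape as the Textile example: check the content type, then return a new array reference of lines.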
So download it, install it, write some filters, get your multibyte characters
output properly, and enjoy! And as usual, send me your bug reports, but implement your own improvements using filters!
So I finally got ‘round to porting SVN::Notify to Windows. Version 2.57 is
making its way to CPAN right now. The solution turned out to be dead simple: I
just had to use a different form of piping open() on Windows, i.e.,
open FH, "$cmd|" instead of open FH, "-|"; exec($cmd);. It’s silly, really,
but it works. It really makes me wonder why -| and |- haven’t been emulated
on Windows. Whatever.
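Roughly, the platform switch looks like the sketch below. It is not the code from SVN::Notify itself: pipe_from() is a hypothetical helper, and the double-quoting of arguments on Windows is a simplifying assumption that will not survive arguments containing quotes.

```perl
use strict;
use warnings;

# Hypothetical helper: return a read handle on the output of @cmd.
sub pipe_from {
    my @cmd = @_;

    if ($^O eq 'MSWin32') {
        # No fork-style "-|" here, so hand open() a quoted command string.
        my $cmd = join ' ', map { qq{"$_"} } @cmd;
        open my $fh, "$cmd|" or die "Cannot run $cmd: $!\n";
        return $fh;
    }

    # Elsewhere, fork with "-|" and exec in the child, as perlipc suggests.
    my $pid = open my $fh, '-|';
    die "Cannot fork: $!\n" unless defined $pid;
    return $fh if $pid;    # parent: read the child's output
    exec @cmd or die "Cannot exec $cmd[0]: $!\n";
}

# Read one line from a child perl process.
my $fh = pipe_from($^X, '-e', 'print qq{hello\n}');
my $line = <$fh>;
close $fh;
```

Either way the caller gets a handle it can read line by line, so nothing else in the code has to care which branch was taken.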
‘Course the other thing I realized, after I made this change and all the tests
passed, was that there is no equivalent of sendmail on Windows. So I added the
--smtp option, so that now email can be sent to an SMTP server rather than to
a local sendmail. I tested it out, and it seems to work, but I’d be especially
interested to hear from folks using wide characters in their repositories: do
they get printed properly to Net::SMTP’s connection?
The whole list of changes in 2.57 (the output remains the same as in 2.56):
Finally ported to Win32. It was actually a simple matter of changing how
command pipes are created.
Added --smtp option to enable sending messages to an SMTP server rather
than to the local sendmail application. This is essential for Windows
support.
Added --io-layer to the usage statement in svnnotify.
Fixed single-dash arguments in documentation so that they’re all documented
with a single dash in SVN::Notify.
I’ve just uploaded SVN::Notify 2.56 to CPAN. Check a mirror near you! There
have been a lot of changes since I last posted about SVN::Notify (for the 2.50
release), not least of which is that SourceForge has standardized on it for
their Subversion roll out. W00t! The result was a couple of patches from
SourceForge’s David Burley to add headers and footers and to truncate diffs over
a certain size. See the sample output for how it looks. Thanks, David!
The change I’m most pleased with in 2.56 is the addition of
SVN::Notify::Alternative, based on a submission from Jukka Zitting. This new
subclass allows you to actually combine a number of other subclasses into a
single activity notification message. Why? Well, mainly because, though you
might like to get HTML messages with colorized diffs, some mail clients might
not care for the HTML. They would much prefer the plain text version.
SVN::Notify::Alternative allows you to have your cake and eat it too: send a
single message with multipart/alternative sections for both HTML output and
plain text. Plain text will always be used; to use HTML::ColorDiff with it,
pass --handler Alternative --alternative HTML::ColorDiff to svnnotify.
This incantation will send an email with both the plain text and HTML::ColorDiff
formats. If you look at it in Mail.app, you’ll see the nice colorized format,
and if you look at it in pine, you’ll see the plain text.
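For anyone unfamiliar with multipart/alternative, the sketch below assembles a bare-bones message of that type by hand. It is not what SVN::Notify::Alternative actually emits; alternative_body() and the boundary string are made up for illustration. The one detail that matters is the ordering: the plain text part comes first and the richer HTML part last, since clients pick the last part they know how to display.

```perl
use strict;
use warnings;

# Hypothetical helper: build a minimal multipart/alternative message.
sub alternative_body {
    my ($boundary, $text, $html) = @_;
    return join "\r\n",
        'MIME-Version: 1.0',
        qq{Content-Type: multipart/alternative; boundary="$boundary"},
        '',
        "--$boundary",
        'Content-Type: text/plain; charset=UTF-8',
        '',
        $text,
        "--$boundary",
        'Content-Type: text/html; charset=UTF-8',
        '',
        $html,
        "--$boundary--",    # closing delimiter
        '';
}

print alternative_body('frontier42', 'Fixed it.', '<p>Fixed it.</p>');
```

A real message would also need the usual From, To, and Subject headers, and a boundary that cannot occur in either part.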
For the curious, here are all of the changes since 2.50:
2.56 2006-04-04T23:16:37
Abstracted creation of the diff file handle into the new diff_handle()
method.
Documented use of diff_handle() in the output() method.
Added optional second argument to output() to optionally suppress the
output of the email headers. This argument is used by the new
Alternative subclass.
Added SVN::Notify::Alternative, which allows multiple versions of a
commit email to be sent, such as text/plain plus HTML. The multiple
versions are assembled into a single email message using the
multipart/alternative media type. For those who want HTML messages but
must support users that can only read plain text or rely on archives
that ignore HTML messages, this can be very useful. Based on an
implementation by Jukka Zitting.
Fixed use_ok() tests that weren’t running at all.
Added an extra newline to separate the file list from an inline diff in
the plain text format where --with-diff has been specified.
Moved the multipart/mixed content-type header generation from
output_headers() to output_content_type(), not only because this
makes more sense, but also because it makes attachments behave better
when using SVN::Notify::Alternative.
Documented accessors in SVN::Notify::HTML.
2.55 2006-04-03T23:11:11
Added the io-layer option to specify an alternate IO layer. Will be
most useful for those with repositories containing text in multiple
encodings, where it should be set to “raw”.
Fixed the context output in the subject for the --subject-cx option so
that it’s smarter about determining the longest common path. Reported by
Max Horn.
No longer modifying the values of the to_regex_map hash, so as not to
mess with folks who might be passing it as a hash to more than one call
to new(). Reported by Darby Felton.
Added a meta http-equiv="content-type" tag to HTML output that
includes the character set to help some clients in the proper display of
the characters in an HTML email. I’m not sure if any clients actually
need this help, but it certainly can’t hurt!
Added the --css-url option to specify an alternate style sheet for
HTML emails. SVN::Notify::HTML’s own CSS is left in the email, as well,
so the specified style sheet can just override the default, rather than
have to style everything itself. Yes, it takes advantage of the
“cascading” feature of cascading style sheets! Based on a suggestion by
Steve James.
2.54 2006-03-06T00:33:42
Added /usr/bin to the list of paths searched for executables.
Suggested by Nacho Barrientos.
Added --max-diff-length option. Patch from David Burley/SourceForge.
2.53 2006-02-24T21:30:48
Added header and footer attributes and command-line options to
specify text to be put at the head and foot of each message. For HTML
messages, the text will be escaped, unless it starts with “<”, in which
case it will be assumed to be valid HTML and will therefore not be
escaped. Either way, it will be output between <div> tags with the IDs
“header” or “footer” as appropriate. Based on a patch from David
Burley/SourceForge.
Fixed the executable-searching algorithm added in 2.52 to add “.exe” to
the name of the executable being searched for if $^O eq 'MSWin32'.
Fixed encoding issues so that, under Perl 5.8 and later, the IO layer is
set on file handles so as to encode input and decode output in the
character set specified by the charset attribute. CPAN # 16050,
reported by Michael Zehrer.
Added a second argument to all calls to encode_entities() in
SVN::Notify::HTML and SVN::Notify::HTML::ColorDiff so that only ‘>’,
‘<’, ‘&’, and ‘"’ are escaped.
Fixed a bug in the _find_exe() function that was attempting to modify
a constant variable. Patch from John Peacock.
Turned the _find_exe() function into the find_exe() class method,
since subclasses (such as SVN::Notify::Mirror) might want to use it.
2.52 2006-02-19T18:50:24
Now uses File::Spec->path to search for a valid sendmail or
svnlook when they’re not specified via their respective command-line
options or environment variables. Suggested by Andreas Koenig. Note that
they should probably be explicitly set anyway, as the $PATH
environment variable tends to be non-existent when running under Apache.
2.51 2006-01-02T23:28:11
Fixed ColorDiff HTML to once again be valid XHTML 1.1.
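The executable search described in the 2.52 through 2.54 entries can be sketched along these lines. This is a reconstruction from the changelog text only, not the module’s find_exe() method; find_exe_sketch() is a hypothetical name.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: look for an executable in $PATH, then /usr/bin.
sub find_exe_sketch {
    my ($name) = @_;
    $name .= '.exe' if $^O eq 'MSWin32';
    for my $dir (File::Spec->path, '/usr/bin') {
        my $candidate = File::Spec->catfile($dir, $name);
        return $candidate if -f $candidate && -x _;
    }
    return;    # nothing found
}

my $svnlook = find_exe_sketch('svnlook');
print defined $svnlook
    ? "Found $svnlook\n"
    : "svnlook not found; better to set it explicitly\n";
```

Since $PATH is often empty under Apache, the fallback message is the likely outcome in a hook script, which is why setting the paths explicitly remains the safer choice.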
So SVN::Notify doesn’t currently run on Windows. Why not? Well, because I
wanted to do things as “rightly” as possible. In terms of efficiency, what that
meant was, rather than slurping in whole chunks of data, such as diffs, from
svnlook, I instead followed the guidance in perlipc to open a file handle
pipe to svnlook and then read from it line-by-line. The method I wrote to
create the pipe looks like this:
sub _pipe {
    my ($self, $mode) = (shift, shift);
    # Safer version of backtick (see perlipc(1)).
    local *PIPE;
    my $pid = open(PIPE, $mode);
    die "Cannot fork: $!\n" unless defined $pid;

    if ($pid) {
        # Parent process. Return the file handle.
        return *PIPE;
    } else {
        # Child process. Execute the commands.
        exec(@_) or die "Cannot exec $_[0]: $!\n";
        # Not reached.
    }
}
The problem is that it doesn’t work on Windows. perlipc says:
Note that these operations are full Unix forks, which means they may not be
correctly implemented on alien systems. Additionally, these are not true
multithreading. If you’d like to learn more about threading, see the modules
file mentioned below in the SEE ALSO section.
‘Course, the SEE ALSO section doesn’t have much for “alien systems,” but I
have a comment in my code that suggests that Win32::Process might do for
Windows compatibility. But I honestly don’t know.
So what’s the best approach for me to port SVN::Notify to Windows while keeping
file handle pipes around for efficiency? Anyone care to take a stab at it, with
tests for Windows, and send me a patch?
SVN::Notify 2.50 is currently making its way to CPAN. It has quite a number of
changes since I last wrote about it here, most significantly the slick new CSS
treatment introduced in 2.47, provided by Bill Lynch. I really like the look,
much better than it was before. Have a look at the
SVN::Notify::HTML::ColorDiff output to see what I mean. Be sure to make your
browser window really narrow to see how all of the sections automatically get a
nice horizontal scrollbar when they’re wider than the window. Neat, eh? Check
out the 2.40 output for contrast.
Here are all of the changes since the last version:
2.50 2005-11-10T23:27:22
Added --ticket-url and --ticket-regex options to be used by those
who want to match ticket identifiers for systems other than RT, Bugzilla,
GNATS, and JIRA. Based on a patch from Andrew O’Brien.
Removed bogus use lib line put into Makefile.PL by a prerelease
version of Module::Build.
Fixed HTML tests to match either “&#39;” or “&apos;”, since HTML::Entities
can be configured differently on different systems.
2.49 2005-09-29T17:26:14
Now require Getopt::Long 2.34 so that the --to-regex-map option works
correctly when it is used only once on the command-line.
2.48 2005-09-06T19:14:35
Switched from <span class="add"> and <span class="rem"> to <ins>
and <del> elements in SVN::Notify::HTML::ColorDiff in order to make
the markup more semantic.
2.47 2005-09-03T18:54:43
Fixed options tests to work correctly with older versions of
Getopt::Long. Reported by Craig McElroy.
Slick new CSS treatment used for the HTML and HTML::ColorDiff emails.
Based on a patch from Bill Lynch.
Added --svnweb-url option. Based on a patch from Ricardo Signes.
2.46 2005-05-05T05:22:54
Added support for “Copied” files to HTML::ColorDiff so that they display
properly.
2.45 2005-05-04T20:38:18
Added support for links to the GNATS bug tracking system. Patch from
Nathan Walp.
2.44 2005-03-18T06:10:01
Fixed Name in POD so that SVN::Notify’s POD gets indexed by
search.cpan.org. Reported by Ricardo Signes.
2.43 2004-11-24T18:49:40
Added --strip-cx-regex option to strip out parts of the context from
the subject. Useful for removing parts of the file names you might not
be interested in seeing in every commit message.
Added --no-first-line option to omit the first sentence or line of the
log message from the subject. Useful in combination with the
--subject-cx option.
2.42 2004-11-19T18:47:20
Changed “Files” to “Paths” in hash returned by file_label_map() since
directories can be listed as well as files.
Fixed SVN::Notify::HTML so that directories listed among the changed
paths are not links.
Requiring Module::Build 0.26 to make sure that the installation works
properly. Reported by Robert Spier.
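To show what the <ins> and <del> markup from 2.48 amounts to, here is a small sketch that wraps added and removed diff lines. It is not the ColorDiff code; diff_line_to_html() is a hypothetical helper, and it ignores diff headers such as the “+++” and “---” file lines, which a real implementation would treat separately.

```perl
use strict;
use warnings;

# Escape only the characters that are unsafe in HTML text.
my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');

# Hypothetical helper: mark up one line of a unified diff.
sub diff_line_to_html {
    my ($line) = @_;
    (my $escaped = $line) =~ s/([&<>"])/$entity{$1}/g;
    return "<ins>$escaped</ins>" if $line =~ /^\+/;    # added line
    return "<del>$escaped</del>" if $line =~ /^-/;     # removed line
    return $escaped;                                   # context line
}

print diff_line_to_html($_), "\n" for '+new line', '-old line', ' context';
```

The semantic elements carry the meaning, so the CSS only has to style ins and del rather than custom classes.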
I expect that this will be my last release of SVN::Notify for a while. I’ve
already spent more time on it than I had anticipated. But anyway, this is a
pretty solid release. It doesn’t change the API or anything, but I feel that the
jump from 2.30 to 2.40 is justified because of the sheer number of changes. From
now on, I expect that it will mostly be maintenance, like 2.41, which fixes a
minor formatting bug. Grab it now from CPAN.
First, I’ve added a new, complex example of the SVN::Notify::HTML::ColorDiff
output that I will keep up-to-date with all future changes. This will allow
people to get a better idea of what it’s capable of than my previous contrived
examples allowed.
The biggest change is that I’ve moved the Request Tracker, Bugzilla, and
JIRA support from SVN::Notify::HTML to SVN::Notify. I realized, after the
release of 2.30, that it might be cool to add links to the text-only email
message generated by SVN::Notify, too. So I’ve done that, including for ViewCVS
links. Unlike in SVN::Notify::HTML, the links won’t be inline in the message
(that doesn’t work too well in plain text, IMO), but will come in their own
sections after the message. So you’ll get something like this (extreme example):
Log Message:
-----------
Let's try a few links to other applications. First, we have
A Bugzilla Bug # 709. Then we have a JIRA key, TST-1608. And
finally, we have an RT link to Ticket # 4321.
Hey, we could add one to ViewCVS for a Subversion Revision
#606, too!
ViewCVS Links:
-------------
http://viewsvn.bricolage.cc/?rev=606&view=rev
Bugzilla Links:
--------------
http://bugzilla.mozilla.org/show_bug.cgi?id=709
RT Links:
--------
http://rt.cpan.org/NoAuth/Bugs.html?id=4321
JIRA Links:
----------
http://jira.atlassian.com/secure/ViewIssue.jspa?key=TST-1608
The nice thing is that, for many mail clients, these will be turned into
clickable links. You’ll also notice that the text that creates the ViewCVS link
is split over two lines. This is new in this release, and works for
SVN::Notify::HTML, too. I made a few other tweaks to the regular expressions, as
well. Here’s a complete list of changes:
Fixed accessor generation so that accessors created for the attributes
passed to register_attributes() by a subclass are created in the
subclass’ package instead of in SVN::Notify.
Changed parsing for JIRA keys to use any set of capital letters followed by
a dash and then a number, rather than the literal string “JIRA-” followed by
a number. Reported by Garrett Rooney.
Modified the regular expression patterns for the RT, Bugzilla, and
ViewCVS links to properly match on word boundaries, so that strings like
“humbug 12” don’t match.
Modified the ViewCVS link regular expression pattern so that it matches
strings like “rev 12” as well as “revision 12”.
Modified the RT link regular expression pattern so that it matches strings
like “RT-Ticket: 23” as well as “Ticket 1234”. Suggested by Jesse Vincent.
Added complicated example to try to show off all of the major features. I
will keep this up-to-date going forward in order to post sample output on
the Web.
Fixed the parsing of log messages so that empty lines are no longer
eliminated.
HTML::ColorDiff now properly handles the listing of binary files in the
diff, marking them with a new class, “binary”, and using the same CSS as is
used for the “propset” class.
In HTML::ColorDiff, fixed CSS for the “delfile” class to properly wrap it in
a border like the other files in the diff.
Added labels to the HTML::ColorDiff diff file sections to indicate the type
of change (“Modified”, “Added”, “Deleted”, or “Property changes”).
Moved the rt_url, bugzilla_url, and jira_url parameters from
SVN::Notify::HTML to SVN::Notify, where they are used to add URLs to the
text version of log messages.
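The matching rules described in that list can be approximated as follows. These patterns are illustrative guesses written from the descriptions above, not the regular expressions SVN::Notify ships with; %pattern and find_ids() are hypothetical names.

```perl
use strict;
use warnings;

my %pattern = (
    # Any run of capital letters, a dash, then a number.
    jira     => qr/\b([A-Z]+-\d+)\b/,
    # "rev 12" as well as "revision 12"; \s* also spans a line break.
    revision => qr/\brev(?:ision)?\s*#?\s*(\d+)\b/i,
    # The leading \b keeps "humbug 12" from matching.
    bugzilla => qr/\bbug\s*#?\s*(\d+)\b/i,
);

# Hypothetical helper: return every identifier of the given kind in $text.
sub find_ids {
    my ($kind, $text) = @_;
    my @ids = $text =~ /$pattern{$kind}/g;
    return @ids;
}

my $log = "A Bugzilla Bug # 709, a JIRA key, TST-1608, and Revision\n#606.";
print join(', ', map { find_ids($_, $log) } qw(bugzilla jira revision)), "\n";
```

A pattern as loose as the JIRA one will also catch strings like “UTF-8”, which is the price of not hard-coding the project key.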
I released a new version of SVN::Notify last night, 2.30. This new version has
a few things going for it.
First, and most obviously from the point of view of users of the HTML subclass,
I’ve added new options for specifying Request Tracker, Bugzilla, and JIRA
URLs. The --rt-url, --bugzilla-url, and --jira-url options have an effect
much like the parallel feature in CVSspam: pass in a string with the spot
for the ID represented by %s, such as
http://rt.cpan.org/NoAuth/Bugs.html?id=%s for RT or
http://bugzilla.mozilla.org/show_bug.cgi?id=%s for Bugzilla. SVN::Notify::HTML
will then look for the appropriate strings (such as “Ticket # 1234” for RT or
“Bug # 4321” for Bugzilla) and turn them into URLs.
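The %s substitution itself is just sprintf. In the sketch below, ticket_links() is a hypothetical helper and the pattern is a stand-in for whatever SVN::Notify really uses; it also assumes the URL template contains no other percent signs, which sprintf would try to interpret.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every ID captured by $regex into a URL.
sub ticket_links {
    my ($text, $regex, $url_template) = @_;
    return map { sprintf $url_template, $_ } $text =~ /$regex/g;
}

my @links = ticket_links(
    'Fixes Bug # 4321 and Bug #709.',
    qr/\bBug\s*#\s*(\d+)/i,
    'http://bugzilla.mozilla.org/show_bug.cgi?id=%s',
);
print "$_\n" for @links;
```

The same helper works for RT or JIRA by swapping in a different pattern and template.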
This functionality has been extended to the old --viewcvs-url option, too. For
the sake of consistency, it now also requires a URL of the same form (although
if SVN::Notify doesn’t see %s in the string, it will append a default and emit
a warning), and will be used to create links for strings like “Revision 654” in
the log message.
SVN::Notify::HTML has an additional new option, --linkize, that will force any
email addresses or URLs it finds in the log message to be turned into links.
Again, this works like it does for CVSspam; I’m grateful to Jeffrey Friedl’s
Mastering Regular Expressions, Second Edition for the excellent regular
expressions for matching URLs and email addresses.
All of this was made possible by moving the processing of options from
svnnotify to SVN::Notify->get_options and adding a new class method,
SVN::Notify->register_attributes. This second method allows
subclasses to easily add new attributes; register_attributes() will create
accessor methods and add command-line option processing for each new attribute
required by a subclass. Then, when you execute svnnotify --handler HTML,
SVN::Notify->get_options processes the default options, loads the
SVN::Notify::HTML subclass, and then processes any options specified by the
subclass. In short, it is now easier to subclass SVN::Notify and have the
necessary options and attributes loaded automatically via the same executable,
svnnotify.
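A stripped-down sketch of the idea follows. It uses stand-in packages (My::Notify and My::Notify::HTML are hypothetical) rather than SVN::Notify itself, and the internals are guesses; only the general shape, attribute names mapped to Getopt::Long specifications with accessors generated in the subclass, comes from the description above.

```perl
package My::Notify;    # hypothetical stand-in for the base class
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my %spec_for;    # attribute name => Getopt::Long option spec

# Record each attribute's option spec and generate a read accessor
# in the calling (sub)class.
sub register_attributes {
    my ($class, %attrs) = @_;
    for my $attr (keys %attrs) {
        $spec_for{$attr} = $attrs{$attr};
        no strict 'refs';
        *{"${class}::$attr"} = sub { $_[0]->{$attr} };
    }
}

# Parse the arguments against every registered spec and build an object.
sub get_options {
    my ($class, @argv) = @_;
    my %opt;
    GetOptionsFromArray(
        \@argv,
        map { ($spec_for{$_} => \$opt{$_}) } keys %spec_for
    ) or die "Bad options\n";
    return bless {%opt}, $class;
}

package My::Notify::HTML;    # hypothetical subclass
our @ISA = ('My::Notify');
__PACKAGE__->register_attributes(
    linkize => 'linkize',
    css_url => 'css-url=s',
);

package main;

my $notifier = My::Notify::HTML->get_options(
    '--linkize', '--css-url', 'http://example.com/notify.css',
);
print $notifier->css_url, "\n";
```

The subclass never touches option parsing directly; declaring the attribute is enough to get both the accessor and the command-line switch.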
This change was motivated not only by my desire to add the new features to
SVN::Notify::HTML, but also by Autrijus’ new modules, SVN::Notify::Snapshot
and SVN::Notify::Config. Thanks Autrijus!
I’ll try to get a nice example of all this functionality up in the next few
days; if anyone else creates one first, send it to me! But in the meantime,
enjoy!