Adding tab completion for perlbrew

perlbrew is a command-line utility I’ve been using quite a bit recently: while developing Bencher’s feature for benchmarking against multiple perls, for trying out cperl, or just for updating to the latest perl release. So I thought it would be nice to add tab completion to perlbrew.

The obvious choice of language (for many people anyway) for writing tab completion is bash, but I’m more comfortable with Perl. Besides, there are a few nice completion features in Complete::Util I’d like to use.

The result is App::ShellCompleter::perlbrew. You install it by first installing App::shcompgen from CPAN and then:

% shcompgen init

then install App::ShellCompleter::perlbrew from CPAN.

Some of the things that the completion can do:

Complete subcommands, option names, option values, arguments

For example:

% perlbrew un<tab>

will complete to:

% perlbrew uninstall _

The completion features “word-mode” matching, so you can also do something like this:

% perlbrew i-cp<tab>

and it will complete to:

% perlbrew install-cpanm _
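
As a rough illustration of the idea behind word-mode matching (this is a sketch under my own assumptions, not the actual Complete::Util code, and word_mode_match is a made-up name): split both the typed text and each candidate on dashes, then require every typed fragment to be a prefix of the corresponding word.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Complete::Util: return the candidates
# whose dash-separated words start with the corresponding typed fragments.
sub word_mode_match {
    my ($typed, @candidates) = @_;
    my @frags = split /-/, $typed;
    my @matches;
    CANDIDATE:
    for my $cand (@candidates) {
        my @words = split /-/, $cand;
        next CANDIDATE if @frags > @words;
        for my $i (0 .. $#frags) {
            # index() == 0 means the fragment is a prefix of the word
            next CANDIDATE unless index($words[$i], $frags[$i]) == 0;
        }
        push @matches, $cand;
    }
    return @matches;
}

# A made-up subset of subcommands, just for the demonstration.
my @subcommands = qw(init install install-cpanm install-patchperl uninstall);
print join(", ", word_mode_match("i-cp", @subcommands)), "\n";  # install-cpanm
```

A real completer would fall back to this only when plain prefix matching finds nothing, so that ordinary completion behaves as usual.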

Display the list of available perls to install

% perlbrew install <tab>

The first time you do this, it will take several seconds because the completion script will fetch the list of available perls from “perlbrew available”. After that it should be instantaneous because the completion script caches the result in a temporary file.
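
The caching idea can be sketched as follows. This is only an illustration with assumed names (cached_lines, the cache file name, the one-hour lifetime); the real completion script may do it differently. In the real thing the producer would be the slow call to “perlbrew available”; here it is a stand-in that counts how often it runs.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the cached lines if the cache file is fresh
# enough, otherwise call $producer, store its lines, and return them.
sub cached_lines {
    my ($cache_file, $max_age_secs, $producer) = @_;

    my $mtime = (stat $cache_file)[9];    # undef if the file does not exist
    if (defined $mtime && time() - $mtime < $max_age_secs) {
        open my $in, '<', $cache_file or die "Can't read $cache_file: $!";
        chomp(my @lines = <$in>);
        return @lines;
    }

    my @lines = $producer->();
    open my $out, '>', $cache_file or die "Can't write $cache_file: $!";
    print {$out} "$_\n" for @lines;
    close $out or die "Can't close $cache_file: $!";
    return @lines;
}

# Demonstration with a stand-in producer instead of `perlbrew available`.
my $dir        = tempdir(CLEANUP => 1);
my $cache_file = "$dir/perlbrew-available.cache";    # assumed name
my $calls      = 0;
my $producer   = sub { $calls++; return ("perl-5.22.0", "perl-5.20.3") };

my @first  = cached_lines($cache_file, 3600, $producer);    # slow path
my @second = cached_lines($cache_file, 3600, $producer);    # served from cache
print "producer ran $calls time(s)\n";                      # producer ran 1 time(s)
```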

Display the list of installed perls

It can also do “char-mode” or “fuzzy” matching for increased convenience. For example, type this:

% perlbrew switch 10<tab>

and it will complete to (assuming you have perl 5.10.1 installed):

% perlbrew switch 5.10.1

Source code

The source code for _perlbrew is about 300 lines and, I believe, was fairly easy to write.

lcpan 0.67: scripts, mentions, contents

lcpan is an application to manage your local CPAN mirror. It downloads a mini CPAN to your computer and creates a SQLite database from the information in the mirror, so you can query various things about your mirror.

Up until version 0.66, lcpan only indexed modules/02packages.details.txt.gz (for the list of modules/packages), authors/01mailrc.txt.gz (the list of authors), and the META files from each release tarball (dist names, dist abstracts, dependency information).

Now in 0.67, a lot more is being indexed: the size and mtime of release tarballs, the content (list of files) of each tarball, and scripts. lcpan also extracts the abstract of each module and script. It also parses POD to get mentions, which are references to modules/scripts.

So now, in addition to the dependency relationship, we also get the mention relationship. Want to know which modules/scripts/authors mention your modules in their POD? Or which modules and authors are the most popular in terms of being mentioned the most? Now you can.

A note about String::Flogger and logging in dzil

Dist::Zilla uses Log::Dispatchouli for logging, which in turn uses String::Flogger to format arguments into the final string that gets logged.

String::Flogger is a convenient formatter. Depending on what arguments it gets, it does different things. First, if passed a simple string then it is returned as-is, so you can just do this:

$self->log("A simple message");

If you pass several strings (arguments), each string (argument) will be flogged individually and joined with space, e.g.:

$self->log("A message", "another message"); # final string is: "A message another message"

Second, if you pass an array reference, it will be formatted using (currently) sprintf(), e.g.:

$self->log(["A %s message", "sprintf-formatted"]); # "A sprintf-formatted message"

Note that in some other logging frameworks, like Log::Any, there is a separate set of methods to explicitly request sprintf-style formatting, so no extra pair of brackets is needed, e.g.:

$log->debugf("A %s message", "sprintf-formatted");

Third, if passed a coderef, String::Flogger calls the code. This can be useful if you want to do more complex formatting and/or defer relatively expensive calculation, e.g.:

$self->log(sub { "Current system load: ".sysload() });

So far so good.

But wait, there’s an extra level of convenience and design choice. In the array reference (sprintf-style) variant, if the sprintf arguments are references, they will be formatted using some rules:

1) scalar references or reference references will be formatted using “ref(VALUE)”, so \1 will become:

ref(1)

2) the rest will be passed to JSON encoder (currently JSON::MaybeXS) and then enclosed using “{{” and “}}”, e.g.:

$self->log(["Let there be some data: %s", [undef, "a"]]); # 'Let there be some data: {{[null,"a"]}}'

The choice of JSON indicates (also confirmed here) that RJBS envisions the log also being processed by external tools outside the Perl ecosystem. That is a valid use case, but the drawback is that some Perl data structures cannot be represented accurately, which is something you sometimes want when debugging complex Perl data in your application.

Objects (blessed references) will be dumped as just “obj(ADDRESS)”.

Also, the extra convenience is that in the arrayref/sprintf-style variant, if the arguments are coderefs, they will also be called, e.g.:

$self->log(["Current system load: %s", sub { sysload() }]);

This is a potential gotcha if you, like me, are used to the other logging frameworks. If you want a behaviour more like Log::Any:

$log->debugf("Data that might be anything including coderef: %s", $data);

In Log::Dispatchouli you should dump the data first using something else, e.g.:

$self->log(["Data that might be anything including coderef: %s", Data::Dmp::dmp($data)]);

or perhaps:

$self->log(["Data that might be anything including coderef: %s", sub { require Data::Dmp; Data::Dmp::dmp($data) }]);
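
To make the rules above easier to see side by side, here is a small self-contained approximation. It is not String::Flogger’s actual code: flog and flog_arg are names I made up, it uses the core JSON::PP instead of JSON::MaybeXS, the exact obj(...) address format is a guess, and undefined values are not handled.

```perl
use strict;
use warnings;
use JSON::PP ();
use Scalar::Util qw(blessed refaddr);

my $json = JSON::PP->new->canonical->allow_nonref;

# Format one sprintf argument roughly as described in this post.
sub flog_arg {
    my ($arg) = @_;
    return $arg unless ref $arg;                         # defined plain values pass through
    return sprintf("obj(%#x)", refaddr($arg)) if blessed($arg);
    return flog_arg($arg->()) if ref($arg) eq 'CODE';    # coderefs are called
    return "ref(" . flog_arg($$arg) . ")"
        if ref($arg) eq 'SCALAR' || ref($arg) eq 'REF';
    return "{{" . $json->encode($arg) . "}}";            # everything else as JSON
}

# Format one log() argument: string, coderef, or [format, args...].
sub flog {
    my ($msg) = @_;
    return $msg unless ref $msg;
    return flog($msg->()) if ref($msg) eq 'CODE';
    if (ref($msg) eq 'ARRAY') {
        my ($fmt, @args) = @$msg;
        return sprintf $fmt, map { flog_arg($_) } @args;
    }
    die "Unsupported message type: " . ref($msg);
}

print flog("A simple message"), "\n";
print flog(["Let there be some data: %s", [undef, "a"]]), "\n";
# prints: Let there be some data: {{[null,"a"]}}
print flog(["A scalar ref: %s", \1]), "\n";
# prints: A scalar ref: ref(1)
print flog(["Called lazily: %s", sub { 6 * 7 }]), "\n";
# prints: Called lazily: 42
```

The last line is the gotcha in miniature: a coderef passed as data is executed rather than shown.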

Interacting with PAUSE using CLI

Any CPAN author has to interact with PAUSE, the website you go to to upload files if you want to publish your work on CPAN. There is no API provided, so you have to use a browser to upload files manually.

Well, not really. There are some modules you can use, like CPAN::Uploader to upload files or WWW::PAUSE::CleanUpHomeDir to delete old releases in your PAUSE home directory. And if you use Dist::Zilla, by default you will use CPAN::Uploader when you release your distribution, so you don’t have to go to PAUSE manually. These modules all work by scraping the website since, as noted above, there is no API.

WWW::PAUSE::Simple is another module you can use which: 1) provides more functions (aside from uploading, currently can also list/delete/undelete/reindex files, as well as list distributions and cleanup older releases, more functions will be added in the future); 2) comes with a handy CLI utility called pause (distributed in App::pause) to do everything via the command-line.

To use this utility, first install the CPAN module:

If you want the Perl API (module):

% cpanm -n WWW::PAUSE::Simple

If you want the CLI:

% cpanm -n App::pause

After that, configure it by creating a file ~/pause.conf (or ~/.config/pause.conf):

username=(your PAUSE ID)
password=(your password)

and you’re ready to go.

Uploading files. To upload some files:

% pause upload *.tar.gz

This is the subcommand I use most often. Even though I use Dist::Zilla daily, there are routinely times when I need to upload manually, for example when I develop on my laptop in places with no or flaky Internet connection. I still do the releases, but skip uploading to CPAN. Later when Internet connection is available again, I upload the tarballs using pause.

To see log/debug messages as each file is uploaded, give it a --debug option.

This utility is very similar to the cpan-upload script provided by CPAN::Uploader, except that by default pause continues to the next file when an upload fails instead of bailing out (cpan-upload can now behave like this too, via the new --ignore-errors option).

Listing files. Simple enough:

% pause ls
% pause ls -l

You can also give it some wildcard arguments to match the files. But since matching is done by the script to match files on the server, you have to quote the wildcards to prevent them from being expanded by the shell to match files on your local directory:

% pause ls -l 'App-*'
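
Matching wildcards on the client side against the list of remote file names can be done by turning the wildcard into a regex. The following is a minimal sketch of that idea; wildcard_to_regex is a hypothetical helper, not necessarily what pause does internally, and the file names are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a shell-style wildcard ("*" and "?" only)
# into an anchored regex; every other character is matched literally.
sub wildcard_to_regex {
    my ($pattern) = @_;
    my $re = join '', map {
        $_ eq '*' ? '.*' : $_ eq '?' ? '.' : quotemeta($_)
    } split //, $pattern;
    return qr/\A$re\z/;
}

# Made-up list of files as it might come back from the server.
my @remote = qw(
    App-pause-0.01.tar.gz
    App-pause-0.02.tar.gz
    WWW-PAUSE-Simple-0.03.tar.gz
);

my $re = wildcard_to_regex('App-*');
print "$_\n" for grep { $_ =~ $re } @remote;
# prints the two App-pause tarballs
```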

Since some files can also be scheduled for deletion if they have been recently deleted, you can filter whether you want to see these files or not via --nodel. To see only files scheduled for deletion, use --del.

Deleting files. Also using subcommand familiar to Unix fans:

% pause rm 'App-*' '*TRIAL*'
% pause rm '*'; # delete everything!

Again, you should quote your wildcard arguments to protect them from being accidentally expanded by the shell.

When deleting files on PAUSE, the files are not actually deleted immediately but instead put into a schedule of 72 hours. During this period, you can cancel your deletion instruction by using the undelete subcommand:

% pause undelete '*'; # bring back everything

Once the period expires, the files will actually be deleted and cannot be recovered anymore on PAUSE. (They are still available on BackPAN though, so you can upload them again to PAUSE if you want to.)

Reindexing files. Sometimes you have to reindex your files so that they appear in the indexes (02packages.details.txt.gz et al). For example, you might have lacked permission during the first upload (you had not yet been given co-maint status for a module). Or sometimes the PAUSE indexer chokes. This has happened to me once or twice, usually because I uploaded too many files at once, causing the SQLite database to get locked and the indexer to fail. To reindex:

% pause reindex 'App-*'
% pause reindex '*'; # reindex everything!

Cleaning older releases. I’ve used WWW::PAUSE::CleanUpHomeDir for a few years, but since I wrote pause last month, I thought, why not add this functionality to the tool too? So:

% pause cleanup

will delete older releases on your PAUSE home directory. Trial/dev releases are skipped. There is an option -n to let you specify how many old versions you want to keep for each distribution. The default is 1, meaning to only keep the latest version.
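
The selection logic can be sketched like this. It is only an illustration under my own assumptions (the files_to_clean name, the file name pattern, and the sample file names are all made up); the real cleanup subcommand may parse names and versions differently.

```perl
use strict;
use warnings;
use version;

# Hypothetical helper: given release file names, return those that would be
# deleted when keeping only the $keep newest versions of each distribution.
sub files_to_clean {
    my ($keep, @files) = @_;
    my %releases;    # dist name => [[version object, file name], ...]
    for my $file (@files) {
        # Only plain numeric versions match, so trial releases (-TRIAL) and
        # dev releases (underscore in the version) are skipped here.
        my ($dist, $ver) = $file =~ /\A(.+)-v?(\d+(?:\.\d+)*)\.tar\.gz\z/
            or next;
        push @{ $releases{$dist} }, [version->parse($ver), $file];
    }

    my @delete;
    for my $dist (sort keys %releases) {
        my @newest_first = sort { $b->[0] <=> $a->[0] } @{ $releases{$dist} };
        push @delete, map { $_->[1] } @newest_first[$keep .. $#newest_first];
    }
    return @delete;
}

my @files = qw(
    App-pause-0.01.tar.gz
    App-pause-0.02.tar.gz
    App-pause-0.03.tar.gz
    App-pause-0.04-TRIAL.tar.gz
    WWW-PAUSE-Simple-0.05.tar.gz
);
print "$_\n" for files_to_clean(1, @files);
# prints App-pause-0.02.tar.gz, then App-pause-0.01.tar.gz
```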

I now do this routine every week or so to keep my home dir clean.

Other things. I also plan to add subcommands for other PAUSE functionality, like changing the password, setting the forwarding email, and setting permissions. But since I do these very seldom, I haven’t bothered to add them yet.

Dry-run mode. Some subcommands, especially those which can modify your files like rm and reindex, have a --dry-run option to let you try them without actually performing the action. They will only show the files that would be affected. This is particularly useful if you use wildcards, or with the cleanup subcommand.

Tab completion. pause comes with tab completion feature under Unix shells, which you can activate via complete -C pause pause (or via shcompgen, see the manpage for more details).

I haven’t tested this under Windows, please drop me a message if things don’t work as advertised.

But what about the API? Aside from the CLI, WWW::PAUSE::Simple is the module you can use from Perl directly. For example:

use WWW::PAUSE::Simple qw(list_dists);
my @dists = list_dists(username=>"your PAUSE ID", password=>"your pass");

Couldn’t be simpler.

pericmd 004: What’s wrong with Getopt::Long (2)

In the previous post I mentioned some things I would like to change in Getopt::Long. These are rather opinionated, and I’m not sure they should be implemented. Here goes:

Some old cruft. Since this is a very old library, it has some features that are so seldom needed these days that they make good candidates for removal, to make the library smaller and easier to use. The first one is the ignore_case configuration. I always use no_ignore_case nowadays, and it’s common for programs to have options that differ only in case, like -v (for verbose) vs -V (for version). I think it would be less confusing if Getopt::Long were always case sensitive.
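
For illustration, here is a minimal sketch of two options that differ only in case, made to work by switching to no_ignore_case. The option names are just examples, and GetOptionsFromArray is used instead of GetOptions so the sample does not depend on the real @ARGV.

```perl
use 5.010;
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Make option names case sensitive so that -v and -V are distinct options.
Getopt::Long::Configure("no_ignore_case");

my ($verbose, $version);
my @args = ("-v", "-V");    # stand-in for @ARGV
GetOptionsFromArray(
    \@args,
    "verbose|v" => \$verbose,
    "version|V" => \$version,
) or die "Bad options\n";

say "verbose=", ($verbose // 0), " version=", ($version // 0);
# prints: verbose=1 version=1
```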

Another one is the getopt_compat configuration. Have you seen any program in circulation lately which uses the + prefix for options instead of --? The world has mostly standardized on the dashes.

Trapping errors in option handlers. Getopt::Long wraps each call to an option handler in an eval { }. I think there should be an option to disable this, so that an option handler can make the whole option parsing process exit early. Currently this is not possible, due to the error trapping.
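
A small sketch of the trapping behaviour: the handler for --stop dies, but parsing carries on, the message is emitted as a warning, and the call returns false. The option names are made up, and the warning is captured with a __WARN__ handler only to make the result visible.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Capture warnings so we can show what Getopt::Long did with the exception.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my @args  = ("--stop", "--after");    # stand-in for @ARGV
my $after = 0;
my $ok = GetOptionsFromArray(
    \@args,
    "stop"  => sub { die "stopping early\n" },    # trapped by Getopt::Long
    "after" => \$after,
);

print "ok=", ($ok ? 1 : 0), " after=$after\n";    # ok=0 after=1
print "trapped: $_" for @warnings;                # trapped: stopping early
```

Note that --after is still processed even though the earlier handler died, which is exactly the behaviour discussed above.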

Well, that’s about it. Much shorter than I expected 🙂

pericmd 002: Getopt::Long

Getopt::Long (cpanratings) is an established Perl module for parsing command-line options. A bit of history (via some googling): it has been included as part of the Perl distribution since perl-5.0.0 back in 1994. It was actually created as newgetopt.pl during the pre-module days of perl 4.x. The creator is Netherlands-based Johan Vromans (CPAN ID: JV), who also wrote the Perl Pocket Reference (one of the well-respected books, if I remember correctly). Amazingly, after 20+ years, Johan still maintains Getopt::Long to this very day. So, deep kudos for that.

Why the ::Long in the name, you might ask? Because before newgetopt.pl, there was a Perl 4 library called getopt.pl, which became Getopt::Std in Perl 5. It can only handle single-letter options, so we will not consider using it in this series.

Here’s a typical CLI program written using Getopt::Long, taken from my scripts repository:

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

use Getopt::Long;

my %opts = (parts => ["/backup"], all=>0, percent_used=>95);

GetOptions(
    "part=s"             => $opts{parts},
    "all|a"              => \$opts{all},
    "percent-used|pct=i" => \$opts{percent_used},
    "help|h|?"           => \$opts{help},
);

if ($opts{help}) {
    print <<_;
Usage: $0 [opts]
Options:
  --part=S  Add partition S to warn, eg. --part=/home --part=/
  --all     Warn all partitions (except tmpfs)
  --percent-used=N  Warn if percentage of usage exceeds N (default: 95)

_
    exit 0;
}

my $i;
for (`df -h`) {
    next unless $i++;
    chomp;
    my ($fs, $blocks, $used, $avail, $pctuse, $mp) = split /\s+/, $_;
    $pctuse =~ s/%//;
    #say "DEBUG: mp=$mp, pctuse=$pctuse";

    next if $fs =~ /^(tmpfs)$/;
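    # grep instead of smartmatch (~~), which is experimental/deprecated in newer perls
    if (!$opts{all}) { next unless grep { $_ eq $mp } @{ $opts{parts} } }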
    next unless $pctuse >= $opts{percent_used};
    say "Disk usage in $mp has exceeded $pctuse% (only $avail left)!";
}

As you can see, using the module is pretty straightforward: you specify a hash of option specifications and pass it to the GetOptions() function, which looks for the options in @ARGV. The function modifies @ARGV, stripping all the options it recognizes and leaving only the remaining arguments. (By default an unknown option is reported as an error; it is left in @ARGV only if you enable the pass_through configuration.)

The only thing you’ll need to familiarize yourself with is basically the options specification. Short of a formal grammar, you might want to glance at the syntax of the options spec via the regex specified in Getopt::Long::Util, copy-pasted here for convenience:
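
To see the stripping in action without touching the real @ARGV, here is a minimal sketch using GetOptionsFromArray with the same option specs as the script above. The argument list is made up for the demonstration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my %opts = (parts => [], all => 0, percent_used => 95);

# Stand-in for @ARGV: options mixed with non-option arguments.
my @args = ("--all", "--part=/home", "extra1", "--percent-used", "90", "extra2");

GetOptionsFromArray(
    \@args,
    "part=s"             => $opts{parts},
    "all|a"              => \$opts{all},
    "percent-used|pct=i" => \$opts{percent_used},
) or die "Bad options\n";

print "parts=@{ $opts{parts} } all=$opts{all} pct=$opts{percent_used}\n";
# prints: parts=/home all=1 pct=90
print "left over: @args\n";
# prints: left over: extra1 extra2
```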

    $optspec =~ qr/\A
               (?:--)?
               (?P<name>[A-Za-z0-9_][A-Za-z0-9_-]*)
               (?P<aliases> (?: \| (?:[^:|!+=:-][^:|!+=:]*) )*)?
               (?:
                   (?P<is_neg>!) |
                   (?P<is_inc>\+) |
                   (?:
                       =
                       (?P<type>[siof])
                       (?P<desttype>|[%@])?
                       (?:
                           \{
                           (?: (?P<min_vals>\d+), )?
                           (?P<max_vals>\d+)
                           \}
                       )?
                   ) |
                   (?:
                       :
                       (?P<opttype>[siof])
                       (?P<desttype>|[%@])
                   ) |
                   (?:
                       :
                       (?P<optnum>\d+)
                       (?P<desttype>|[%@])
                   ) |
                   (?:
                       :
                       (?P<optplus>\+)
                       (?P<desttype>|[%@])
                   )
               )?
               \z/x
                   or return undef;

So, an option is more or less a “word” which can be followed by zero or more aliases. An alias is usually a single letter (like “a” for “all”) or a shorter version of the option (like “pct” for “percent-used”), but can also be some non-alphanumeric character (like “?” as an alias for “help”).

If an option requires a value, you can add “=” followed by one of these letters: s (for string), i (for int), f (for float), o (nowadays seldom used?). Getopt::Long actually allows you to specify an optional value, e.g. by specifying “debug:i” instead of “debug=i”, so the user can specify any of these: --debug, --debug=9 (or --debug 9).

Getopt::Long even allows you to specify multiple values, e.g. “rgb-color=i{3}”, so the user can specify options like: --rgb-color 255 255 0. However, you can’t do something like this (specifying a different type for each value): “person-age=s,i”. I seldom use this feature, though.

There are other features of Getopt::Long, some very useful, some rarely used nowadays. We will cover these in upcoming posts because Perinci::CmdLine uses Getopt::Long for command-line options parsing.
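
Both features can be tried out with a short sketch. The argument lists are made up, and GetOptionsFromArray is used so that each case can be fed its own arguments.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Optional integer value: "debug:i" accepts --debug with or without a value.
my ($debug_bare, $debug_nine);
GetOptionsFromArray(["--debug"],   "debug:i" => \$debug_bare) or die "Bad options\n";
GetOptionsFromArray(["--debug=9"], "debug:i" => \$debug_nine) or die "Bad options\n";

# Exactly three integer values collected into an array.
my @rgb;
GetOptionsFromArray(
    ["--rgb-color", "255", "255", "0"],
    "rgb-color=i{3}" => \@rgb,
) or die "Bad options\n";

print "debug_bare=$debug_bare debug_nine=$debug_nine rgb=@rgb\n";
# prints: debug_bare=0 debug_nine=9 rgb=255 255 0
```

When the optional value is omitted for an integer option, the destination gets 0, so the program cannot tell “--debug” apart from “--debug=0” this way.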

Compared to other (newer) options processing modules on CPAN that reinvent the wheel, chances are Getopt::Long is still your best bet. It’s battle tested, comes with Perl 5 out of the box, and already does most of the common needs for options processing. Most of the other options processing modules don’t even do the two things that we usually take for granted when using common Unix/GNU utilities: short option bundling and auto abbreviation.

Short option bundling is a feature that lets you bundle two or more single-letter options in a single command-line argument. For example, when using rsync I usually write: rsync -avz … instead of rsync -a -v -z … The short options must not take values, except for the last one.

Auto abbreviation is a feature that lets you type only part of an option name (its first letters), as long as it is unambiguous. For example, if you have an option called --process-directories, you can just specify --process-dir or even --process, as long as it does not clash with the other existing options. This is convenient.
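
Here is a minimal sketch showing both features together. The option names are invented for the example (loosely modelled on rsync), bundling has to be switched on explicitly, and auto abbreviation is on by default.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Bundling is off by default; auto abbreviation is on by default.
Getopt::Long::Configure("bundling");

my %opts;
my @args = ("-avz", "--process-dir");    # stand-in for @ARGV
GetOptionsFromArray(
    \@args, \%opts,
    "archive|a",
    "verbose|v",
    "compress|z",
    "process-directories",
) or die "Bad options\n";

print "$_=$opts{$_}\n" for sort keys %opts;
# prints:
# archive=1
# compress=1
# process-directories=1
# verbose=1
```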

In the next post I’ll write about what’s wrong with Getopt::Long.