Getopt modules 04: Getopt::Compact

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such modules are usually under the Getopt::* namespace). First article is here.

Getopt::Compact is a module that was first released in 2004 by Andrew Stewart Williams (ASW) and last updated in 2006. It manages to have 7 CPAN distributions depending on it.

Like Getopt::Long::Descriptive, this module is a wrapper for Getopt::Long, mainly to let users specify a summary string for each option, which is what Getopt::Long lacks for producing a useful usage/help message. Getopt::Compact also presents a different interface that claims to be more compact if you have a lot of (flag) options. For example, this code using Getopt::Long:

GetOptions(
    "--flag1" => \$opts{flag1},
    "--flag2" => \$opts{flag2},
    "--flag3" => \$opts{flag3},
    "--flag4" => \$opts{flag4},
    "--val1|1=s"  => \$opts{val1},
    "--val2|2=i"  => \$opts{val2},
    "--val3=s@"   => \$opts{val3},
);

will become like this when using Getopt::Compact:

my $opts = Getopt::Compact->new(
    modes  => [qw/flag1 flag2 flag3 flag4/],
    struct => [
        [["val1", "1"], "Value1 blah blah", "=s"],
        [["val2", "2"], "Value2 blah blah", "=i"],
        ["val3"       , "Value3 blah blah", "=s@"],
    ],
)->opts;

But if you always put option values into hash elements (instead of sometimes assigning an option handler), GetOptions provides an alternative interface in which you specify a hashref as the first argument, followed by a plain list of option specifications. This makes for a more compact syntax:

GetOptions(
    \%opts,
    qw/--flag1 --flag2 --flag3 --flag4
       --val1|1=s --val2|2=i --val3=s@/,
);

So basically what Getopt::Compact makes you do is specify each option in split parts: --name|a=s@ in Getopt::Long becomes:

[["name","a"], "summary", "=s@"]

I recommend using Getopt::Long::Descriptive (GLD) instead of this because: 1) the interface is slightly nicer (no split option specification, so it is more familiar to Getopt::Long users); 2) GLD allows specifying a default value for options; 3) GLD allows expressing that an option is required.
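To make the hashref form of GetOptions mentioned above concrete, here is a minimal sketch using only core Getopt::Long. The command line in @args is made up for illustration, and GetOptionsFromArray is used instead of GetOptions only so that the example does not depend on the real @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical command line, used instead of @ARGV so the sketch is repeatable.
my @args = qw(--flag1 --flag3 --val1 foo --val2 42 --val3 a --val3 b);

# With a hashref as the first destination, every option lands in %opts
# under its primary name.
my %opts;
GetOptionsFromArray(
    \@args,
    \%opts,
    qw/flag1 flag2 flag3 flag4 val1|1=s val2|2=i val3=s@/,
) or die "Error in command-line arguments\n";

# Flags that were given are set to 1; flags that were not given are absent.
printf "flag1=%s flag2=%s\n", $opts{flag1} // 'unset', $opts{flag2} // 'unset';

# A "=s@" option is collected into an array reference.
printf "val1=%s val2=%d val3=%s\n",
    $opts{val1}, $opts{val2}, join(",", @{ $opts{val3} });
```

What you lose compared to Getopt::Compact or GLD is the per-option summary string, so this form is compact but still cannot generate a usage message on its own.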

Getopt modules 03: Getopt::Long::Descriptive

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such modules are usually under the Getopt::* namespace). First article is here.

Note that from this day on, all the reviewed modules are non-core since only Getopt::Std and Getopt::Long are core modules. The choice of using these modules must take into account this factor, as your user must bear an additional cost of installing the module from CPAN (unless your application bundles the module).

Some of these modules are wrappers for Getopt::Long, either because the author wants to offer a different interface and/or add some missing features. Some of the modules are higher-level: they are more than mere option parsing modules, usually a CLI framework.

Among the missing features often added is the ability to generate usage message (and the other common one is the ability to parse commands/subcommands). When using Getopt::Long, one already specifies a list of options. But there is no way to add a summary string for each option, making it impossible to create a useful/nice usage message. The modules solve this problem either by allowing user to specify the per-option summary string, or using/parsing user-supplied usage text/POD.

Getopt::Long::Descriptive is one such module: it allows you to specify per-option summary string, as well as default value for an option and whether an option is required. Judging from the number of reverse dependencies, Getopt::Long::Descriptive is the fourth most popular option parsing module on CPAN with 64 reverse dependencies (after Getopt::Long with 1127, Getopt::Std with 167, and MooseX::Getopt with 134). I also have actually reviewed Getopt::Long::Descriptive in one of my Perinci::CmdLine tutorial posts.

Aside from my minor nitpick as described in the linked post, there are two additional notes: Getopt::Long::Descriptive depends on another non-core module, Sub::Exporter, and its startup overhead is roughly two and a half times that of Getopt::Long:

| participant               | time (ms) | mod_overhead_time (ms) |
|---------------------------+-----------+------------------------|
| Getopt::Long::Descriptive |        36 |                   33.9 |
| Getopt::Long              |        15 |                   12.9 |
| Getopt::Std               |       3.8 |                    1.7 |
| perl -e1 (baseline)       |       2.1 |                      0 |

Not that this should be a concern to most. If you use Getopt::Long, Getopt::Long::Descriptive is pretty recommended.

Tab completion. If you have a Getopt::Long::Descriptive-based CLI script, your users can now also use shcompgen to get tab completion, because shcompgen now supports detecting Getopt::Long::Descriptive-based scripts and activating tab completion for such scripts. In shells like fish and zsh, the description for each option will even be shown.

Getopt modules 02: Getopt::Std

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such modules are usually under the Getopt::* namespace). First article is here.

Getopt::Std is the other core module that comes with Perl when it comes to parsing command-line options. The problem is, it only supports one-letter options. Since Getopt::Long also supports one-letter options, there is really little or no reason for you to use Getopt::Std.

But Getopt::Std is still the second-most popular CPAN module when it comes to parsing command-line options, if you look at the number of reverse dependencies it has (167, after Getopt::Long which has 1127). Maybe this is because some people prefer using only short options. Also note that since these two modules are core, some distributions do not specify them as dependencies. Which means that the number of reverse dependencies for these two are actually higher.

One advantage of using Getopt::Std is its dead-simple API. You just call getopts() and user-specified options will be collected in $opt_* variables. Perfect if you have a short script that only accepts a few one-letter options.
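As a rough illustration of how small a Getopt::Std script can be, here is a sketch with a made-up command line. It passes a hashref as the second argument to getopts(), which stores the options in a hash instead of the $opt_* globals; the option letters and file names are only examples.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical command line; a real script would just use @ARGV as given.
@ARGV = qw(-v -o out.txt input.txt);

# "v" is a flag, "o:" takes a value.
my %opts;
getopts('vo:', \%opts) or die "Usage: $0 [-v] [-o FILE] ARGS...\n";

print "verbose on\n"         if $opts{v};
print "output: $opts{o}\n"   if defined $opts{o};
print "remaining: @ARGV\n";   # getopts() removes the parsed options from @ARGV
```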

Tab completion. If you have a Getopt::Std-based CLI script, your users can use shcompgen to get tab completion, because shcompgen now supports detecting Getopt::Std-based scripts and activating tab completion for such scripts. It only supports completing option names though. To be more useful (completing option values and arguments) you will need to use one of the other Getopt modules, e.g. Getopt::Long::Complete.

Getopt modules 01: Getopt::Long

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such modules are usually under the Getopt::* namespace).

Today, our module is Getopt::Long. I actually have reviewed it in a couple of my Perinci::CmdLine tutorial posts: Getopt::Long and What’s wrong with Getopt::Long. To recap: Getopt::Long is a core module that should be your go-to module for parsing command-line options.

Two things to remember. First, you should start your code with something like this:

Getopt::Long::Configure("bundling", "no_ignore_case", "permute", "no_getopt_compat");

bundling is to enable you to say -abc instead of -a -b -c if you happen to have these short options. Most Unix programs allow this. no_ignore_case is so that Getopt::Long differentiates -v and -V. Most Unix programs also behave like this, they are not case-insensitive when it comes to command-line options. Since there are only so many Latin letters, very often the lowercase letter and uppercase letter are used for different purposes.

permute is to allow you to say --option val --flag arg1 arg2 --another-flag. That is, you can intersperse command-line options and arguments. For convenience, many Unix programs behave like this. For example, you can say ls -l A* or ls A* -l. Under no_permute mode (which Getopt::Long selects when the POSIXLY_CORRECT environment variable is set), Getopt::Long requires a user to specify all options first before any argument, which is rather inconvenient. Setting permute explicitly makes your script behave the same regardless of the environment.

no_getopt_compat is to disable interpreting +foo the same as --foo. Most programs nowadays do not interpret + as the start of command-line options anymore. Enabling getopt_compat (the default) only serves to interfere, for example if you have a filename like +foo then you’ll have to write it as ./+foo to avoid it being parsed as a command-line option.

These modes should be the default, right? Just like use strict and use warnings should be the default in Perl. But for the sake of backward compatibility, they aren’t.
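The effect of the four settings can be seen in a small sketch. The option names (a, v, V) and the command line are invented for the demonstration, and GetOptionsFromArray is used so that the example does not touch the real @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

Getopt::Long::Configure("bundling", "no_ignore_case", "permute", "no_getopt_compat");

# Hypothetical command line: bundled short options, a file called "+foo",
# and an option placed between the arguments.
my @args = qw(-av +foo file1 -V file2);

my %opts;
GetOptionsFromArray(\@args, \%opts, 'a', 'v', 'V')
    or die "Error in command-line arguments\n";

# bundling:         -av is parsed as -a -v
# no_ignore_case:   -v and -V are two different options
# permute:          -V is still recognized after file1
# no_getopt_compat: +foo is left alone as an ordinary argument
print "options: ",   join(",", sort keys %opts), "\n";   # options: V,a,v
print "arguments: @args\n";                              # arguments: +foo file1 file2
```

Note that Getopt::Long::Configure changes process-wide settings, so it is usually called once near the top of the script, before any option parsing happens.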

Tab completion. Another thing I want to add is: if you have a Getopt::Long-based CLI script, aside from modifying your script to use Getopt::Long::Complete instead, your users can now also use shcompgen to get tab completion, because shcompgen now supports detecting Getopt::Long-based scripts and activating tab completion for such scripts.

What’s next for bash, completion-wise?

Trying out the completion feature in several other shells which I don’t use daily (including zsh, tcsh, and fish), I can’t help but compare them with bash.

IMHO, the last major completion feature in bash arrived in 2009-2010, when bash 4.1 introduced the -D option for the “complete” command. This enables a fallback/catch-all mechanism like the one already found in other shells such as fish and zsh. When a user requests completion for a command that does not yet have a completion definition, the hook function specified in “complete -D” can execute and find a completion definition somewhere. And the completion can be activated right there and then instead of having to wait for the next command (or after the user logs out and logs in again). A major convenience, as completion can be activated or deactivated instantly.

The subsequent bash versions haven’t introduced anything ground-breaking in terms of completion: 4.2 allows us to configure the number of columns used when displaying completions (nice, but not additional core functionality) and adds the completion-map-case setting to treat underscore and dash as the same (really convenient, but we can do that ourselves if we want using a function or external command backend). 4.3 introduced “-o noquote” and 4.4 introduced “-o nosort”, which are just minor.

Completion description. As many bash users who have tasted fish and zsh would agree, I think bash really needs to add the feature of showing description/help text next to each completion answer. This is a major boost for CLI usability. For example, user can see or be reminded of what each command option does instead of having to “man” or open a browser to Google for it.

Menu select. The other popular feature is “menu select” like in zsh (not to be confused with the already existing “menu-complete” function in bash), where after the user presses Tab and is presented with the list of completions, she can use arrow keys to select the completion she wants instead of typing. This is nice but of lesser impact compared to the previous item. A seasoned CLI user would prefer typing, and can complete faster that way anyway. What I think would be really nifty is incremental matching, where the list of completions is reduced or expanded as the user types. So for example you type “deluser t”, press Tab, and get presented with a list of 30 usernames starting with “t”. You can now type more letters to match fewer of those names until you get the one you want. The list displayed interactively shrinks or re-expands to show only the matching items. The exact detail of how this would work can be tuned to be as comfortable and powerful as possible. What I described just now is actually just a UI (TUI?) improvement of the functionality already present, as when we use tab completion we often do just that, albeit without the interactive list being displayed automatically (we still need to press Tab whenever we want to get the list of completions).

Colors. Fish utilizes colors a lot, to good effect. For example, if you type “ls -” in fish you’ll get a much nicer output compared to bash. This lets you scan the list faster. It would be nice if bash could use more colors in the list of completions.

Adding support for fish, zsh, tcsh in shcompgen

I’ve recently added support for the other three shells (fish, zsh, tcsh) in shcompgen. shcompgen is basically a utility to write those shell commands “complete -C foo foo” or “complete -c foo -l longopt1 --description 'Add a thing to foo'” for you. It recognizes scripts written using Getopt::Long::Complete, Perinci::CmdLine, and a few others so that you can enable shell tab completion for your scripts.

fish. Enabling tab completion for a command in fish is relatively simple. For each short/long option of a command, you can define a separate “complete” command, e.g.:

complete -c man -s k --description "Show apropos information"
complete -rc man -s C --description "Configuration file"
complete -xc man -a 1 --description "Program section"
complete -xc man -a 2 --description "Syscall section"
complete -xc man -a 3 --description "Library section"
...

Doing this has the advantage that fish knows about each program option and its description, so you get a prettier, more informative completion. Fish does not use bash’s “complete -F somefunc cmd” (delegate completion to a function) or “complete -C somecmd cmd” (delegate completion to an external command) syntax, but it is still possible to delegate to the program entirely, a la bash’s “complete -C”, by putting a command substitution in the -a argument:

complete -c somecmd -a '(begin; set -lx COMP_SHELL fish; set -lx COMP_LINE (commandline); set -lx COMP_POINT (commandline -C); shcompgen; end)'

zsh. Completion in zsh is featureful but complicated, with lots and lots of options. You can, in theory, use the “complete” or “compgen” commands like in bash because zsh has “bashcompinit”, which (partially) simulates those two bash commands. This enables you to reuse your bash completion definitions in zsh. I tried to do that but didn’t succeed, though:

#compdef pmman
autoload bashcompinit
bashcompinit
# this is bash-style
complete -C pmman pmman

The commands I type will sometimes complete, but at other times won’t. So I use “compadd” instead, which is the standard way to add completion results in zsh. For example:

#compdef pmman
_pmman() {
 si=$IFS
 compadd -- $(COMP_SHELL=zsh COMP_LINE=$BUFFER COMP_POINT=$CURSOR pmman)
 IFS=$si
}
_pmman "$@"

tcsh. tcsh lacks a fallback or autoload mechanism (like “complete -D” in bash or similar mechanism in fish and zsh), so activating or deactivating completion for a command requires you to explicitly re-source a definition script or logout + login again.

Tab completion now works in zsh, fish, and tcsh, but since I don’t use those shells daily and am not familiar enough with them, there are still known issues (documented in shcompgen’s POD), such as the escaping of special characters like whitespace. I hope that Perl programmers who use one of those shells can give input on how to resolve the issues.

Adding tab completion for perlbrew

perlbrew is a command-line utility I’ve been using quite a bit recently: while developing the Bencher feature of benchmarking against multiple perls, for trying out cperl, or just for updating to the latest perl release. So I thought it would be nice to add a tab completion feature to perlbrew.

The obvious choice (for many people anyway) to write tab completion feature in is bash, but I’m more comfortable with Perl. And besides, there are a few nice completion features in Complete::Util I’d like to use.

The result is App::ShellCompleter::perlbrew. You install it by first installing App::shcompgen from CPAN and then:

% shcompgen init

then install App::ShellCompleter::perlbrew from CPAN.

Some of the things that the completion can do:

Complete subcommands, option names, option values, arguments

For example:

% perlbrew un<tab>

will complete to:

% perlbrew uninstall _

The completion features “word-mode” matching, so you can also do something like this:

% perlbrew i-cp<tab>

and it will complete to:

% perlbrew install-cpanm _
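To give an idea of what “word-mode” matching means, here is a simplified sketch in plain Perl. It is not the code from Complete::Util, only an illustration under the assumption that words are separated by dashes and that each typed part must be a prefix of the candidate part at the same position. The word_mode_match function and the subcommand list are made up for this example.

```perl
use strict;
use warnings;

# Split the typed word and each candidate on "-", then require every typed
# part to be a prefix of the candidate part at the same position.
sub word_mode_match {
    my ($typed, @candidates) = @_;
    my @typed_parts = split /-/, $typed;
    my @matches;
    CANDIDATE: for my $cand (@candidates) {
        my @cand_parts = split /-/, $cand;
        next if @typed_parts > @cand_parts;
        for my $i (0 .. $#typed_parts) {
            next CANDIDATE
                unless index($cand_parts[$i], $typed_parts[$i]) == 0;
        }
        push @matches, $cand;
    }
    return @matches;
}

# A sample of perlbrew subcommands, hard-coded for the demonstration.
my @subcommands = qw(install install-cpanm install-patchperl info uninstall);

print join(" ", word_mode_match("i-cp", @subcommands)), "\n";   # install-cpanm
```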

Display the list of available perls to install

% perlbrew install <tab>

The first time you do this, it will take several seconds because the completion script will fetch the list of available perls from “perlbrew available”. After that it should be instantaneous because the completion script caches the result in a temporary file.
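The caching idea can be sketched roughly as follows. This is not the actual code of the completion script: the cached_lines function, the cache file name, and the one-hour lifetime are assumptions for illustration, and a fixed list stands in for the output of “perlbrew available” so the sketch runs without perlbrew installed.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Return the cached lines if the cache file is fresh enough; otherwise call
# $producer (a coderef returning a list of lines), store the result, return it.
sub cached_lines {
    my ($cache_file, $max_age_secs, $producer) = @_;

    if (-e $cache_file && time() - (stat $cache_file)[9] < $max_age_secs) {
        open my $in, '<', $cache_file or die "Can't read $cache_file: $!";
        chomp(my @lines = <$in>);
        return @lines;
    }

    my @lines = $producer->();
    open my $out, '>', $cache_file or die "Can't write $cache_file: $!";
    print {$out} "$_\n" for @lines;
    close $out or die "Can't close $cache_file: $!";
    return @lines;
}

# Hypothetical cache location; a throwaway directory keeps the demo clean.
my $dir        = tempdir(CLEANUP => 1);
my $cache_file = File::Spec->catfile($dir, 'available-perls.cache');

my $calls = 0;
my $fetch = sub {
    $calls++;    # a real completer would run the slow command here
    return qw(perl-5.22.1 perl-5.20.3 perl-5.18.4);
};

my @first  = cached_lines($cache_file, 3600, $fetch);   # slow path, fills cache
my @second = cached_lines($cache_file, 3600, $fetch);   # served from the file
print "fetched $calls time(s): @second\n";
```

The second call never reaches the producer, which is what makes subsequent Tab presses feel instantaneous.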

Display the list of installed perls

It can also do “char-mode” or “fuzzy” matching for increased convenience. For example, type this:

% perlbrew switch 10<tab>

and it will complete to (assuming you have perl 5.10.1 installed):

% perlbrew switch 5.10.1

Source code

The source code for _perlbrew is about 300 lines and I believe is fairly easy to write.