Getopt modules 22: Getopt::Kingpin

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such modules are usually under the Getopt::* namespace). First article is here.

Getopt::Kingpin is a port of Go's kingpin library, written by Masaaki Takasago (TAKASAGO) in 2016. It offers the features that are standard nowadays: short and long options with short option bundling, automatic help/usage message generation, required options, default values, and subcommands. Two extra features are the ability to set an option via an environment variable of a given name, and built-in completion (a feature of the original library that doesn't seem to be implemented yet in the Perl port). The Go library also allows templating of the help message, which Getopt::Kingpin does not yet support.
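To make the environment-variable feature concrete: it amounts to a precedence chain of command-line value, then environment variable, then built-in default. The sketch below emulates that chain with plain Getopt::Long rather than with Getopt::Kingpin's own API; the function name resolve_port, the option --port, and the variable MYAPP_PORT are all made up for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: resolve --port from argv, then the environment,
# then a hard-coded default. $env is passed in so the sketch is easy to test.
sub resolve_port {
    my ($argv, $env) = @_;
    my $port = $env->{MYAPP_PORT} // 8080;    # env var replaces the default
    GetOptionsFromArray($argv, 'port|p=i' => \$port)
        or die "Error in command-line arguments\n";
    return $port;                             # a command-line value wins over both
}

print resolve_port(['--port', '1234'], { MYAPP_PORT => 9000 }), "\n";
```

In a real script you would pass \@ARGV and \%ENV instead of the literal values used here.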

Like Smart::Options (reviewed a couple of days ago), kingpin uses the so-called "fluent style" interface, a.k.a. chained methods, which I find annoying to type in Perl because the method call operator is -> instead of a single dot. Fortunately, the chained-method interface here is slightly less annoying than in Smart::Options.

After looking at the three ports of option parsing libraries (the two mentioned above plus Getopt::ArgParse, reviewed yesterday), it does seem that subcommand support is becoming standard. This makes me wonder whether Getopt::Long should also add such a feature, or whether we should promote some other option parsing library as the "best practice" for subcommands. So far, I'm not seeing any single best candidate for "Getopt::Long + subcommand support".
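For reference, this is roughly what hand-rolled subcommand support on top of Getopt::Long looks like today. It is only a sketch under assumptions: the subcommand table, the dispatch function, and the option names are hypothetical, and it uses Getopt::Long::Parser with require_order so that global options stop at the subcommand name.

```perl
use strict;
use warnings;
use Getopt::Long ();

# Hypothetical subcommand table: name => option spec plus a handler.
my %subcommands = (
    add => {
        spec => ['force|f'],
        run  => sub {
            my ($global, $opts, $args) = @_;
            return sprintf "add: verbose=%d force=%d files=%s",
                $global->{verbose} // 0, $opts->{force} // 0, join(",", @$args);
        },
    },
);

sub dispatch {
    my @argv = @_;

    # Pass 1: global options only, stopping at the first non-option.
    my %global;
    my $global_parser = Getopt::Long::Parser->new(config => ['require_order']);
    $global_parser->getoptionsfromarray(\@argv, \%global, 'verbose|v')
        or return "usage error";

    # The first remaining argument is the subcommand name.
    my $name = shift @argv;
    my $cmd  = defined $name ? $subcommands{$name} : undef;
    return "unknown subcommand" unless $cmd;

    # Pass 2: the subcommand's own options.
    my %opts;
    my $sub_parser = Getopt::Long::Parser->new;
    $sub_parser->getoptionsfromarray(\@argv, \%opts, @{ $cmd->{spec} })
        or return "usage error";

    return $cmd->{run}->(\%global, \%opts, \@argv);
}

print dispatch('-v', 'add', '--force', 'file1'), "\n";
```

It works, but help generation, nesting, and completion all still have to be written by hand, which is the gap discussed above.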

Getopt modules 21: Getopt::ArgParse

In Perl, the core modules Getopt::Std and Getopt::Long have stood the test of time and remain the most popular ways to parse command-line options in CLI scripts. In Python, by contrast, the recommended standard module has changed several times.

First there is getopt, a "C-style parser for command line options". To use getopt, you pass a string containing the list of short options a la Getopt::Std, e.g. "ho:v" (meaning -o takes an argument while -h and -v are flag switches), and also an array containing long options, e.g. ["help", "output="] (meaning --output takes an argument while --help does not). But instead of supplying references to variables to set, or coderefs (remember, specifying an anonymous function is inconvenient in Python) as in Getopt::Long, with getopt programmers are asked to write a loop with a manual if-then-else chain (see the linked documentation for an example). The interface is also quite similar to the GetoptLong class in Ruby.
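For contrast, here is the Getopt::Long style referred to above, where each option is bound to a variable reference or a coderef and no manual loop is needed. This is a minimal sketch using the same -h, -o, and -v options; the @args array stands in for @ARGV and the variable names are arbitrary.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my @args = ('-o', 'out.txt', '-v', '--help');   # stand-in for @ARGV
my ($output, $verbose, $help_seen) = (undef, 0, 0);

GetOptionsFromArray(
    \@args,
    'output|o=s' => \$output,                # value stored via a scalar reference
    'verbose|v'  => \$verbose,               # flag stored via a scalar reference
    'help|h'     => sub { $help_seen = 1 },  # coderef called when the option is seen
) or die "Error in command-line arguments\n";

print "output=$output verbose=$verbose help_seen=$help_seen\n";
```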

No doubt, this style of programming feels manual and tedious. Thus came optparse, which is more OO and supposedly more Pythonic. Instead of passing a whole list of options at once, you now add one option (object) at a time using the add_option method, along with more information for each option: usage/help message, type, whether the option is required, number of arguments expected, default value, and perhaps some callback. optparse's capability is equivalent to Getopt::Long or Getopt::Long::Descriptive, except that optparse makes some design choices; for example, it is decidedly Unix-oriented, allowing only - or -- as the option prefix (while Getopt::Long allows you to configure this). The documentation is quite probably the nicest aspect of this module: it does not assume much knowledge (like familiarity with Unix or CLI) from the readers and explains at length what an option is and how one should design a CLI program with regard to accepting options. I realized that "required option" is indeed an oxymoron from reading it!

But, as with Getopt::Long, optparse does not have the concept of subcommands. Thus arrived argparse. It is basically like optparse in appearance, except that it has some extra features like the ability to specify positional arguments (in Getopt::Long, this is handled by the <> option specification) and support for nested subcommands through subparsers. Interestingly, argparse supports reading arguments from a file just like Getopt::ArgvFile, and this is the only form of "config file" it supports.
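As a quick illustration of the <> specification mentioned above, the following sketch collects positional arguments through a callback while options are parsed normally. The option names and file names are invented, and it assumes Getopt::Long's default permute mode (i.e. POSIXLY_CORRECT is not set).

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my @args = ('--verbose', 'input1.txt', '--level', '3', 'input2.txt');
my ($verbose, $level, @files) = (0, 0);

GetOptionsFromArray(
    \@args,
    'verbose' => \$verbose,
    'level=i' => \$level,
    # '<>' receives every non-option argument, in order of appearance.
    '<>'      => sub { push @files, "$_[0]" },
) or die "Error in command-line arguments\n";

print "verbose=$verbose level=$level files=@files\n";
```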

As things stand, argparse is part of the standard library (a.k.a. core modules, in Perl parlance) while optparse is deprecated and might be removed. However, getopt remains.

There is a Perl port of argparse on CPAN called Getopt::ArgParse, created by Mytram (MYTRAM) in 2013 and last updated in 2015. It is not feature-for-feature equivalent to its Python original, because of language differences and because argparse still accumulates features over time. You get some basic features like an automatic help/usage message, default values, required options, setting the number of expected arguments, as well as subparsers for subcommand support (although not yet nested in Getopt::ArgParse). The type/validation feature is weak or almost nonexistent; perhaps a custom validation routine should be allowed, or more can be explored here.

What's rather disappointing about this port is its use of Getopt::Long (I was expecting a full port that does its own option parsing) and Moo, which adds significant dependencies.

The documentation mentions configuration files, but there is actually no explicit support for them, not even an "option file prefix" a la argparse or Getopt::ArgvFile.

All in all, I'm not seeing something to make me prefer this module. If you do not use subcommands, I recommend sticking with Getopt::Long or Getopt::Long::Descriptive. If you do use subcommands, perhaps also consider a CLI framework like App::Cmd, or Getopt::Long::Subcommand.

Getopt modules 20: Smart::Options

In the next few days, I'll be reviewing Perl ports of some popular option parsing modules from other languages. Today: Smart::Options.

Summary

Smart::Options is written by KAN Fushihara (MIKIHOSHI) and is a Perl port*) of node's optimist package, which in turn uses minimist as the option parsing engine and adds a few things, mainly the ability to generate a usage/help message. Ironically, optimist is now deprecated in favor of yargs, which is roughly the same as optimist but does its own parsing and adds features like bash completion.

So minimist is roughly the equivalent of Getopt::Long, optimist is the equivalent of Getopt::Long::Descriptive, yargs is roughly the equivalent of Getopt::Long::Descriptive + tab completion (like Getopt::Long::Complete or Getopt::Long::More).

You can get an idea of the sheer number of packages in npm, the CPAN equivalent in the node ecosystem, by looking at these numbers: compared to Getopt::Long's 1127 dependents, minimist has 5768 dependents. And it isn't even the most popular option parsing package. The most popular one on npm is currently commander (from the legendary TJ Holowaychuk) which has 12252 dependents! yargs has 4073, optimist has 3546 (remember that optimist has been declared as deprecated), and nomnom (another deprecated option parsing package) still has 510.

Currently there is no CPAN distribution depending on Smart::Options.

commander itself resembles Getopt::Long::Descriptive a bit more in its interface. I didn't find any Perl port of commander on CPAN though.

But I digress. Let's go back to Smart::Options and optimist. As I said earlier, optimist is roughly equivalent to Getopt::Long::Descriptive, with one main difference: you are not required to provide any specification. Without a specification, the library will simply accept any option and put it in a hash. But remember that without a specification, you cannot check for unknown options or get auto-abbreviation.

Bundling of short one-letter options is supported, but if you don't provide specification the library cannot differentiate which short options require value and which ones don't: the library will simply assume that all short options are just flags which don't take value.

Another difference is the usage of OO and method chaining.

Usage

Here's how one would use Smart::Options in the simplest way (without any specification):

use 5.010;
use Smart::Options;
my $opts = argv(); # you can also say: $opts = Smart::Options->new->parse
say "foo = ", $opts->{foo};
say "b = ", $opts->{b};
say "args = [", join(", ", @{ $opts->{_} }), "]";
say "ARGV = [", join(", ", @ARGV), "]";

Let's try to run it:

% ./script.pl --foo 10 -b -- a b c
foo = 10
b = 1
args = [a, b, c]
ARGV = [--foo, 10, -b, --, a, b, c]

As you can see, the command-line arguments will be put in the _ key. And unlike Getopt::Long, it does not modify @ARGV.

One nitpick: the argv() function (or the parse() method) accepts a list so that you can parse options from an array other than @ARGV, but since it takes a list instead of an arrayref, when you pass a zero-length array it will assume that you didn't pass anything and still default to @ARGV. This could be remedied, e.g., by accepting an arrayref instead.
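The ambiguity is easy to reproduce in isolation. The two functions below are hypothetical stand-ins written for this illustration, not Smart::Options code: parse_list mimics the list-taking behaviour described above, while parse_ref shows the arrayref alternative.

```perl
use strict;
use warnings;

# List-taking interface: an empty list looks exactly like "no argument given".
sub parse_list {
    my @args = @_;
    @args = @ARGV unless @args;          # falls back to @ARGV even for ()
    return [@args];
}

# Arrayref-taking interface: undef means "use @ARGV", [] means "parse nothing".
sub parse_ref {
    my ($args) = @_;
    $args = \@ARGV unless defined $args;
    return [@$args];
}

@ARGV = ('--from-argv');                 # simulate a real command line
my @empty = ();

my $from_list = parse_list(@empty);      # silently picks up @ARGV
my $from_ref  = parse_ref(\@empty);      # stays empty, as intended

print "list interface: [@$from_list]\n";
print "ref interface:  [@$from_ref]\n";
```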

Without options specification, it's not possible to declare an option to be required, repeatable, or as a flag. So let's add some specification:

use 5.010;
use Smart::Options;
my $opts = Smart::Options->new
    ->demand('foo')                     ->describe(foo => 'The foo option')
    ->default(bar => 3)->alias(b => 'bar')->describe(bar => 'The bar option')
    ->default(baz => 5)                 ->describe(baz => "The baz option")
    ->parse;
say "foo = ", $opts->{foo};
say "b = ", $opts->{b};
say "args = [", join(", ", @{ $opts->{_} }), "]";

After this, you can generate help message:

$ ./script.pl --help
Usage: ./script.pl

Options:
-b, --bar The bar option [default: 3]
--baz The baz option [default: 5]
--foo The foo option [required]
-h, --help Show help

Missing required arguments: foo

BTW, some option parsing modules, including Smart::Options, still complain about a missing --foo when we instruct them to show the help message (--help), as shown above. I think this behavior is a bug and should be fixed.

Other features

*) I said earlier that Smart::Options is a port of optimist. It is actually more accurately a blend between optimist and Kan's older module opts. So beyond optimist, Smart::Options adds some more (quite substantial) features, which do not exist even in yargs or commander.

Validation. Like in Getopt::Long, you can add some validation. You can declare an option to accept Bool, Int, Num, Str, ArrayRef (this is similar to Getopt::Long's @ destination type to make an option repeatable), HashRef (if, say, foo is declared as a hashref, you can specify --foo.key1 or --foo.key2 on the command line and so on), or Config.

Configuration file. The last type, Config, is actually supposed to let you specify a filename so that the module reads an INI-like configuration file. But this setting is arguably misplaced and conflated, as it is not a type/validation setting, and it is not per-option but global.

Coercion. This can be used to convert an option value which is scalar/string to, say, Path::Tiny instance.

Subcommands. This lets you support (nested) subcommands by adding a nested Smart::Options object inside another, like in Getopt::Long::Subcommand. For example:

my $opts = Smart::Options->new
    ->subcmd(subcmd1 => Smart::Options->new->...)
    ->subcmd(subcmd2 => Smart::Options->new->...)
    ->parse;

DSL. If you don't like the chained methods syntax, there's Smart::Options::Declare which offers an alternative interface to declare an option one by one much like Moose's has. Although it doesn't seem to support declaring subcommands yet.

Performance

The startup overhead of Smart::Options is roughly the same as Getopt::Long::Descriptive, while the memory usage is higher.

% bencher-module-startup-overhead Smart::Options Getopt::Long::Descriptive
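+---------------------------+------------------------------+--------------------+----------------+-----------+------------------------+------------+---------+---------+
| participant               | proc_private_dirty_size (MB) | proc_rss_size (MB) | proc_size (MB) | time (ms) | mod_overhead_time (ms) | vs_slowest | errors  | samples |
+---------------------------+------------------------------+--------------------+----------------+-----------+------------------------+------------+---------+---------+
| Smart::Options            | 4.2                          | 8                  | 33             | 36        | 33.9                   | 1          | 0.00018 | 20      |
| Getopt::Long::Descriptive | 0.82                         | 4.5                | 23             | 35        | 32.9                   | 1          | 9.9e-05 | 20      |
| perl -e1 (baseline)       | 4.9                          | 9                  | 38             | 2.1       | 0                      | 17         | 1.5e-05 | 20      |
+---------------------------+------------------------------+--------------------+----------------+-----------+------------------------+------------+---------+---------+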

Also to be noted is that Smart::Options does not use Getopt::Long but does its own parsing.

Verdict

I find optimist and yargs themselves don't offer any new feature not already existing in Getopt::Long or Getopt::Long::Descriptive (the completion feature can be done with shcompgen). But Smart::Options does offer some extra features like subcommand support and reading of configuration file. On the other hand, you lose some of Getopt::Long's features like: auto-abbreviation and custom handler (in Getopt::Long, you can assign a coderef to an option which can do anything, like printing a message early and exiting, or setting other variable or multiple variables, or whatever).
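To show what would be lost, here is a small sketch of those two Getopt::Long features, auto-abbreviation and a custom handler. The option names and the banner string are invented, and it relies on auto-abbreviation being enabled, which is Getopt::Long's default unless POSIXLY_CORRECT is set.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my @args = ('--verb', '--vers');   # unique prefixes of --verbose and --version
my ($verbose, $banner) = (0, '');

GetOptionsFromArray(
    \@args,
    'verbose' => \$verbose,
    # Custom handler: a coderef can do anything, e.g. set other variables
    # (a real script might print the banner and exit here instead).
    'version' => sub { $banner = 'myapp 1.0' },
) or die "Error in command-line arguments\n";

print "verbose=$verbose banner=$banner\n";
```

Note that a shorter prefix such as --ver would be rejected as ambiguous, since it matches both options.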

My problem with this module is the interface: method chaining has its uses (for example, I find it convenient in some JSON modules or in jQuery), but here it just distracts and makes the options specification visually convoluted. On the other hand, the alternative DSL interface is not complete (yet).

I personally would still reach for my Perinci::CmdLine most of the time. But I will prefer Smart::Options over App::Options (which is also covered in this mini-article series).

Getopt modules 19: App::Spec

In the previous article I discussed App::Cmd, which is a nice, simple CLI framework that supports subcommands by requiring you to write a class for each subcommand you want to add. It also lets you specify Getopt::Long::Descriptive command-line options directly, so you can be as custom as Getopt::Long::Descriptive lets you be. However, many high-level features are missing.

There exist many more CLI frameworks on CPAN, just as there are many option parsing libraries. Some are closer to App::Cmd (except, say, being Moo- or Moose-specific) while others try to provide more of those high-level features.

App::Spec is one such module. It is closer in features to my Perinci::CmdLine, with the main difference being that App::Spec is OO (although it uses a single class and different methods to support subcommands instead of a separate class for each subcommand) while Perinci::CmdLine is decidedly not. Here are the features that it supports (or wants to support, as it’s not quite polished or finished yet): a specification for the CLI app (summary/description, list of subcommands (possibly nested), and parameters/options for each subcommand), extra validation, automatic help/usage message generation, and shell tab completion. App::Spec is relatively new (2016) and written by Tina Muller (TINITA). No applications on CPAN are using it right now. There is actually an App::Spec article on this year’s Perl Advent Calendar, so I’ll just direct you to that article instead of describing the module myself.

What’s good about App::Spec is that it does not use Moo or Moose, so you can use it for applications you want to keep light. It’s also not too heavy on the OO side. It provides shell tab completion out of the box; we need more frameworks like this because tab completion is one of the pillars of usability on the CLI. I hope the completion feature improves in the future.

What I find in App::Spec not really to my liking includes: low-level (manual) mapping to Getopt::Long specification format (I prefer an automatic mapper like in MooseX::Getopt or my Perinci::CmdLine), splitting options into “options” and “parameters” (unnecessary, they’re all options to me, “options” just happen to be boolean switches while “parameters” have values like string or whatever).

Getopt modules 18: App::Cmd

Traditionally, option parsing modules in Perl like Getopt::Std and Getopt::Long do not have the concept of subcommands. The rising popularity of CLI programs with subcommands, specifically git and other post-CVS version control tools, has prompted the option parsing libraries to include the concept too, like in node’s commander or Python’s argparse. In Perl, there are not many option parsing libraries that offer this feature (although ports of other languages’ libraries including argparse exist on CPAN, which I’ll cover in the following days). That said, you can also use a higher-level library like CLI frameworks that support subcommands.

App::Cmd is one such module: it is an OO CLI application framework which happens to use Getopt::Long::Descriptive as the command-line options parser. Both modules are written by Ricardo “Rik” Signes (RJBS). App::Cmd was first released in 2006 and is still being updated. Around 60 CPAN distributions use App::Cmd (90 if we also count users of MooseX::App::Cmd), making it possibly the most popular CLI application framework on CPAN. Toby Inkster (TOBYINK) even once called it “the PSGI of the command-line world”, although I don’t think that analogy is appropriate. A very popular CLI application, dzil (Dist::Zilla), also by Rik, uses App::Cmd.

As mentioned, App::Cmd is meant for writing a CLI application that has subcommands (or commands, as it calls them) like ‘git’ or ‘dzil’ (with ‘git clone’ or ‘dzil build’ as examples of a command with a subcommand). To use App::Cmd in your application, you need to create a single application class and then one class for each subcommand you want to support. App::Cmd does not use Moo or Moose, making it more universally usable. Of course, your application or command classes can be Moo-based or Moose-based, as demonstrated by MooseX::App::Cmd and dzil. The CLI script itself is reduced to something like:

use YourApp; # your application class
YourApp->run;

Accepting and processing command-line options is pretty direct, if not low-level:

package YourApp::Command::cmd1; # a command class
use YourApp -command;           # makes this class a command of YourApp
sub opt_spec {
    return (
        [ "skip-check|C",  "skip checking stuffs", ],
        [ "sleep-between|s=i",  "delay between processing file", { default => 5 } ],
    );
}

# optional
sub usage_desc { "blah blah" }

# optional
sub validate_args {
    my ($self, $opt, $args) = @_;
    $self->usage_error("Please supply at least one file") unless @$args;
    $self->usage_error("Please specify a non-negative number") unless $opt->sleep_between >= 0;
}

sub execute {
    my ($self, $opt, $args) = @_;
    ...
}

You provide a method opt_spec in your command class to specify which command-line options your subcommand will accept. The value returned by this method will be passed directly to Getopt::Long::Descriptive. The parse result will be an $opt object (which is what you normally get from Getopt::Long::Descriptive’s describe_options function) as well as $args (the remaining command-line arguments from @ARGV after the options have been stripped).

You can also provide the usage description to be passed to Getopt::Long::Descriptive’s describe_options via the usage_desc method. And an additional method validate_args to further validate $opt and $args if needed. The main method in a command class is execute, which is fed $opt and $args.

So you can see this does not differ much from a “traditional” CLI using Getopt::Long. You still provide the command-line options specification manually. This differs from other CLI frameworks like MooseX::Getopt or my Perinci::CmdLine which try to be more DRY by directly setting object attributes or function arguments from command-line options.

Another thing to note is that no configuration file support is baked in: you need to read and parse configuration files yourself. So basically what App::Cmd provides is the structure. Many higher-level CLI features you need to implement on your own. App::Cmd is mature and widely used, but you might also want to take a look at some other CLI frameworks that do more for you.

Getopt modules 17: Getopt::Modular

Getopt::Modular is a Getopt::Long wrapper that lets you declare command-line options in one or more modules and then combines them as you use the modules. For example:

# Module1.pm
use Getopt::Modular;
Getopt::Modular->acceptParam(
    opt1 => {
        spec => '=s',
        aliases => ['O'],
        help => 'This is option one',
        default => 'foo',
        validate => sub { ... },
    },
    opt2 => {
        ...
    },
);
# access the parameters somewhere in your code using:
if (Getopt::Modular->getOpt('opt1')) { ... }
1;

# in Module2.pm
use Getopt::Modular;
Getopt::Modular->acceptParam(
    opt3 => { ... },
);
1;

# in myapp
use Getopt::Modular;
use Module1;
use Module2;
Getopt::Modular->parseArgs; # program accepts options opt1, opt2, opt3

As you can see, aside from splitting command-line options over several modules, Getopt::Modular also lets you specify default value, usage/help message strings, and extra validation routine.

Getopt::Modular is written by Darin McBride (DMCBRIDE), first released in 2008 and last updated in 2014. Currently no other CPAN distributions are using it. But Getopt::Modular inspired another module, Getopt::Awesome (written by Pablo Fischer (PFISCHER) in 2009), which continues Getopt::Modular’s basic premise but with an alternative syntax.

The intention, to achieve modularity, is good, but it’s modularity at the wrong level. If you want your code in a module to be more reusable and flexible (and everybody wants that), you accept parameters. The first attempt at accepting parameters should be function parameters (or, if you are building an OO class, class attributes). If that is not suitable, for example if you want to parameterize a more global behavior, you use package variables or perhaps environment variables. Using Getopt::Modular (needlessly) ties the parameters to the command line, when your module might not be command-line-specific. The mapping is perhaps best done at the script level instead of in the modules.
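A minimal sketch of that alternative: the module takes ordinary function parameters and only the script knows about the command line. The package My::Resizer, its resize function, and the option names are hypothetical, and plain Getopt::Long does the mapping.

```perl
use strict;
use warnings;

# A reusable module that knows nothing about the command line.
package My::Resizer;

sub resize {
    my (%args) = @_;
    my $width   = $args{width}   // 100;        # defaults live with the function
    my $quality = $args{quality} // 'normal';
    return "resize to $width ($quality)";
}

# The script: the only place where options are mapped to parameters.
package main;
use Getopt::Long qw(GetOptionsFromArray);

my @args = ('--width', '640');                  # stand-in for @ARGV
my %opt;
GetOptionsFromArray(\@args, \%opt, 'width=i', 'quality=s')
    or die "Error in command-line arguments\n";

my $result = My::Resizer::resize(%opt);
print "$result\n";
```

The same resize function can then be called from a test, a web handler, or another module without any Getopt dependency.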

That said, there are surely cases when this module is appropriate, for example if you are building a rather complex CLI application that you split into several modules, where the modules are CLI/application-specific. But even then, you should probably try to make the module not CLI-specific if you can.

Getopt modules 16: Getopt::Attribute

When you are doing OO, mapping command-line options to your class attributes is convenient. But what if you are not using OO? For that there's Getopt::Attribute, which maps options to your package variables (there's also my Perinci::CmdLine that maps command-line options to function arguments, but I'm not reviewing my own modules in this series).

Getopt::Attribute is written by Marcel Grünauer (MARCEL), first released in 2001 and last updated in 2010. Here are the users of this module on CPAN (mostly Marcel himself):

% lcpan rdeps Getopt::Attribute
+---------+----------+--------------------+--------+--------------+-------------+
| phase   | rel      | dist               | author | dist_version | req_version |
+---------+----------+--------------------+--------+--------------+-------------+
| runtime | requires | Hopkins-Plugin-RPC | DIZ    | 0.900        | 1.44        |
| runtime | requires | Module-Changes     | MARCEL | 0.05         | 0           |
| runtime | requires | Module-Cloud       | MARCEL | 1.100861     | 0           |
| runtime | requires | Task-MasteringPerl | BDFOY  | 1.002        | 0           |
| runtime | requires | Vim-Complete       | MARCEL | 1.100880     | 0           |
+---------+----------+--------------------+--------+--------------+-------------+

Here's how you would use Getopt::Attribute:

use Getopt::Attribute;

our $verbose : Getopt(verbose!);
our $all     : Getopt(all);
our $size    : Getopt(size=s);
our $more    : Getopt(more+);
our @library : Getopt(library=s);
our %defines : Getopt(define=s);
sub quiet : Getopt(quiet) { our $quiet_msg = 'seen quiet' }
usage() if our $man : Getopt(man);

As you can see, it uses a rather Perl-specific feature called subroutine attributes. You can then call your CLI app like this:

% myapp --all --size=10 --more --more --library L1 --library L2

then your variable $all will be set to 1, $size to 10, $more to 2, and @library to ["L1", "L2"].

The module code itself is surprisingly compact, less than 30 lines of code. If you wonder where the actual parsing is done, it's done in the INIT phase. So at least, unlike with App::Options, you can still utilize "perl -c" to syntax-check your scripts.

My main complaints are only: 1) my Emacs' cperl-mode still doesn't syntax-highlight these subroutine attributes correctly; 2) if you want to put all options into a single hash, you can't, so this module forces you to pick a particular style.

This module is a pure Getopt::Long wrapper that does not add additional features like putting summary string for each option (although that's doable putting it in the subroutine attribute as parameter), specifying required option, or specifying default value. It would make the module more interesting if it had those features.
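Since the module is a thin wrapper, the attribute declarations shown earlier correspond roughly to an ordinary Getopt::Long call. The sketch below is that plain equivalent, not Getopt::Attribute's internals; the @args array stands in for @ARGV, and the --define os=linux argument is added here for illustration and is not part of the example invocation above.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my @args = qw(--all --size=10 --more --more --library L1 --library L2 --define os=linux);
my ($verbose, $all, $size, $more, @library, %defines);

GetOptionsFromArray(
    \@args,
    'verbose!'  => \$verbose,    # negatable flag: --verbose / --noverbose
    'all'       => \$all,        # simple flag
    'size=s'    => \$size,       # option with a string value
    'more+'     => \$more,       # incremented each time it appears
    'library=s' => \@library,    # array destination: repeatable option
    'define=s'  => \%defines,    # hash destination: key=value pairs
) or die "Error in command-line arguments\n";

print "all=$all size=$size more=$more library=@library os=$defines{os}\n";
```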