pericmd 048: Showing table data in browser as sortable/searchable HTML table

The latest release of Perinci::CmdLine (1.68) supports viewing program’s output in an external program. And also a new output format is introduced: html+datatables. This will show your program’s output in a browser and table data is shown as HTML table using jQuery and DataTables plugin to allow you to filter rows or sort columns. Here’s a video demonstration:

Your browser does not support the video tag, or WordPress filters the VIDEO element.

If the video doesn’t show, here’s the direct file link.

Getopt modules: Epilogue

About this mini-article series. For each of the past 24 23 days, I have reviewed a module that parses command-line options (such module is usually under the Getopt::* namespace). First article is here.

This series was born out of my experimentations with option parsing and tab completion, and more broadly of my interest in doing CLI with Perl. Aside from writing this series, I've also released numerous modules related to option parsing, some of them are purely experimental in nature and some already used in production.

It has been interesting evaluating the various modules: the sometimes unconventional or seemingly odd approach that they take, or the specific features that they offer. Not all of them are worth using, but at least they provide perspectives and some lessons for us to learn.

Of course, not all modules got reviewed. There are simply far more than 24 modules (lcpan tells me that there are 180 packages in the Getopt:: namespace alone, with 94 distributions having the name Getopt-*). I tried to cover at least the must-know ones, core ones, and the popular ones. Other than that, frankly the selection is pretty much random. I picked what's interesting to me or what I can make some points about, whether they are negative or positive points.

I have skipped many modules that are just yet another Getopt::Long wrapper which adds per-option usage or some other features found in Getopt::Long::Descriptive (GLD). Not that they are worse than GLD, for some reason or another they just didn't get adopted widely or at all. A couple examples of these: Getopt::Helpful, Getopt::Fancy.

Modules which use Moose, except MooseX::Getopt, automatically get skipped by me because their applicability is severely limited by the high number of dependencies and high startup overhead (200-500ms or even more on slower computers). These include: Getopt::Flex, Getopt::Alt, Getopt::Chain.

Some others are simply too weird or high in "WTF number", but I won't name names here.

Except for App::Cmd and App::Spec, I haven't really touched CLI frameworks in general. There are no shortages of CLI frameworks on CPAN too, perhaps for another series?

I've avoided reviewing my own modules, which include Getopt::Long::Complete (Getopt::Long wrapper which adds tab completion), Getopt::Long::Subcommand (Getopt::Long wrapper, with support for subcommands), Getopt::Long::More (my most recent Getopt::Long wrapper which adds tab completion and other features), Getopt::Long::Less & Getopt::Long::EvenLess (two leaner versions of Getopt::Long for the specific goal of reducing startup overhead), Getopt::Panjang (a break from Getopt::Long interface compatibility to explore new possibilities), and a CLI framework Perinci::CmdLine (which currently uses Getopt::Long but plans to switch backend in the long run; I've written a whole series of tutorial posts for this module).

In general, I'd say that you should probably try to stick with Getopt::Long first. As far as option parsing is concerned, it's packed with features already, and it has the advantage of being a core module. But as soon as you want: automatic autohelp/automessage generation, subcommand, tab completion then you should begin looking elsewhere.

Unfortunately except for evaluating Perl ports of some option parsing libraries (like Smart::Options, Getopt::ArgParse, Getopt::Kingpin), I haven't got the chance to deeply look into how option parsing is done in other languages. Among the other languages is Perl's own sister Perl 6, which offers built-in command-line option parsing. This endeavor of researching option parsing in other languages could potentially offer more lessons and perspectives.

I hope this series is of use to some people. Merry christmas and happy holidays to everybody.

Getopt modules 23: Getopt::Complete

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such module is usually under the Getopt::* namespace). First article is here.

Getopt::Complete (GC) is a module written by Scott Smith (SAKOHT) in 2009 and also co-maintained by Nathan Nutter (NNUTTER). Last release is in 2011. So far it registers one CPAN distribution depending on it, although it's written by Scott himself.

Shell tab completion is a topic which I have been interested in since around 2012. I've released numerous modules related to completion, including two option parsing modules Getopt::Long::Complete (GLC) and Getopt::Long::More (GLM) which sports completion as (one of) its selling point, so it's natural that I want to compare them to Getopt::Complete. Throughout the article I'll be repeatedly doing those comparisons, and I hope it's not becoming too annoying.

Interface

GC, like GLC and GLM, is a Getopt::Long (GL) wrapper that adds tab completion feature. To let the module detect tab completion mode and return completion answer as soon as possible, GC offers this interface:

use Getopt::Complete (
    'frog'        => ['ribbit','urp','ugh'],
    'fraggle'     => sub { return ['rock','roll'] },
    'quiet!'      => undef,
    'name'        => undef,
    'age=n'       => undef,
    'outfile=s@'  => 'files',
    'outdir'      => 'directories',
    'runthis'     => 'commands',
    'username'    => 'users',
    ''          => 'directories',
);

That is, it accepts the options specification as import arguments. This looks simple but presents its own inconveniences.

The second thing you'll notice that the options specification are different than GL. While GLC and GLM choose to use an interface that is backward-compatible with GL, GC focuses on tab completion. The values of the pairs in the options specification is not a variable reference/coderef as you would expect in GL, but solely completion specification: it's either undef (meaning the option does not require argument), a string (meaning a completion type/routine to use, e.g. files to complete from filenames, commands to complete from program names in PATH, and so on. The options values themselves are collected in %ARGS.

Thus, compared to GLC and GLM, specifying completion routines is simpler in GC (but I also wrote Shell::Completer to provide the same level of convenience with more flexibility).

Activating Completion

To activate completion in bash, you need to declare this shell function first:

function _getopt_complete () {
COMPREPLY=($( COMP_CWORD=$COMP_CWORD perl `which ${COMP_WORDS[0]}` ${COMP_WORDS[@]:0} ));
}

then for each CLI application you also need to do:

% complete -F _getopt_complete myapp

This is different than the way you activate completion for GLC- or GLM-based scripts:

% complete -C myapp myapp

External programs receive raw COMP_LINE and COMP_POINT environment variables from bash when doing tab completion, while shell functions are provided with the already-parsed command-line COMP_WORDS array variable and COMP_CWORD. GC wants to avoid parsing the command-line on its own, so the _getopt_complete function is used to give the Perl program parsed command-line arguments in @ARGV, and COMP_CWORD in another environment variable.

Using command-line that is already parsed by bash in COMP_WORDS has its pros as well as cons, due to the way that bash parses command-line for COMP_WORDS. So I cannot say which way is better, but what I can say is parsing COMP_LINE ourselves is more flexible.

Completion behavior and bugs

When you press tab after the command:

% myapp <tab>

GC offers only completion from the <> specification. In the above example, it only offers list of directories as answer. On the other hand, GLC and GLM also shows the list of available option names. With GC, to list the available options, you have to do:

% myapp -<tab>

I also cannot say that GLC's and GLM's way is better, but it certainly makes the CLI program more discoverable. By just pressing Tab, a user (especially a new user) can know more about what's possible.

GC has still a few problems. First of all, it cannot complete "–opt=" when COMP_WORDBREAKS contains "=". I have put workarounds for this issue in GLC and GLM. Second, it cannot handle filenames/directory names with spaces, or quotes, and probably other special characters too.

Third, GLC and GLM through Complete::Util offers some matching algorithms aside from simple prefix matching, for extra convenience. This is not offered by GC.

Getopt modules 22: Getopt::Kingpin

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such module is usually under the Getopt::* namespace). First article is here.

Getopt::Kingpin is a port of Go's kingpin library, written by Masaaki Takasago (TAKASAGO) in 2016. It offers the usual "nowadays standard" features like: short and long options with short option bundling, automatic help/usage message generation, specifying that an option is required, default value, and subcommands. Two extra features are: specifying that an option can be set via environment variable of a certain name, and built-in completion (which is a feature from the original library but doesn't seem to be implemented yet in the Perl port). The Go library also allows templating of help message, and this is not yet supported by Getopt::Kingpin.

Like Smart::Options (reviewed a couple of days ago), kingpin is using the so-called "fluent style" interface, a.k.a. chained methods, which I find annoying to type in Perl due to the method call operator in Perl being -> instead of a single dot. Although fortunately the chained methods interface is slightly less annoying than in Smart::Options.

After looking at the 3 ports of option parsing libraries (the abovementioned two plus Getopt::ArgParse reviewed yesterday) it indeed seems that subcommand support is becoming a standard thing. Which makes me think about whether Getopt::Long should also add such feature, or whether we should promote some other option parsing library as the "best practice" when one wants to do subcommands. So far, I'm not seeing any single best candidate for "Getopt::Long + subcommand support".

Getopt modules 21: Getopt::ArgParse

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such module is usually under the Getopt::* namespace). First article is here.

In contrast to in Perl, where the core modules Getopt::Std and Getopt::Long stand the test of time and remain the most popular ways people parse command-line options with in their Perl CLI scripts, in Python we encounter several churns of recommended standard modules.

First there is getopt, "C-style parser for command line options". To use getopt, you pass a string containing list of short options a la Getopt::Std, e.g. "ho:v" (meaning -o takes argument while h and v are flag switches), and also an array containing long options, e.g. ["help", "output="] (meaning –output takes argument while –help does not). But, instead of supplying references to variables to set, or coderefs (remember, specifying anonymous function is inconvenient in Python) like in Getopt::Long, in getopt programmers are asked to do a manual if-then-else and a loop (see the linked documentation for example). This is also quite similar in interface to the GetoptLong class in Ruby.

No doubt, this style of programming feels manual and tedious. Thus came optparse which is more OO and supposedly more Pythonic. Instead of passing a whole list of options at once, you now add one option (object) at a time using add_option method, along with more information for each option: usage/help message, type, whether the option is required, number of arguments expected, default value, and perhaps some callback. optparse's capability is equivalent to Getopt::Long or Getopt::Long::Descriptive, except that optparse makes some design choices, for example it is decidedly Unix-oriented, allowing only or as the option prefix (while Getopt::Long allows you to configure this). The documentation is quite probably the nicest aspect of this module: it does not assume much knowledge (like familiarity with Unix or CLI) from the readers and explains at length what an option is and how should one design a CLI program with regards to accepting options. I realized that "required option" is indeed an oxymoron from reading it!

But, as with Getopt::Long, optparse does not have the concept of subcommands. Thus arrived argparse. It is basically like optparse in appearance, except it has some extra features like the ability to specify positional arguments (in Getopt::Long, this is handled by the <> option specification) and support nested subcommands with the use of subparsers. Interestingly, argparse supports reading arguments from a file just like Getopt::ArgvFile, and this is the only form of "config file" it supports.

As things are right now, argparse becomes part of the standard library (a.k.a. core modules, in Perl parlance) while optparse is now deprecated and might be removed. However, getopt remains.

There is a Perl port of argparse on CPAN called Getopt::ArgParse, created by M ytraM (MYTRAM) in 2013 and last updated in 2015. It is not feature-by-feature equivalent to its Python original, because of language differences and because argparse still accumulates features over time. You get some basic features like autohelp/autousage message, default value, setting an option as required, setting number of expected arguments, as well as subparsers for subcommand support (although not yet nested in Getopt::ArgParse). The type/validation feature is weak or almost nonexistent; perhaps a custom validation routine should be allowed to be specified or more can be explored here.

What's rather disappointing from this port is its use of Getopt::Long (I was expecting a full port so option parsing should be done by itself) and Moo, significantly adding dependencies.

There is mention of configuration file in the documentation, but actually there is no explicit support of configuration file. Not even using "option file prefix" ala argparse or Getopt::ArgvFile.

All in all, I'm not seeing something to make me prefer this module. If you do not use subcommands, I recommend sticking with Getopt::Long or Getopt::Long::Descriptive. If you do use subcommands, perhaps also consider a CLI framework like App::Cmd, or Getopt::Long::Subcommand.

Getopt modules 20: Smart::Options

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such module is usually under the Getopt::* namespace). First article is here.

In the next few days, I'll be reviewing Perl ports of some popular option parsing modules from other languages. Today: Smart::Options.

Summary

Smart::Options is written by KAN Fushihara (MIKIHOSHI) and is a Perl port*) of node's optimist package, which in turns uses minimist as the option parsing engine and adds some stuffs, mainly the ability to generate usage/help message. Ironically, optimist is now deprecated in favor of yargs which is roughly the same as optimist but does its own parsing and adds features like bash completion.

So minimist is roughly the equivalent of Getopt::Long, optimist is the equivalent of Getopt::Long::Descriptive, yargs is roughly the equivalent of Getopt::Long::Descriptive + tab completion (like Getopt::Long::Complete or Getopt::Long::More).

You can get an idea of the sheer number of packages in npm, the CPAN equivalent in the node ecosystem, by looking at these numbers: compared to Getopt::Long's 1127 dependents, minimist has 5768 dependents. And it isn't even the most popular option parsing package. The most popular one on npm is currently commander (from the legendary TJ Holowaychuk) which has 12252 dependents! yargs has 4073, optimist has 3546 (remember that optimist has been declared as deprecated), and nomnom (another deprecated option parsing package) still has 510.

Currently there is no CPAN distribution depending on Smart::Options.

commander itself resembles Getopt::Long::Descriptive a bit more in its interface. I didn't find any Perl port of commander on CPAN though.

But I digress. Let's go back Smart::Options and optimist. As I said earlier, optimist is roughly equivalent to Getopt::Long::Descriptive. Except for one main difference: you are not required to specify any specification. Without any specification, the library will simply accept any option and put it in a hash. But remember that without specification, you cannot check for an unknown option or get auto-abbreviation.

Bundling of short one-letter options is supported, but if you don't provide specification the library cannot differentiate which short options require value and which ones don't: the library will simply assume that all short options are just flags which don't take value.

Another difference is the usage of OO and method chaining.

Usage

Here's how one would use Smart::Options in the simplest way (without any specification):

use 5.010;
use Smart::Options;
my $opts = argv(); # you can also say: $opts = Smart::Options->new->parse
say "foo = ", $opts->{foo};
say "b = ", $opts->{b};
say "args = [", join(", ", @{ $opts->{_} }), "]";
say "ARGV = [", join(", ", @ARGV), "]";

Let's try to run it:

% ./script.pl –foo 10 -b — a b c
foo = 10
b = 1
args = [a, b, c]
ARGV = [–foo, 10, -b, –, a, b, c]

As you can see, the command-line arguments will be put in the _ key. And unlike Getopt::Long, it does not modify @ARGV.

One nitpick: the argv() or the parse() function (or method) can accept a list to parse options from array other than @ARGV, but since it accepts a list instead of arrayref, when you pass a zero-length array it will assume that you don't pass any array and so still defaults to @ARGV. This can be remedied, e.g., by accepting an arrayref instead.

Without options specification, it's not possible to declare an option to be required, repeatable, or as a flag. So let's add some specification:

use 5.010;
use Smart::Options;
my $opts = Smart::Options->new
    ->demand('foo')                     ->describe(foo => 'The foo option')
    ->default(bar => 3)->alias(b => bar)->describe(bar => 'The bar option')
    ->default(baz => 5)                 ->describe(baz => "The baz option")
    ->parse;
say "foo = ", $opts->{foo};
say "b = ", $opts->{b};
say "args = [", join(", ", @{ $opts->{_} }), "]";

After this, you can generate help message:

$ ./script.pl –help
Usage: ./script.pl

Options:
-b, –bar The bar option [default: 3]
–baz The baz option [default: 5]
–foo The foo option [required]
-h, –help Show help

Missing required arguments: foo

BTW, some option parsing modules, including Smart::Options, still complain about missing –foo when we instruct it to show help message (–help), like shown above. I think this behavior is a bug and should be fixed.

Other features

*) I said earlier that Smart::Options is a port of optimist. It is actually more accurately a blend between optimist and Kan's older module opts. So beyond optimist, Smart::Options adds some more (quite substantial) features, which do not exist even in yargs or commander.

Validation. Like in Getopt::Long, you can add some validation. You can declare an option to accept Bool, Int, Num, Str, ArrayRef (this is similar to Getopt::Long's @ destination type to make option repeatable), HashRef (if say foo is declared as a hashref, you can specify –foo.key1 or –foo.key2 in the command-line and so on), or Config.

Configuration file. The last type, Config, is actually supposed to let you specify a filename to make the module reads an INI-like configuration file. But perhaps this configuration is misplaced and conflated, as this is not a type/validation configuration, and it is not per-option but global.

Coercion. This can be used to convert an option value which is scalar/string to, say, Path::Tiny instance.

Subcomands. This lets you support (nested) subcommands by adding a nested Smart::Options object inside another, like in Getopt::Long::Subcommand. For example:

my $opts = Smart::Options->new
    ->subcmd(subcmd1 => Smart::Options->new->...)
    ->subcmd(subcmd1 => Smart::Options->new->...)
    ->parse;

DSL. If you don't like the chained methods syntax, there's Smart::Options::Declare which offers an alternative interface to declare an option one by one much like Moose's has. Although it doesn't seem to support declaring subcommands yet.

Performance

The startup overhead of Smart::Options is roughly the same as Getopt::Long::Descriptive, while the memory usage is higher.

% bencher-module-startup-overhead Smart::Options Getopt::Long::Descriptive
+—————————+——————————+——————–+—————-+———–+————————+————+———–+———+
| participant | proc_private_dirty_size (MB) | proc_rss_size (MB) | proc_size (MB) | time (ms) | mod_overhead_time (ms) | vs_slowest | errors | samples |
+—————————+——————————+——————–+—————-+———–+————————+————+———–+———+
| Smart::Options | 4.2 | 8 | 33 | 36 | 33.9 | 1 | 0.00018 | 20 |
| Getopt::Long::Descriptive | 0.82 | 4.5 | 23 | 35 | 32.9 | 1 | 9.9e-05 | 20 |
| perl -e1 (baseline) | 4.9 | 9 | 38 | 2.1 | 0 | 17 | 1.5e-05 | 20 |
+—————————+——————————+——————–+—————-+———–+————————+————+———–+———+

Also to be noted is that Smart::Options does not use Getopt::Long but does its own parsing.

Verdict

I find optimist and yargs themselves don't offer any new feature not already existing in Getopt::Long or Getopt::Long::Descriptive (the completion feature can be done with shcompgen). But Smart::Options does offer some extra features like subcommand support and reading of configuration file. On the other hand, you lose some of Getopt::Long's features like: auto-abbreviation and custom handler (in Getopt::Long, you can assign a coderef to an option which can do anything, like printing a message early and exiting, or setting other variable or multiple variables, or whatever).

My problem with this module is the interface: method chaining has its uses (for example I find it convenient in some JSON module or in jQuery) but here it just distracts and make options specification visually convoluted. On the other hand, the DSL alternative interface is not complete (yet).

I personally would still reach for my Perinci::CmdLine most of the time. But I will prefer Smart::Options over App::Options (which is also covered in this mini-article series).

Getopt modules 19: App::Spec

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such module is usually under the Getopt::* namespace). First article is here.

In the previous article I discussed App::Cmd, which is a nice, simple CLI framework that supports subcommands by requiring you to write a subcommand class for each subcommand you want to add. And it also lets you specify Getopt::Long::Descriptive command-line options directly so you can be as custom as Getopt::Long::Descriptive lets you to be. However, many high-level features are missing.

There exists many more CLI frameworks on CPAN, like there are option parsing libraries, some closer to App::Cmd (except, say, being Moo- or Moose-specific) while others try to provide more said features.

App::Spec is one module. It is closer in features to my Perinci::CmdLine with the main difference being that App::Spec is OO (although it uses a single class and different methods to support subcommands instead of a separate class for each subcommand) while Perinci::CmdLine is decidedly not. Here are the features that it supports (or want to support, as it’s not quite polished or finished yet): a specification for CLI app (summary/description, list of subcommands (possibly nested), and parameters/options for each subcommand), extra validation, automatic help/usage message generation, and shell tab completion. App::Spec is relatively new (2016) and written by Tina Muller (TINITA). No applications on CPAN are using it right now. There is actually an App::Spec article on this year‘s Perl Advent Calendar so I’ll just direct you to reading the article instead of describing it myself.

What’s good about App::Spec is that it does not use Moo or Moose, so you can use it for applications you want to be light. It’s also not too heavy on the OO side. It provides shell tab completion out of the box; we need more frameworks like this because tab completion is one of pillars of usability on the CLI. I hope the completion feature improves in the future.

What I find in App::Spec not really to my liking includes: low-level (manual) mapping to Getopt::Long specification format (I prefer an automatic mapper like in MooseX::Getopt or my Perinci::CmdLine), splitting options into “options” and “parameters” (unnecessary, they’re all options to me, “options” just happen to be boolean switches while “parameters” have values like string or whatever).