pericmd 048: Showing table data in browser as sortable/searchable HTML table

The latest release of Perinci::CmdLine (1.68) supports viewing a program’s output in an external program. It also introduces a new output format, html+datatables, which shows your program’s output in a browser, with table data rendered as an HTML table using jQuery and the DataTables plugin so that you can filter rows or sort columns. Here’s a video demonstration:

If the video doesn’t show, here’s the direct file link.

Call for collaboration: mini-article series on object systems and Dist::Zilla plugins

Aside from the mini-article series on option parsing modules that I have recently completed, there are two more series I currently have in mind: one on object systems (Moose, Mouse, Moo, Mo, Object::Tiny, Class::Accessor, and so on) and another on Dist::Zilla plugins. However, as I don't feel familiar with many of the modules or plugins, I think it would be nice if more people could write the articles.

The format of the series is 24-25 (mini-)articles, each comprising at least 250-300 words in 3-5 paragraphs. This is equivalent to the amount of text in a typical essay one is expected to write in a TOEFL test in 20-30 minutes. So it should be relatively easy to write, even for non-native English speakers. I personally find that writing a blog post in 30-60 minutes is ideal: it can be done in one sitting and does not take so much time that it interrupts your daily work. It can even be done during a break. There is no maximum length limit.

Each article should discuss or review a single module. Ideally the author of the article is not the author or (co-)maintainer of the reviewed module. The article should first describe the module (its brief history, popularity, position in the CPAN river) then proceed to discussion on the user interface and design of the module and finally close with the author's overall view on the module (whether the module is well designed, whether the module is useful to her, whether the module is worth using compared to other modules).

If you are interested in collaborating, please contact me at perlancar gmail. I expect the series on object systems to be posted on Feb 1st, 2017 and the series on dzil plugins on Apr 1st.

x_* prereqs

Did you know that aside from the standard phases "develop", "configure", "build", "test", "runtime", and the standard relationship types "requires", "recommends", "suggests", you can also put arbitrary phases and relationship types in your Perl distribution's dependencies (prerequisites, a.k.a. prereqs) using the x_ prefix?

The CPAN Meta Spec v2 (CPAN::Meta::Spec) already allows this, but until fairly recently, when using Dist::Zilla you couldn't get those custom phases/relationships into your distribution's META.json because CPAN::Meta::Prereqs dropped all but the standard phases/relationships. Since version 2.150006, thanks to Karen Etheridge (ETHER) (commit link), they are no longer dropped.

What would one use those custom phases/relationships for? So far, Kent Fredric (KENTNL) is using the x_examples custom phase to specify prerequisites for examples (which normally will not be installed by your typical CPAN client). Karen plans to "use some new prereq categories for travis to make use of when deciding what prereqs to install for different tasks."

I myself am using the x_spec relationship (in the develop phase) to specify that a distribution follows some specification; for example, all my distributions which contain some Rinci metadata are peppered with this. Another one is the x_embed relationship. I use it when I embed a module's source code into another module with the fatpacking technique, using Module::FatPack, to reduce dependencies. The embedding distribution no longer needs to specify a runtime requires dependency on the embedded module, but I would still like to record the fact that the module is being embedded, so I know what to look at when I want to update the embedded source later.
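To make the shape of such metadata concrete, here is a minimal sketch of a prereqs structure that mixes standard and x_ phases/relationships, printed as the JSON fragment that would sit under the "prereqs" key of META.json. The module names, versions, and the choice of the runtime phase for x_embed are placeholders and assumptions for illustration only, not taken from any real distribution.

```perl
use strict;
use warnings;
use JSON::PP;

# All module names and versions below are placeholders for illustration.
sub example_prereqs {
    return {
        runtime => {
            requires => { 'perl' => '5.010001' },
            # Assumption: embedded modules are recorded under the runtime phase.
            x_embed  => { 'Some::Embedded::Module' => '0' },
        },
        develop => {
            # Custom relationship inside a standard phase.
            x_spec => { 'Rinci' => '0' },
        },
        # Custom phase with a standard relationship.
        x_examples => {
            requires => { 'Some::Example::Dep' => '0' },
        },
    };
}

# canonical() sorts the keys so the output is stable between runs.
print JSON::PP->new->canonical->pretty->encode({ prereqs => example_prereqs() });
```

A CPAN client that only knows the standard phases and relationships would simply skip the x_ entries.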

The possibilities are endless.

A somewhat related issue, which is broader, more complex, and hasn't been solved, is specifying dependencies on things that are not Perl modules, e.g. an OS package, a certain program, a certain C library. (References: #79, #82).

Getopt modules: Epilogue

About this mini-article series. For each of the past 23 days, I have reviewed a module that parses command-line options (such modules are usually under the Getopt::* namespace). First article is here.

This series was born out of my experiments with option parsing and tab completion, and more broadly out of my interest in doing CLI with Perl. Aside from writing this series, I've also released numerous modules related to option parsing, some of them purely experimental in nature and some already used in production.

It has been interesting evaluating the various modules: the sometimes unconventional or seemingly odd approach that they take, or the specific features that they offer. Not all of them are worth using, but at least they provide perspectives and some lessons for us to learn.

Of course, not all modules got reviewed. There are simply far more than 24 modules (lcpan tells me that there are 180 packages in the Getopt:: namespace alone, with 94 distributions having the name Getopt-*). I tried to cover at least the must-know ones, core ones, and the popular ones. Other than that, frankly the selection is pretty much random. I picked what's interesting to me or what I can make some points about, whether they are negative or positive points.

I have skipped many modules that are just yet another Getopt::Long wrapper which adds per-option usage or some other features found in Getopt::Long::Descriptive (GLD). It's not that they are worse than GLD; for some reason or another they just didn't get adopted widely, or at all. A couple of examples: Getopt::Helpful, Getopt::Fancy.

Modules which use Moose, except MooseX::Getopt, automatically get skipped by me because their applicability is severely limited by the high number of dependencies and high startup overhead (200-500ms or even more on slower computers). These include: Getopt::Flex, Getopt::Alt, Getopt::Chain.

Some others are simply too weird or high in "WTF number", but I won't name names here.

Except for App::Cmd and App::Spec, I haven't really touched CLI frameworks in general. There is no shortage of CLI frameworks on CPAN either; perhaps that's for another series?

I've avoided reviewing my own modules, which include Getopt::Long::Complete (Getopt::Long wrapper which adds tab completion), Getopt::Long::Subcommand (Getopt::Long wrapper, with support for subcommands), Getopt::Long::More (my most recent Getopt::Long wrapper which adds tab completion and other features), Getopt::Long::Less & Getopt::Long::EvenLess (two leaner versions of Getopt::Long for the specific goal of reducing startup overhead), Getopt::Panjang (a break from Getopt::Long interface compatibility to explore new possibilities), and a CLI framework Perinci::CmdLine (which currently uses Getopt::Long but plans to switch backend in the long run; I've written a whole series of tutorial posts for this module).

In general, I'd say that you should probably try to stick with Getopt::Long first. As far as option parsing is concerned, it's already packed with features, and it has the advantage of being a core module. But as soon as you want automatic help/usage message generation, subcommands, or tab completion, you should begin looking elsewhere.

Unfortunately, except for evaluating Perl ports of some option parsing libraries (like Smart::Options, Getopt::ArgParse, Getopt::Kingpin), I haven't had the chance to look deeply into how option parsing is done in other languages. Among those languages is Perl's own sister Perl 6, which offers built-in command-line option parsing. Researching option parsing in other languages could potentially offer more lessons and perspectives.

I hope this series is of use to some people. Merry Christmas and happy holidays to everybody.

Getopt modules 23: Getopt::Complete

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such module is usually under the Getopt::* namespace). First article is here.

Getopt::Complete (GC) is a module written by Scott Smith (SAKOHT) in 2009 and also co-maintained by Nathan Nutter (NNUTTER). Last release is in 2011. So far it registers one CPAN distribution depending on it, although it's written by Scott himself.

Shell tab completion is a topic which I have been interested in since around 2012. I've released numerous modules related to completion, including two option parsing modules, Getopt::Long::Complete (GLC) and Getopt::Long::More (GLM), which sport completion as (one of) their selling points, so it's natural that I want to compare them to Getopt::Complete. Throughout the article I'll be making those comparisons repeatedly, and I hope it doesn't become too annoying.

Interface

GC, like GLC and GLM, is a Getopt::Long (GL) wrapper that adds tab completion feature. To let the module detect tab completion mode and return completion answer as soon as possible, GC offers this interface:

use Getopt::Complete (
    'frog'        => ['ribbit','urp','ugh'],
    'fraggle'     => sub { return ['rock','roll'] },
    'quiet!'      => undef,
    'name'        => undef,
    'age=n'       => undef,
    'outfile=s@'  => 'files',
    'outdir'      => 'directories',
    'runthis'     => 'commands',
    'username'    => 'users',
    '<>'          => 'directories',
);

That is, it accepts the options specification as import arguments. This looks simple but presents its own inconveniences.

The second thing you'll notice is that the options specification is different from GL's. While GLC and GLM choose to use an interface that is backward-compatible with GL, GC focuses on tab completion. The values of the pairs in the options specification are not variable references/coderefs as you would expect in GL, but solely completion specifications: either undef (meaning no completion is offered for the option, as with the quiet! flag or age=n above), an arrayref of candidate values, a coderef that returns the candidates, or a string (naming a completion type/routine to use, e.g. files to complete from filenames, commands to complete from program names in PATH, and so on). The option values themselves are collected in %ARGS.
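For contrast, here is a minimal sketch of what the value side of the pairs looks like in plain Getopt::Long, where each value is a destination (a variable reference or a coderef) rather than a completion specification. The function name parse_gl_style and the option set are made up for illustration; age=i is used here as the integer type.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse a list of arguments the Getopt::Long way.
sub parse_gl_style {
    my @args = @_;
    my ($quiet, $age, $name, @outfile);

    GetOptionsFromArray(
        \@args,
        'quiet!'     => \$quiet,       # scalar ref; negatable flag
        'age=i'      => \$age,         # scalar ref; integer argument
        'outfile=s@' => \@outfile,     # array ref; may be repeated
        'name=s'     => sub {          # coderef; called with (option, value)
            my ($opt, $value) = @_;
            $name = $value;
        },
    ) or return;

    # Whatever is left in @args are the non-option arguments.
    return {
        quiet   => $quiet,
        age     => $age,
        name    => $name,
        outfile => \@outfile,
        rest    => \@args,
    };
}
```

With GC the same right-hand slots are taken by completion specifications, which is why the parsed values have to be fetched from %ARGS instead.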

Thus, compared to GLC and GLM, specifying completion routines is simpler in GC (but I also wrote Shell::Completer to provide the same level of convenience with more flexibility).

Activating Completion

To activate completion in bash, you need to declare this shell function first:

function _getopt_complete () {
COMPREPLY=($( COMP_CWORD=$COMP_CWORD perl `which ${COMP_WORDS[0]}` ${COMP_WORDS[@]:0} ));
}

then for each CLI application you also need to do:

% complete -F _getopt_complete myapp

This is different than the way you activate completion for GLC- or GLM-based scripts:

% complete -C myapp myapp

External programs receive raw COMP_LINE and COMP_POINT environment variables from bash when doing tab completion, while shell functions are provided with the already-parsed command-line COMP_WORDS array variable and COMP_CWORD. GC wants to avoid parsing the command-line on its own, so the _getopt_complete function is used to give the Perl program parsed command-line arguments in @ARGV, and COMP_CWORD in another environment variable.
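To illustrate what "parsing the command line on its own" involves for a program registered with complete -C, here is a deliberately naive sketch that derives a word list and a current-word index from COMP_LINE and COMP_POINT. It splits on whitespace only and ignores quoting, escapes, and COMP_WORDBREAKS, so it is not the parser used by any of the modules discussed here; parse_comp_line and the candidate list are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split a command line into words and work out which
# word the cursor is in. Quoting and escapes are deliberately ignored.
sub parse_comp_line {
    my ($line, $point) = @_;
    $point = length $line unless defined $point;

    # Only the text to the left of the cursor matters for completion.
    my $left = substr($line, 0, $point);
    $left =~ s/^\s+//;

    # A limit of -1 keeps a trailing empty field, so "myapp " yields
    # ("myapp", ""): the cursor sits on a new, empty word.
    my @words = split /\s+/, $left, -1;
    @words = ('') unless @words;

    return (\@words, $#words);
}

# When invoked by bash via "complete -C", print matching candidates,
# one per line. The candidate list here is a placeholder.
if (defined $ENV{COMP_LINE}) {
    my ($words, $cword) = parse_comp_line($ENV{COMP_LINE}, $ENV{COMP_POINT});
    my $current = $words->[$cword];
    print "$_\n" for grep { index($_, $current) == 0 } qw(--help --version);
}
```

A shell function, by comparison, gets the equivalent of the returned word list and index for free in COMP_WORDS and COMP_CWORD.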

Using command-line that is already parsed by bash in COMP_WORDS has its pros as well as cons, due to the way that bash parses command-line for COMP_WORDS. So I cannot say which way is better, but what I can say is parsing COMP_LINE ourselves is more flexible.

Completion behavior and bugs

When you press tab after the command:

% myapp <tab>

GC offers only completion from the <> specification. In the above example, it only offers a list of directories as answers. On the other hand, GLC and GLM also show the list of available option names. With GC, to list the available options, you have to do:

% myapp -<tab>

I also cannot say that GLC's and GLM's way is better, but it certainly makes the CLI program more discoverable. By just pressing Tab, a user (especially a new user) can know more about what's possible.

GC still has a few problems. First of all, it cannot complete "--opt=" when COMP_WORDBREAKS contains "=". I have put workarounds for this issue in GLC and GLM. Second, it cannot handle filenames/directory names with spaces or quotes, and probably other special characters too.

Third, GLC and GLM, through Complete::Util, offer some matching algorithms aside from simple prefix matching, for extra convenience. GC does not offer this.
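As a rough idea of what "matching beyond simple prefix matching" can mean, here is a small sketch that tries progressively looser matching modes and returns the first non-empty result. The helper complete_word and the particular sequence of modes (exact-case prefix, case-insensitive prefix, case-insensitive substring) are assumptions for illustration; this is not Complete::Util's implementation or API.

```perl
use strict;
use warnings;

# Hypothetical helper: return the candidates matching $word, falling back
# to looser matching modes only when the stricter ones find nothing.
sub complete_word {
    my ($word, @candidates) = @_;
    my $q = quotemeta $word;

    for my $re (
        qr/^$q/,     # 1. plain prefix match
        qr/^$q/i,    # 2. case-insensitive prefix match
        qr/$q/i,     # 3. case-insensitive substring match
    ) {
        my @hits = grep { $_ =~ $re } @candidates;
        return @hits if @hits;
    }
    return;
}
```

With the candidates outfile, outdir, and quiet, typing OUT would still offer outfile and outdir, and typing dir would offer outdir, whereas prefix-only matching would offer nothing in either case.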

Getopt modules 22: Getopt::Kingpin

Getopt::Kingpin is a port of Go's kingpin library, written by Masaaki Takasago (TAKASAGO) in 2016. It offers the usual "nowadays standard" features like: short and long options with short option bundling, automatic help/usage message generation, specifying that an option is required, default value, and subcommands. Two extra features are: specifying that an option can be set via environment variable of a certain name, and built-in completion (which is a feature from the original library but doesn't seem to be implemented yet in the Perl port). The Go library also allows templating of help message, and this is not yet supported by Getopt::Kingpin.

Like Smart::Options (reviewed a couple of days ago), kingpin is using the so-called "fluent style" interface, a.k.a. chained methods, which I find annoying to type in Perl due to the method call operator in Perl being -> instead of a single dot. Although fortunately the chained methods interface is slightly less annoying than in Smart::Options.

After looking at the 3 ports of option parsing libraries (the abovementioned two plus Getopt::ArgParse reviewed yesterday) it indeed seems that subcommand support is becoming a standard thing. Which makes me think about whether Getopt::Long should also add such feature, or whether we should promote some other option parsing library as the "best practice" when one wants to do subcommands. So far, I'm not seeing any single best candidate for "Getopt::Long + subcommand support".

Getopt modules 21: Getopt::ArgParse

About this mini-article series. Each day for 24 days, I will be reviewing a module that parses command-line options (such module is usually under the Getopt::* namespace). First article is here.

In contrast to Perl, where the core modules Getopt::Std and Getopt::Long have stood the test of time and remain the most popular ways to parse command-line options in CLI scripts, Python has gone through several rounds of recommended standard modules.

First there is getopt, "C-style parser for command line options". To use getopt, you pass a string containing a list of short options a la Getopt::Std, e.g. "ho:v" (meaning -o takes an argument while -h and -v are flag switches), and also an array containing long options, e.g. ["help", "output="] (meaning --output takes an argument while --help does not). But, instead of supplying references to variables to set, or coderefs (remember, specifying an anonymous function is inconvenient in Python) like in Getopt::Long, in getopt programmers are asked to do a manual if-then-else and a loop (see the linked documentation for an example). This is also quite similar in interface to the GetoptLong class in Ruby.
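Since the short-option string is the same notation Getopt::Std uses, the "ho:v" example can be tried directly in Perl. This is a minimal sketch with Getopt::Std only (it has no counterpart to the long-option array); the wrapper parse_short_opts and the sample arguments are made up for illustration.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical wrapper: parse a list of arguments against "ho:v".
# -h and -v are flags, -o takes an argument.
sub parse_short_opts {
    my @args = @_;

    # getopts() works on @ARGV, so localize it to keep the real one intact.
    local @ARGV = @args;

    my %opts;
    getopts('ho:v', \%opts) or return;

    # Options end up in %opts; what remains in @ARGV are the other arguments.
    return (\%opts, [@ARGV]);
}

my ($opts, $rest) = parse_short_opts(qw(-v -o out.txt input.txt));
print "verbose is on\n"           if $opts->{v};
print "output goes to $opts->{o}\n" if defined $opts->{o};
print "remaining arguments: @$rest\n";
```

Here the results land in a hash, so there is no need for the explicit loop and if-then-else chain that Python's getopt asks for.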

No doubt, this style of programming feels manual and tedious. Thus came optparse, which is more OO and supposedly more Pythonic. Instead of passing a whole list of options at once, you now add one option (object) at a time using the add_option method, along with more information for each option: usage/help message, type, whether the option is required, number of arguments expected, default value, and perhaps some callback. optparse's capability is equivalent to that of Getopt::Long or Getopt::Long::Descriptive, except that optparse makes some design choices of its own; for example, it is decidedly Unix-oriented, allowing only - or -- as the option prefix (while Getopt::Long allows you to configure this). The documentation is quite probably the nicest aspect of this module: it does not assume much knowledge (like familiarity with Unix or CLI) from the readers and explains at length what an option is and how one should design a CLI program with regard to accepting options. I realized that "required option" is indeed an oxymoron from reading it!

But, as with Getopt::Long, optparse does not have the concept of subcommands. Thus arrived argparse. It is basically like optparse in appearance, except that it has some extra features like the ability to specify positional arguments (in Getopt::Long, this is handled by the <> option specification) and support for nested subcommands through the use of subparsers. Interestingly, argparse supports reading arguments from a file just like Getopt::ArgvFile, and this is the only form of "config file" it supports.

As things are right now, argparse is part of the standard library (a.k.a. core modules, in Perl parlance) while optparse is deprecated and might be removed. However, getopt remains.

There is a Perl port of argparse on CPAN called Getopt::ArgParse, created by M ytraM (MYTRAM) in 2013 and last updated in 2015. It is not feature-by-feature equivalent to its Python original, because of language differences and because argparse still accumulates features over time. You get some basic features like autohelp/autousage message, default value, setting an option as required, setting number of expected arguments, as well as subparsers for subcommand support (although not yet nested in Getopt::ArgParse). The type/validation feature is weak or almost nonexistent; perhaps a custom validation routine should be allowed to be specified or more can be explored here.

What's rather disappointing about this port is its use of Getopt::Long (I was expecting a full port, so that option parsing would be done by the module itself) and Moo, which adds significantly to the dependencies.

There is a mention of configuration files in the documentation, but there is actually no explicit support for them, not even through an "option file prefix" a la argparse or Getopt::ArgvFile.

All in all, I'm not seeing something to make me prefer this module. If you do not use subcommands, I recommend sticking with Getopt::Long or Getopt::Long::Descriptive. If you do use subcommands, perhaps also consider a CLI framework like App::Cmd, or Getopt::Long::Subcommand.