pericmd 048: Showing table data in browser as sortable/searchable HTML table

The latest release of Perinci::CmdLine (1.68) supports viewing a program’s output in an external program. It also introduces a new output format, html+datatables, which shows your program’s output in a browser. Table data is rendered as an HTML table using jQuery and the DataTables plugin, so you can filter rows or sort columns. Here’s a video demonstration:

If the video doesn’t show, here’s the direct file link.

pericmd 047: Special arguments (1): dry_run

In Rinci, a function can express in its metadata that it supports various features or options. This feature- or option-related information is later passed back to the function at call time in the form of special arguments. These arguments have predefined names prefixed with “-” (dash), and are only passed if the function has declared the corresponding support and accepts named arguments (as hash or hashref).

There are several such special arguments; the one I will cover today is -dry_run.

A function can express that it supports dry-run (simulation) mode, via the dry_run feature inside the features property in the Rinci function metadata:

$SPEC{delete_files} = {
    v => 1.1,
    args => {
        ...
    },
    features => {
        dry_run => 1,
    },
};

The special argument -dry_run need not be declared in the args property. It will automatically be passed when program is run in dry-run mode.

In Perinci::CmdLine, a common command-line option --dry-run is automatically added if the function supports the dry_run feature. This means that if the user passes --dry-run (or, alternatively, sets the DRY_RUN environment variable to true), Perinci::CmdLine will call the function with -dry_run => 1.
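To make the mechanism concrete, here is a minimal sketch of how a caller might decide whether to pass -dry_run. This is not Perinci::CmdLine’s actual code; the helper call_with_features and the toy metadata and function are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pass -dry_run only when dry-run mode is requested
# AND the function metadata declares the dry_run feature.
sub call_with_features {
    my ($meta, $func, $args, %opts) = @_;
    my %call_args = %$args;
    if ($opts{dry_run} && $meta->{features}{dry_run}) {
        $call_args{-dry_run} = 1;
    }
    return $func->(%call_args);
}

# Toy metadata and function, for illustration only.
my $meta = { v => 1.1, features => { dry_run => 1 } };
my $func = sub {
    my %args = @_;
    [200, "OK", $args{-dry_run} ? "simulated" : "real"];
};

my $res = call_with_features($meta, $func, { file => ['a'] }, dry_run => 1);
print $res->[2], "\n";   # simulated
```

A function whose metadata lacks the dry_run feature never sees the special argument, even when dry-run mode is requested.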

If a function is passed -dry_run => 1 in its arguments, it should go through the motions of the operation (checking, logging, reporting what would be done) without actually making any changes. Lots of programs have this feature, like rsync, make, or svn merge (note: git merge also supports dry-run operation, but with options named --no-commit --no-ff instead of --dry-run). Dry-run modes are useful for testing/trial, especially when the associated operation is rather dangerous (like deleting files or sending mass email).

We could, of course, manually define a dry_run argument ourselves. But the advantage of specifying the dry_run feature instead, aside from standardization and the automatic addition of --dry-run and DRY_RUN parsing, is that in transactions, dry-run functions can receive special treatment. We will cover transactions in the future.

Here’s the full example:

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;
use Log::Any '$log';

use Perinci::CmdLine::Any;

our %SPEC;

$SPEC{delete_files} = {
    v => 1.1,
    args => {
        'file' => {
            schema => ['array*', of=>'str*', min_len=>1],
            req => 1,
            pos => 0,
            greedy => 1,
        },
    },
    features => {dry_run=>1},
};
sub delete_files {
    my %args = @_;
    my $verbose = $args{verbose};

    my $num_success = 0;
    my $num_fail = 0;
    for my $file (@{$args{file}}) {
        $log->infof("Deleting %s ...", $file);
        next if $args{-dry_run};
        if (unlink $file) {
            $num_success++;
        } else {
            $num_fail++;
            $log->warnf("Can't delete %s: %s", $file, $!);
        }
    }

    if ($num_fail == 0) {
        [200, "OK"];
    } elsif ($num_success == 0) {
        [500, "All failed"];
    } else {
        [200, "Some failed"];
    }
}

Perinci::CmdLine::Any->new(url=>'/main/delete_files', log=>1)->run;

% mkdir test
% cd test
% touch file1 file2 file3; mkdir dir1 dir2
% ls
dir1/  dir2/  file1  file2  file3

% ../delete-files --dry-run f*
[pericmd] Dry-run mode is activated
delete-files: Deleting file1 ...
delete-files: Deleting file2 ...
delete-files: Deleting file3 ...
% ls
dir1/  dir2/  file1  file2  file3

% ../delete-files --verbose *
delete-files: Deleting dir1 ...
delete-files: Can't delete dir1: Is a directory
delete-files: Deleting dir2 ...
delete-files: Can't delete dir2: Is a directory
delete-files: Deleting file1 ...
delete-files: Deleting file2 ...
delete-files: Deleting file3 ...
% ls
dir1/  dir2/

pericmd 046: Customizing table output (2)

Continuing from the previous post, if we use Perinci::CmdLine::Classic as the backend, there are a few other options to customize table output. Let’s use the same list-files script, but with the classic backend:

% PERINCI_CMDLINE_ANY=classic ./list-files -v

[Screenshot: pericmd046-1]

You’ll notice that compared to the default Perinci::CmdLine::Lite output (which uses Text::Table::Tiny to produce the table), the Perinci::CmdLine::Classic output (which uses Text::ANSITable) is a bit fancier, with colors and box-drawing characters (and/or Unicode characters).

By default, Text::ANSITable colors columns differently according to data type. The second column, which contains only numbers and is thus a numeric column, is colored cyan by default, while string columns are colored light grey.

Of course, like the lite backend, the classic backend supports reordering columns:

% PERINCI_CMDLINE_ANY=classic ./list-files2 -v

[Screenshot: pericmd046-2]

% PERINCI_CMDLINE_ANY=classic FORMAT_PRETTY_TABLE_COLUMN_ORDERS='[["type","size","links"]]' ./list-files2 -v

[Screenshot: pericmd046-3]

Aside from FORMAT_PRETTY_TABLE_COLUMN_ORDERS, there’s also FORMAT_PRETTY_TABLE_COLUMN_TYPES:

% PERINCI_CMDLINE_ANY=classic FORMAT_PRETTY_TABLE_COLUMN_TYPES='[{"modified":"date"}]' ./list-files3 -v

[Screenshot: pericmd046-4]

The list-files3 script is exactly the same as list-files2 except that it adds a modified column containing the file’s mtime as a Unix timestamp. By default this column is shown as a number (cyan), but with the above FORMAT_PRETTY_TABLE_COLUMN_TYPES hint it is shown as a date (yellow).

Note that some heuristics are employed, so if you name the column “mtime” or “something_date”, you don’t have to give any hint for it to be shown as a date.

There is also FORMAT_PRETTY_TABLE_COLUMN_FORMATS to apply some formatting to columns, for example:

% PERINCI_CMDLINE_ANY=classic FORMAT_PRETTY_TABLE_COLUMN_FORMATS='[{"size":[["num",{"style":"kilo"}]]}]' ./list-files2 -v

[Screenshot: pericmd046-5]

The POD for Data::Format::Pretty::Console describes these options in more detail.

Aside from these, the Text::ANSITable module itself provides lots of options to configure its output. For example, to choose border style and color theme:

% PERINCI_CMDLINE_ANY=classic ANSITABLE_BORDER_STYLE="Default::csingle" ANSITABLE_COLOR_THEME="Tint::tint_red" ./list-files2 -v

[Screenshot: pericmd046-6]

With Text::ANSITable you can also customize cell padding/spacing, column widths, or alignments. You can hide some columns/rows, repeat some columns/rows, or even do conditional styles involving Perl code. For more available options, refer to the POD.

pericmd 045: Customizing table output (1)

Data structures like array of arrays of strings (aoaos), hash, or array of hashes of strings (aohos) will render as tables under Perinci::CmdLine. There are some ways to customize this table output, either from outside the script or from inside the script.
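For reference, here is what those three shapes look like as plain Perl data. The sample values and the shape_of classifier are hypothetical, just to name the shapes; this is not how Perinci::CmdLine detects them.

```perl
use strict;
use warnings;

# The three table-like shapes, with made-up sample values.
my $aoaos = [ ["hello", 1131, "f"], ["pause", 4096, "d"] ];
my $hash  = { name => "hello", size => 1131 };
my $aohos = [ { name => "hello", size => 1131 }, { name => "pause", size => 4096 } ];

# Hypothetical classifier that names the shape of a data structure.
sub shape_of {
    my $data = shift;
    return "hash"  if ref $data eq 'HASH';
    return "other" unless ref $data eq 'ARRAY' && @$data;
    return "aoaos" unless grep { ref $_ ne 'ARRAY' } @$data;
    return "aohos" unless grep { ref $_ ne 'HASH' } @$data;
    return "other";
}

print shape_of($_), "\n" for $aoaos, $hash, $aohos;
# prints: aoaos, hash, aohos (one per line)
```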

Let’s revisit the list-files script that made an appearance some posts ago (pericmd 039):

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

use Perinci::CmdLine::Any;

our %SPEC;

$SPEC{list_files} = {
    v => 1.1,
    args => {
        'verbose' => {
            cmdline_aliases => {v=>{}},
            schema => 'bool',
        },
        'all' => {
            cmdline_aliases => {a=>{}},
            schema => 'bool',
        },
    },
};
sub list_files {
    my %args = @_;
    my $verbose = $args{verbose};
    my $all     = $args{all};

    my @files;
    opendir my($dh), ".";
    for (sort readdir($dh)) {
        next if !$all && /\A\./;
        if ($verbose) {
            my $type = (-l $_) ? "l" : (-d $_) ? "d" : (-f _) ? "f" : "?";
            push @files, {name=>$_, size=>(-s _), type=>$type};
        } else {
            push @files, $_;
        }
    }

    [200, "OK", \@files];
}

my $app = Perinci::CmdLine::Any->new(url => '/main/list_files');
delete $app->common_opts->{verbose};
$app->common_opts->{version}{getopt} = 'version|V';
$app->run;

When we run this script:

% ./list-files -v --format json-pretty
[
   200,
   "OK",
   [
      {
         "name" : "hello",
         "size" : 1131,
         "type" : "f"
      },
      {
         "name" : "list-files",
         "size" : 988,
         "type" : "f"
      },
      {
         "name" : "list-files~",
         "size" : 989,
         "type" : "f"
      },
      {
         "name" : "mycomp",
         "size" : 902,
         "type" : "f"
      },
      {
         "name" : "mycomp2a",
         "size" : 608,
         "type" : "f"
      },
      {
         "name" : "mycomp2b",
         "size" : 686,
         "type" : "f"
      },
      {
         "name" : "mycomp2b+comp",
         "size" : 1394,
         "type" : "f"
      },
      {
         "name" : "pause",
         "size" : 4096,
         "type" : "d"
      },
      {
         "name" : "perl-App-hello",
         "size" : 4096,
         "type" : "d"
      }
   ],
   {}
]

%  ./list-files -v
+----------------+------+------+
| name           | size | type |
+----------------+------+------+
| hello          | 1131 | f    |
| list-files     | 988  | f    |
| list-files~    | 989  | f    |
| mycomp         | 902  | f    |
| mycomp2a       | 608  | f    |
| mycomp2b       | 686  | f    |
| mycomp2b+comp  | 1394 | f    |
| pause          | 4096 | d    |
| perl-App-hello | 4096 | d    |
+----------------+------+------+

Column order

We didn’t specify the ordering of columns, and since our data is an array of hashes (instead of an array of arrays), the data itself carries no column order. By default, the order is asciibetical, which in this case happens to be the way we want (name, then size and type). But if we modify the script and add another field, links (for the number of hardlinks):

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

use Perinci::CmdLine::Any;

our %SPEC;

$SPEC{list_files} = {
    v => 1.1,
    args => {
        'verbose' => {
            cmdline_aliases => {v=>{}},
            schema => 'bool',
        },
        'all' => {
            cmdline_aliases => {a=>{}},
            schema => 'bool',
        },
    },
};
sub list_files {
    my %args = @_;
    my $verbose = $args{verbose};
    my $all     = $args{all};

    my @files;
    opendir my($dh), ".";
    for (sort readdir($dh)) {
        next if !$all && /\A\./;
        if ($verbose) {
            my $is_sym = (-l $_); # will do an lstat
            my @st = stat($_);
            my $type = $is_sym ? "l" : (-d _) ? "d" : (-f _) ? "f" : "?";
            push @files, {name=>$_, size=>(-s _), type=>$type, links=>$st[3]};
        } else {
            push @files, $_;
        }
    }

    [200, "OK", \@files];
}

my $app = Perinci::CmdLine::Any->new(url => '/main/list_files');
delete $app->common_opts->{verbose};
$app->common_opts->{version}{getopt} = 'version|V';
$app->run;

then the result (with the modified script saved as list-files2) will be:

% ./list-files2 -v
+-------+----------------+------+------+
| links | name           | size | type |
+-------+----------------+------+------+
| 1     | hello          | 1131 | f    |
| 1     | list-files     | 988  | f    |
| 1     | list-files2    | 1086 | f    |
| 1     | list-files2~   | 988  | f    |
| 1     | list-files~    | 989  | f    |
| 1     | mycomp         | 902  | f    |
| 1     | mycomp2a       | 608  | f    |
| 1     | mycomp2b       | 686  | f    |
| 1     | mycomp2b+comp  | 1394 | f    |
| 6     | pause          | 4096 | d    |
| 5     | perl-App-hello | 4096 | d    |
+-------+----------------+------+------+
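The default ordering is easy to reproduce: a plain string sort of the hash keys puts links first. A minimal sketch, with a sample row made up from the output above:

```perl
use strict;
use warnings;

# One row of the table, as a hash (sample values).
my $row = { name => "hello", size => 1131, type => "f", links => 1 };

# Plain string (asciibetical) sort of the column names.
my @columns = sort keys %$row;
print join(" | ", @columns), "\n";   # links | name | size | type
```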

What if we want the name column to stay leftmost? This is where the result metadata comes in handy. From inside the script (function), we can embed formatting hints when returning the enveloped result, as follows:

[200, "OK", \@files, {
    format_options => {any => {table_column_orders=>[[qw/name type links size/]]}},
}];

OK, that’s a mouthful. What the code above does is add a key to the result metadata (the fourth element of the enveloped result array, a hash) called format_options. The value of this key is a hash of format names and format specifications. We’ll use any for the format name to apply to any format (but you actually can specify different formatting for text vs for json and so on).

The format specification is another hash containing a key called table_column_orders. This key has a value of array of arrays (to be able to specify multiple tables). One element of that array contains the list of columns for our table: [qw/name type links size/]. Since the output table’s columns match this entry, the order is followed.
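One way to picture the matching rule: compare the table’s set of column names against each entry, and use the first entry that has the same set. The sketch below illustrates that idea under my own assumptions (the function pick_column_order is hypothetical); it is not the formatter’s actual implementation.

```perl
use strict;
use warnings;

# Return the first order whose set of columns equals the table's columns,
# falling back to a plain sorted list when nothing matches.
sub pick_column_order {
    my ($columns, $orders) = @_;
    my $want = join "\0", sort @$columns;
    for my $order (@$orders) {
        return @$order if join("\0", sort @$order) eq $want;
    }
    my @sorted = sort @$columns;
    return @sorted;
}

my @order = pick_column_order(
    [qw/links name size type/],       # columns found in the data
    [[qw/name type links size/]],     # table_column_orders
);
print "@order\n";   # name type links size
```

Because the spec is an array of arrays, one script can carry orderings for several differently shaped tables, and each table picks the entry that matches its own columns.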

Aside from inside the script itself, you can actually specify the ordering from an environment variable (outside the script). For example:

% FORMAT_PRETTY_TABLE_COLUMN_ORDERS='[["size","links","type","name"]]' ./list-files2 -v
+------+-------+------+----------------+
| size | links | type | name           |
+------+-------+------+----------------+
| 1131 | 1     | f    | hello          |
| 988  | 1     | f    | list-files     |
| 1187 | 1     | f    | list-files2    |
| 1086 | 1     | f    | list-files2~   |
| 989  | 1     | f    | list-files~    |
| 902  | 1     | f    | mycomp         |
| 608  | 1     | f    | mycomp2a       |
| 686  | 1     | f    | mycomp2b       |
| 1394 | 1     | f    | mycomp2b+comp  |
| 4096 | 6     | d    | pause          |
| 4096 | 5     | d    | perl-App-hello |
+------+-------+------+----------------+

The value of the environment variable is a JSON-encoded array of arrays, just like in table_column_orders format specification above.
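If you want to read the same variable in your own code, decoding it takes only a few lines with the core JSON::PP module. A small sketch; the function name column_orders_from_env and the validation it performs are my own choices, not part of any Perinci module.

```perl
use strict;
use warnings;
use JSON::PP;

# Decode FORMAT_PRETTY_TABLE_COLUMN_ORDERS into an array of arrays.
# Returns undef when the variable is unset, invalid JSON, or the wrong shape.
sub column_orders_from_env {
    my $json = $ENV{FORMAT_PRETTY_TABLE_COLUMN_ORDERS};
    return undef unless defined $json && length $json;
    my $orders = eval { JSON::PP->new->decode($json) };
    return undef unless ref $orders eq 'ARRAY';
    return undef if grep { ref $_ ne 'ARRAY' } @$orders;
    return $orders;
}

$ENV{FORMAT_PRETTY_TABLE_COLUMN_ORDERS} = '[["size","links","type","name"]]';
my $orders = column_orders_from_env();
print join(",", @{ $orders->[0] }), "\n";   # size,links,type,name
```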

If we use the Perinci::CmdLine::Classic backend (which renders tables using Text::ANSITable), there are a few other options available to customize the table. We’ll discuss this in another blog post.

pericmd 044: Customizing output

The functions we use as backend of our CLI application return pure data structure, and Perinci::CmdLine’s formatter figures out how to best display this information. There are, however, some ways to customize how the output looks in our CLI application by setting some attributes in the result metadata.

As you might remember, result metadata is the fourth element in the enveloped result structure:

[$status, $message, $actual_result, $meta]

The result metadata is a hash (a DefHash actually, but for most purposes you don’t care about the difference). There are some attributes (keys) you can set in this metadata to give hints to Perinci::CmdLine on how to render the result in CLI application.

cmdline.result

The first one is cmdline.result. This sets an alternative result to use when in CLI context. For example:

sub func {
    [200, "OK", "foo", {'cmdline.result'=>'bar'}];
}

This way, if you are calling the function, you’ll get “foo” (in the third element), but if this function is run on the command-line, user will see “bar”.
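The selection logic can be sketched in a few lines. The helper display_value below is hypothetical and only mimics what a CLI front end would do with the attribute:

```perl
use strict;
use warnings;

# Hypothetical helper: the value a CLI front end would display.
# Uses cmdline.result when present (even if empty), else the actual result.
sub display_value {
    my ($res) = @_;
    my $meta = $res->[3] || {};
    return exists $meta->{'cmdline.result'}
        ? $meta->{'cmdline.result'}
        : $res->[2];
}

sub func {
    [200, "OK", "foo", { 'cmdline.result' => 'bar' }];
}

my $res = func();
print "Perl caller sees: $res->[2]\n";                   # foo
print "CLI user sees:    ", display_value($res), "\n";   # bar
```

Note the use of exists rather than a truth test, so that an empty string can be used to suppress output on the CLI.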

Why would this be useful? An example would be functions that return bool values, like user_exists(). In Perl, we probably only care about getting 1/0. But in CLI, you might want to display a more user-friendly message. So instead of:

% user-exists ujang
0
% user-exists kadek
1

If your function does this:

sub user_exists {
    my %args = @_;
    my $exists = actual_check_for_existence($args{user});
    [200, "OK", $exists, {'cmdline.result' => "User $args{user}" . ($exists ? " exists" : " does not exist")}];
}

then you can have:

% user-exists ujang
User ujang does not exist
% user-exists kadek
User kadek exists

Another example where this is applied is in Git::Bunch. In the function check_bunch, the result is a hash of every repo in the bunch and their check statuses, e.g.:

[200, "OK", {repo1=>[200,"clean"], repo2=>[500,"Needs commit"], ...}]

The function also happens to use a progress bar to report unclean repositories as the checking is being done. Unclean repos get reported/logged to the screen. Thus, it is not very useful to display this hash on the CLI (although it is useful when we are using the function from Perl). So check_bunch() sets the CLI output to an empty string:

[200, "OK", ..., {'cmdline.result'=>''}]

cmdline.default_format

This attribute picks the default format. For example:

[200, "OK", ..., {'cmdline.default_format'=>'json'}]

This way, when the CLI is run, the output defaults to JSON instead of text, unless the user explicitly specifies the output format that she wants, e.g. --format text.

One common use-case for this is to force the simple or pretty version of text format. By default, for DWIM-ness, the text format becomes simpler when the program is run through pipes (e.g. formatted ASCII table becomes lines of tab-separated values). For example (I’m using the list-files script mentioned in pericmd 039):

% list-files -v
+----------------+------+------+
| name           | size | type |
+----------------+------+------+
| hello          | 1131 | f    |
| list-files     | 988  | f    |
| list-files2    | 1187 | f    |
| mycomp         | 902  | f    |
| mycomp2a       | 608  | f    |
| mycomp2b       | 686  | f    |
| mycomp2b+comp  | 1394 | f    |
| pause          | 4096 | d    |
| perl-App-hello | 4096 | d    |
+----------------+------+------+

% list-files -v | cat
hello   1131    f
list-files      988     f
list-files2     1187    f
mycomp  902     f
mycomp2a        608     f
mycomp2b        686     f
mycomp2b+comp   1394    f
pause   4096    d
perl-App-hello  4096    d

Sometimes you always want to default to the pretty version (even when your CLI program is run through pipes), and sometimes the other way around. To do this, you can set 'cmdline.default_format' => 'text-pretty' (or text-simple) in the result metadata.
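Here is a rough sketch of the decision being described. I’m assuming the pipe detection boils down to checking whether standard output is a terminal; the function pick_text_format is hypothetical and ignores an explicit --format from the user.

```perl
use strict;
use warnings;

# Hypothetical: an explicit default in the result metadata wins,
# otherwise pick pretty output for terminals and simple output for pipes.
sub pick_text_format {
    my ($meta, $is_interactive) = @_;
    return $meta->{'cmdline.default_format'}
        if $meta->{'cmdline.default_format'};
    return $is_interactive ? 'text-pretty' : 'text-simple';
}

# Assumption: "run through pipes" means STDOUT is not a terminal.
my $interactive = (-t STDOUT) ? 1 : 0;
print pick_text_format({}, $interactive), "\n";

# Forced pretty output, even when piped:
print pick_text_format({ 'cmdline.default_format' => 'text-pretty' }, 0), "\n";   # text-pretty
```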

Note that the cmdline.default_format attribute can also be specified in the Rinci function metadata, but specifying this in the result metadata is more flexible as we can customize on a per-invocation basis.

cmdline.exit_code

This is not actually related to output format, but it is close enough to mention here. This attribute explicitly chooses an exit code for the CLI program. By default, as you might also remember, the exit code is determined as follows: “if status is 2xx or 304, then 0, else status-300”.
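The default rule, plus the override, fits in a small function. This is a sketch of the rule as quoted above; the name exit_code_for is made up.

```perl
use strict;
use warnings;

# Exit code for an enveloped result: cmdline.exit_code wins if set,
# otherwise 0 for 2xx/304 and status-300 for everything else.
sub exit_code_for {
    my ($res) = @_;
    my $meta = $res->[3] || {};
    return $meta->{'cmdline.exit_code'}
        if defined $meta->{'cmdline.exit_code'};
    my $status = $res->[0];
    return 0 if $status == 304 || ($status >= 200 && $status <= 299);
    return $status - 300;
}

print exit_code_for([200, "OK"]), "\n";          # 0
print exit_code_for([404, "Not found"]), "\n";   # 104
print exit_code_for([500, "Failed", undef, { 'cmdline.exit_code' => 3 }]), "\n";   # 3
```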

cmdline.skip_format

If you set this attribute to true, the result will be printed as-is without any formatting. You might want to use this if you are outputting preformatted text. This defeats the whole point of the convenience given by Perinci::CmdLine, but sometimes it’s useful.

cmdline.page_result and cmdline.pager

This is also not directly related to formatting, but somewhat related. If you set cmdline.page_result to true, you can instruct Perinci::CmdLine to run a pager (like less). This might be useful for programs that output long text. The cmdline.pager can be used to specifically choose another program instead of the default $ENV{PAGER} (or less).
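The fallback chain for the pager can be sketched like this (pick_pager is a hypothetical name; the order of precedence follows the paragraph above):

```perl
use 5.010;
use strict;
use warnings;

# Returns the pager command to use, or undef when paging is not requested.
# Precedence: cmdline.pager, then $ENV{PAGER}, then "less".
sub pick_pager {
    my ($meta) = @_;
    return undef unless $meta->{'cmdline.page_result'};
    return $meta->{'cmdline.pager'} // $ENV{PAGER} // 'less';
}

print pick_pager({ 'cmdline.page_result' => 1, 'cmdline.pager' => 'more' }), "\n";   # more
```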

In the next blog post I’ll discuss more ways to customize table output.

pericmd 043: Generating CLI applications (App::GenPericmdScript)

Most Perinci::CmdLine-based CLI scripts are basically a variation of:

#!perl

use Perinci::CmdLine::Any;
Perinci::CmdLine::Any->new(
    url => '/some/riap/url/to/function',
    ...
)->run;

Due to my laziness and strict adherence to the DRY principle, I created a script, gen-pericmd-script (distributed with App::GenPericmdScript), to generate this boilerplate. To see it in action, first install Perinci::Examples (if you haven’t done so) and then run:

% gen-pericmd-script /Perinci/Examples/gen_array

The result is spewed to standard output:

#!/mnt/home/s1/perl5/perlbrew/perls/perl-5.18.4/bin/perl

# Note: This script is a CLI interface to Riap function /Perinci/Examples/gen_array
# and generated automatically using App::GenPericmdScript version 0.04

# DATE
# VERSION

use 5.010001;
use strict;
use warnings;

use Perinci::CmdLine::Any;

Perinci::CmdLine::Any->new(
    url => "/Perinci/Examples/gen_array",
)->run;

# ABSTRACT: Generate an array of specified length
# PODNAME: script

If you analyze the output, the abstract is also written for you. This is taken from the Rinci metadata which is retrieved by gen-pericmd-script via a Riap meta request.

If you specify the -o option, e.g. -o /home/s1/bin/gen-array, the generated script is written to the specified path, its mode is set to 0755, and tab completion is activated (if you have shcompgen installed). There are of course several options to customize the script, like which Perinci::CmdLine backend module to use, whether to activate logging, which subcommands to specify, whether to add some code before instantiating the Perinci::CmdLine object, and so on.

App::GenPericmdScript is actually best used with Dist::Zilla. There’s a plugin called Dist::Zilla::Plugin::Rinci::ScriptFromFunc which uses App::GenPericmdScript to generate scripts for you during build. Suppose you have a dist.ini like this:

name=App-GenArray
version=0.01

[Rinci::ScriptFromFunc]
script= url=/Perinci/Examples/gen_array

[@Classic]

[PodWeaver]
config_plugin=-Rinci

After you run dzil build, you’ll get something like this in App-GenArray-0.01/bin/gen-array:

#!perl

# Note: This script is a CLI interface to Riap function /Perinci/Examples/gen_array
# and generated automatically using App::GenPericmdScript version 0.04

# DATE
# VERSION

use 5.010001;
use strict;
use warnings;

use Perinci::CmdLine::Any;

$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;

Perinci::CmdLine::Any->new(
    url => "/Perinci/Examples/gen_array",
)->run;

# ABSTRACT: Generate an array of specified length
# PODNAME: gen-array

__END__

=pod

=head1 SYNOPSIS

Usage:

 % gen-array [options] <len>

=head1 DESCRIPTION

Also tests result schema.

=head1 OPTIONS

C<*> marks required options.

=over

=item B<--config-path>=I<s>

Set path to configuration file.

Can be specified multiple times.

=item B<--config-profile>=I<s>

Set configuration profile to use.

=item B<--format>=I<s>

Choose output format, e.g. json, text.

=item B<--help>, B<-h>, B<-?>

Display this help message.

=item B<--json>

Set output format to json.

=item B<--len>=I<i>*

Array length.

Default value:

 10

=item B<--naked-res>

When outputing as JSON, strip result envelope.

By default, when outputing as JSON, the full enveloped result is returned, e.g.:

    [200,"OK",[1,2,3],{"func.extra"=>4}]

The reason is so you can get the status (1st element), status message (2nd
element) as well as result metadata/extra result (4th element) instead of just
the result (3rd element). However, sometimes you want just the result, e.g. when
you want to pipe the result for more post-processing. In this case you can use
`--naked-res` so you just get:

    [1,2,3]


=item B<--no-config>

Do not use any configuration file.

=item B<--version>, B<-v>

=back

=head1 ENVIRONMENT

GEN_ARRAY_OPT

=head1 FILES

~/gen-array.conf

/etc/gen-array.conf

=cut

When you run perldoc on this script, you’ll get something like:

GEN-ARRAY(1)               User Contributed Perl Documentation               GEN-ARRAY(1)



SYNOPSIS
       Usage:

        % gen-array [options] <len>

DESCRIPTION
       Also tests result schema.

OPTIONS
       "*" marks required options.

       --config-path=s
           Set path to configuration file.

           Can be specified multiple times.

       --config-profile=s
           Set configuration profile to use.

       --format=s
           Choose output format, e.g. json, text.

       --help, -h, -?
           Display this help message.

       --json
           Set output format to json.

       --len=i*
           Array length.

           Default value:

            10

       --naked-res
           When outputing as JSON, strip result envelope.

           By default, when outputing as JSON, the full enveloped result is returned,
           e.g.:

               [200,"OK",[1,2,3],{"func.extra"=>4}]

           The reason is so you can get the status (1st element), status message (2nd
           element) as well as result metadata/extra result (4th element) instead of just
           the result (3rd element). However, sometimes you want just the result, e.g.
           when you want to pipe the result for more post-processing. In this case you
           can use `--naked-res` so you just get:

               [1,2,3]

       --no-config
           Do not use any configuration file.

       --version, -v

ENVIRONMENT
       GEN_ARRAY_OPT

FILES
       ~/gen-array.conf

       /etc/gen-array.conf

pericmd 042: Using functions from other languages

Since Perinci::CmdLine uses Riap behind the scenes (from getting the Rinci metadata to calling the function), it is possible to use a remote server as the Riap server, even when the server side is not Perl. Below are two examples. The first one uses piping (stdin/stdout) to access a Ruby program on the same server, and the second one uses a TCP server written in Node.js. Note that the two programs are just quick, very ad-hoc hacks; I haven’t actually developed any Riap libraries in those languages. Their main goal is to demonstrate the simplicity of the Riap::Simple protocol.

Ruby over pipe

Save this code to /some/path/to/riap_server.rb:

#!/usr/bin/env ruby

require 'json'

def _res(res)
  res[3] ||= {}
  res[3]['riap.v'] ||= 1.1
  puts "j" + res.to_json
  $stdout.flush
end

while line = $stdin.gets do
  if line =~ /^j(.+)/
    begin
      req = JSON.parse($1)
    rescue Exception => e
      _res [400, "Invalid JSON in Riap request: " + e.message]
      next
    end

    if !req['action']
      _res [400, "Please specify 'action'"]
      next
    end

    if !req['uri']
      _res [400, "Please specify 'uri'"]
      next
    end

    if req['action'] == 'call'
      if req['uri'] == '/cat_array'
        args = req['args'] || {}
        if (!args['a1'])
          _res [400, "Please specify a1"]
          next
        elsif (!args['a2'])
          _res [400, "Please specify a2"]
          next
        end
        _res [200,"OK",args['a1'] + args['a2']]
        next
      else
        _res [404, "Unknown uri"]
        next
      end

    elsif req['action'] == 'meta'
      if req['uri'] == '/cat_array'
        _res [200,"OK",{
                "v" => 1.1,
                "summary" => "Concatenate two arrays together",
                "args" => {
                  "a1" => {
                    "summary" => "First array",
                    "schema" => ["array"],
                    "req" => true,
                  },
                  "a2" => {
                    "summary" => "Second array",
                    "schema" => ["array"],
                    "req" => true,
                  },
                }}]
        next
      else
        _res [404, "Unknown uri"]
        next
      end

    elsif req['action'] == 'info'
      if req['uri'] == '/cat_array'
        _res [200,"OK",{"type" => "function", "uri" => "/cat_array"}]
        next
      else
        _res [404, "Unknown uri"]
        next
      end

    else
      _res [400, "Invalid action"]
      next
    end

  else
    _res [400, "Invalid Riap request"]
    break
  end
end

Now create our CLI program, let’s call it cat-array-ruby:

#!/usr/bin/env perl

use Perinci::CmdLine::Classic;
Perinci::CmdLine::Classic->new(
    url => "riap+pipe:/some/path/to/riap_server.rb////cat_array",
)->run;

Let’s test the CLI program:

% cat-array-ruby --help
cat-array-ruby - Concatenate two arrays together                                                     
Usage                                                                                    
  -e --help (or -h, -?)                                                                  
  -e --version (or -v)                                                                   
  -e [options]                                                                           
Options                                                                                  
  --a1-json=s                                --a1-yaml=s                                 
  --a1=s*                                    --a2-json=s                                 
  --a2-yaml=s                                --a2=s*                                     
  --config-path=s                            --config-profile=s                          
  --debug                                    --format-options=s                          
  --format=s                                 --help, -h, -?                              
  --json                                     --log-level=s                               
  --no-config                                --quiet                                     
  --trace                                    --verbose                                   
  --version, -v                                                                          
For more complete help, use '--help --verbose'.                   

% cat-array-ruby --a1-json '[1,2,3]' --a2-json '[4,5,6]'
┌─────────────────────────────┐
│  1    2    3    4    5    6 │
└─────────────────────────────┘

All the other features you would normally get from a Perinci::CmdLine-based CLI application, like tab completion, output formatting, and so on, work.
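The framing that the Ruby server reads and writes (the letter “j” followed by one line of JSON) can also be sketched from the client side in Perl, using only the core JSON::PP module. The function names are hypothetical, and the CRLF line ending is my assumption; the sketch covers only the “j” framing used in these examples, not the whole Riap::Simple protocol.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts hash keys, so the encoded line is predictable.
my $json = JSON::PP->new->canonical;

# Frame a request: "j", one line of JSON, then a line ending
# (CRLF here is an assumption).
sub encode_request {
    my (%req) = @_;
    return "j" . $json->encode(\%req) . "\015\012";
}

# Parse a response line back into an enveloped result.
sub decode_response {
    my ($line) = @_;
    $line =~ /\Aj(.+)/ or die "Invalid Riap response: $line";
    return $json->decode($1);
}

print encode_request(
    action => 'call',
    uri    => '/cat_array',
    args   => { a1 => [1, 2, 3], a2 => [4, 5, 6] },
);

my $res = decode_response('j[200,"OK",[1,2,3,4,5,6],{"riap.v":1.1}]');
print "$res->[0] $res->[1]: @{ $res->[2] }\n";   # 200 OK: 1 2 3 4 5 6
```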

Node.js over TCP server

Save this code to riap_server.js:

function _res(s, res) {
    if (!res[3]) res[3] = {};
    res[3]['riap.v'] = 1.1;
    s.write("j" + JSON.stringify(res) + "\015\012");
    return;
}

var humanize = require('humanize');
var net = require('net');
var rl = require('readline');
var server = net.createServer(function(socket) { //'connection' listener
    console.log('client connected');
    socket.on('end', function() {
        console.log('client disconnected');
    });
    var i = rl.createInterface(socket, socket);
    i.on('line', function (line) {
        var match = line.match(/^j(.+)/);
        if (match) {
            // XXX error handling?
            var req = JSON.parse(match[1]);
            if (!req['action']) {
                _res(socket, [400, "Please specify action"]);
            } else if (!req['uri']) {
                _res(socket, [400, "Please specify uri"]);

            } else if (req['action'] == 'call') {
                var args = req['args'] || {}
                if (req['uri'] == '/humanize/filesize') {
                    if (!args['size']) {
                        _res(socket, [400, "Please specify size"]);
                    } else {
                        _res(socket, [200, "OK", humanize.filesize(args['size'])]);
                    }
                } else {
                    _res(socket, [404, "Unknown uri"]);
                }

            } else if (req['action'] == 'meta') {
                if (req['uri'] == '/humanize/filesize') {
                    _res(socket, [200, "OK", {
                        "v": 1.1,
                        "summary": "Humanize file size",
                        "args": {
                            "size": {
                                "schema": ["int"],
                                "req": true,
                                "pos": 0
                            }
                        }
                    }]);
                } else {
                    _res(socket, [404, "Unknown uri"]);
                }

            } else if (req['action'] == 'info') {
                if (req['uri'] == '/humanize/filesize') {
                    _res(socket, [200, "OK", {"uri":"/humanize/filesize", "type":"function"}])
                } else {
                    _res(socket, [404, "Unknown uri"]);
                }

            } else {
                _res(socket, [400, "Unknown action"]);
            }
        } else {
            _res(socket, [400, "Invalid Riap request"]);
            socket.destroy();
        }
    });
});
server.listen(5000, function() { //'listening' listener
    console.log('server bound');
});

Install the humanize NPM module (if you don’t have it yet) and run the server:

% npm install humanize
% node riap_server.js
server bound

Prepare our client, let’s call it humanize-filesize:

#!/usr/bin/env perl

use Perinci::CmdLine::Classic;
Perinci::CmdLine::Classic->new(
    url => "riap+tcp://localhost:5000/humanize/filesize",
)->run;

Run our CLI:

% humanize-filesize --help
humanize-filesize - Humanize file size                                      
Usage                                                                                    
  -e --help (or -h, -?)                                                                  
  -e --version (or -v)                                                                   
  -e [options] <size>                                                                    
Options                                                                                  
  --config-path=s                            --config-profile=s                          
  --debug                                    --format-options=s                          
  --format=s                                 --help, -h, -?                              
  --json                                     --log-level=s                               
  --no-config                                --quiet                                     
  --size=i* (=arg[0])                        --trace                                     
  --verbose                                  --version, -v                               
For more complete help, use '--help --verbose'.                        

% humanize-filesize
ERROR 400: Missing required argument(s): size

% humanize-filesize 100200300
95.56 MB

% humanize-filesize 100200300 --json
[
   200,
   "OK",
   "95.56 MB",
   {
      "riap.v": 1.1
   }
]

Note that in this blog post we are using Perinci::CmdLine::Classic instead of Perinci::CmdLine::Any because the default backend Perinci::CmdLine::Lite does not yet support the URL schemes riap+pipe:/ or riap+tcp://. This will be rectified sometime in the future.

pericmd 040: Riap

Perinci::CmdLine, as its name reflects, centers around the concept of Rinci. Functions, as well as packages (and variables and other types of code entities), are annotated with rich metadata so that tools interacting with them have a better idea about the details of the entities and can do useful things with them.

In a Perl module or script, Rinci metadata is placed in a package variable called %SPEC, with the name of the function or variable (along with its sigil) serving as the key (for a package, the key is :package).

But for flexibility, it should be possible to put Rinci metadata elsewhere, even on a remote server.

Thus Riap was born as the other side of the coin. It is a client-server, request-response protocol for exchanging Rinci metadata and doing things with code entities. A request is represented as a hash with these minimum keys: v (protocol version, defaults to 1.1), uri (location of the code entity), and action.

The Riap response is an enveloped result, which has been discussed previously (pericmd 013). Riap adds several keys in the result metadata (fourth element), mainly: riap.v (server Riap protocol version), and some optional others.
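
As a rough illustration of the shapes described above, the sketch below builds a Riap request hash and unpacks an enveloped response using only core Perl modules. The helper names make_riap_request and unpack_riap_response are hypothetical, no network transport is involved, and the response is a hand-written literal. Treating only 2xx statuses as success is a simplifying assumption.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: build a Riap request hash with the minimum keys.
sub make_riap_request {
    my (%req) = @_;
    $req{v} //= 1.1;    # protocol version defaults to 1.1
    for my $key (qw(uri action)) {
        die "Riap request needs '$key'\n" unless defined $req{$key};
    }
    return \%req;
}

# Hypothetical helper: split an enveloped result into its four parts.
sub unpack_riap_response {
    my ($res) = @_;
    my ($status, $message, $payload, $meta) = @$res;
    return {
        ok      => ($status >= 200 && $status <= 299) ? 1 : 0,  # assumption: 2xx = success
        status  => $status,
        message => $message,
        payload => $payload,
        meta    => $meta // {},
    };
}

my $req = make_riap_request(action => 'meta', uri => '/WWW/PAUSE/Simple/');
print JSON::PP->new->canonical->encode($req), "\n";

# A hand-written literal standing in for what a server might send back.
my $res = decode_json('[200,"OK",{"v":1.1,"summary":"An API for PAUSE"},{"riap.v":1.1}]');
my $r   = unpack_riap_response($res);
print "status=$r->{status} riap.v=$r->{meta}{'riap.v'}\n";
```

The fourth element of the response is where Riap-specific keys such as riap.v end up, which is why the sketch keeps it separate from the payload.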

The server side is viewed as a tree of code entities, with packages having the ability to contain other subentities.

The URL can be a schemeless path like /WWW/PAUSE/Simple/ (notice the ending slash), which in a Perl application maps to a Perl package (WWW::PAUSE::Simple), or /WWW/PAUSE/Simple/list_files (notice the lack of ending slash), which maps to a Perl function (or variable, or other non-package entity).
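
The mapping can be sketched in a few lines of Perl. The function riap_path_to_perl below is a hypothetical helper written for this post, not part of any Perinci module, and it only covers the schemeless form discussed here.

```perl
use strict;
use warnings;

# Hypothetical helper: translate a schemeless Riap path into Perl names.
# A trailing slash denotes a package; otherwise the last path element is
# taken to be a function (or other non-package entity) inside the package.
sub riap_path_to_perl {
    my ($path) = @_;
    die "Path must start with /\n" unless $path =~ m{\A/};

    my $is_package = $path =~ m{/\z} ? 1 : 0;
    my @parts = grep { length } split m{/}, $path;

    my $leaf;
    $leaf = pop @parts unless $is_package;

    return {
        type    => $is_package ? 'package' : 'function',
        package => join('::', @parts),
        leaf    => $leaf,
    };
}

for my $path ('/WWW/PAUSE/Simple/', '/WWW/PAUSE/Simple/list_files') {
    my $e = riap_path_to_perl($path);
    printf "%-30s => %s %s%s\n", $path, $e->{type}, $e->{package},
        defined $e->{leaf} ? "::$e->{leaf}" : '';
}
```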

meta

The meta action is one of the most important actions. It is a request for the Rinci metadata itself. So a Riap request like {action=>"meta", uri=>"/WWW/PAUSE/Simple/"} will return a response like the one below (provided you have installed WWW::PAUSE::Simple from CPAN):

[
   200,
   "OK (meta action)",
   {
      "entity_date" : "2015-02-26",
      "entity_v" : "0.07",
      "summary" : "An API for PAUSE",
      "v" : 1.1
   },
   {}
]

How about a request for function metadata: {action=>"meta", uri=>"/WWW/PAUSE/Simple/list_files"}? This might return something like:

[
   200,
   "OK (meta action)",
   {
      "args" : {
         "del" : {
            "schema" : [
               "bool",
               {},
               {}
            ],
            "summary" : "Only list files which are scheduled for deletion",
            "summary.alt.bool.not" : "Only list files which are not scheduled for deletion",
            "tags" : [
               "category:filtering"
            ]
         },
         "detail" : {
            "schema" : [
               "bool",
               {},
               {}
            ],
            "summary" : "Whether to return detailed records"
         },
         "file" : {
            "greedy" : 1,
            "pos" : 0,
            "schema" : [
               "array",
               {
                  "of" : "str*",
                  "req" : 1
               },
               {}
            ],
            "summary" : "File name/wildcard pattern"
         },
         "password" : {
            "is_password" : 1,
            "req" : 1,
            "schema" : [
               "str",
               {
                  "req" : 1
               },
               {}
            ],
            "summary" : "PAUSE password",
            "tags" : [
               "common"
            ]
         },
         "username" : {
            "req" : 1,
            "schema" : [
               "str",
               {
                  "match" : "\\A\\w{2,9}\\z",
                  "max_len" : 9,
                  "req" : 1
               },
               {}
            ],
            "summary" : "PAUSE ID",
            "tags" : [
               "common"
            ]
         }
      },
      "args_as" : "hash",
      "entity_date" : "2015-02-26",
      "entity_v" : "0.07",
      "result_naked" : 0,
      "summary" : "List files on your PAUSE account",
      "v" : "1.1",
      "x.perinci.sub.wrapper.logs" : [
         {
            "normalize_schema" : 1,
            "validate_args" : 1,
            "validate_result" : 1
         }
      ]
   },
   {}
]

You can use a convenient command-line utility called riap to launch Riap requests and browse around a Riap server as if it were a filesystem (install the utility via cpanm -n App::riap), e.g.:

% riap

riap / > meta /WWW/PAUSE/Simple/
┌────────────────────────────────┒
│ key           value            ┃
│                                ┃
│ entity_date   2015-02-26       ┃
│ entity_v      0.07             ┃
│ summary       An API for PAUSE ┃
│ v             1.1              ┃
┕━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

riap / > meta /WWW/PAUSE/Simple/ --json 
[
   200,
   "OK (meta action)",
   {
      "entity_date": "2015-02-26",
      "entity_v": "0.07",
      "summary": "An API for PAUSE",
      "v": 1.1
   }
]

call

Aside from meta, call is another important action. It is used to call a function and return its result. Another request key args (hash) can be added to specify function arguments. A Riap request like {action=>"call", uri=>"/Perinci/Examples/gen_array", args=>{len=>5}} might return (assuming you have installed Perinci::Examples from CPAN):

[200,"OK",[4,2,1,3,5]]

Obviously, only functions can be called. If you try to call a non-function entity, an error will be returned:

% riap

riap / > cd /Perinci
riap /Perinci > call Examples/
ERROR 501: Action 'call' not implemented for 'package' entity (at ... line ...)

Note that the riap utility regards Riap packages as directories and Riap functions as executable files, so instead of:

% riap

riap / > call /Perinci/Examples/gen_array --args '{"len":5}'
┌────────────────────────┒
│  3    1    2    3    5 ┃
┕━━━━━━━━━━━━━━━━━━━━━━━━┛

you can “run” a function as if it were an executable program:

riap / > /Perinci/Examples/gen_array --len 5 --json 
[
   200,
   "OK",
   [
      "2",
      "1",
      "4",
      "3",
      "2"
   ],
   {}
]

pericmd 039: Creating API-friendly CLI applications with parseable outputs

Traditional Unix commands present their result in a nice text output, often tabular. Examples:

% fdisk -l
Disk /dev/sda: 250.0 GB, 250059350016 bytes 255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device     Boot  Start  End    Blocks      Id  System
/dev/sda1  *     1      191    1534176     83  Linux
/dev/sda2        192    2231   16386300    83  Linux
/dev/sda3        2232   3506   10241437+   83  Linux
/dev/sda4        3507   30401  216034087+  5   Extended
/dev/sda5        3507   3767   2096451     82  Linux swap / Solaris
/dev/sda6        3768   3832   522081      83  Linux
/dev/sda7        3833   30401  213415461   83  Linux Disk

/dev/sdb: 250.0 GB, 250059350016 bytes 255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device     Boot  Start  End    Blocks      Id  System
/dev/sdb1  *     1      30401  244196001   83  Linux

% top -b
top - 13:34:07 up 23 days,  9:54, 17 users,  load average: 0.32, 0.35, 0.40
Tasks: 253 total,   1 running, 252 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.5 us,  2.6 sy, 15.5 ni, 76.4 id,  0.9 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  12240268 total, 12015528 used,   224740 free,  1993828 buffers
KiB Swap:        0 total,        0 used,        0 free,  2537996 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 6182 s1        20   0 2731m 1.4g  22m S  12.6 12.1 987:50.33 opera
 2173 s1        20   0 25196 1504 1020 R   6.3  0.0   0:00.01 top
 5500 root      20   0  440m 282m  32m S   6.3  2.4   3414:24 Xorg
 5634 s1        20   0 1117m  20m 7648 S   6.3  0.2   8:54.64 Thunar
 5646 s1        20   0  382m  12m 6520 S   6.3  0.1   0:40.34 xfdesktop
 6054 s1        20   0  912m  92m  23m S   6.3  0.8  65:20.28 konsole
23398 s2        20   0 1951m 656m  38m S   6.3  5.5  37:25.52 iceweasel
    1 root      20   0 10768  784  648 S   0.0  0.0   0:11.71 init
    2 root      20   0     0    0    0 S   0.0  0.0   0:01.01 kthreadd
...

% ls -l
total 100
-rw-r--r-- 1 s1 s1 14754 Feb 27 20:35 Changes
-rw-r--r-- 1 s1 s1  1126 Feb 27 20:35 dist.ini
drwxr-xr-x 4 s1 s1  4096 Jul 23  2014 lib/
drwxr-xr-x 4 s1 s1  4096 Feb 27 20:35 Perinci-CmdLine-Lite-0.88/
-rw-r--r-- 1 s1 s1 61148 Feb 27 20:35 Perinci-CmdLine-Lite-0.88.tar.gz
drwxr-xr-x 2 s1 s1  4096 Feb 27 21:57 t/
-rw-r--r-- 1 s1 s1    21 Aug 26  2014 weaver.ini

% df
Filesystem                                              1K-blocks       Used Available Use% Mounted on
rootfs                                                  115378728   79636360  35742368  70% /
udev                                                        10240          0     10240   0% /dev
tmpfs                                                     1224028        916   1223112   1% /run
/dev/disk/by-uuid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  115378728   79636360  35742368  70% /
tmpfs                                                        5120          0      5120   0% /run/lock
tmpfs                                                     2448040        708   2447332   1% /run/shm
/dev/sdb1                                              1922859912 1910136652  12723260 100% /backup
/dev/mapper/0                                            41284928   38735524    452252  99% /mnt
/dev/mapper/2                                            82569904   55356304  23019296  71% /backup/mnt

While easy on the human eye, they are “hard” to parse. I put hard in quotes because it depends on the application: some outputs just require Perl’s split(/\s+/) or cut because they are a variation of tab-separated values or CSV (but remember that correctly parsing CSV is hard), while others require a custom-crafted complex regex with additional logic.
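
To make the point concrete, here is a small sketch that applies the naive split approach to two lines copied from the df output above. The column names in @cols are my own labels, not something df provides.

```perl
use strict;
use warnings;

# Two lines copied from the df output above.
my @lines = (
    'Filesystem                                              1K-blocks       Used Available Use% Mounted on',
    '/dev/sdb1                                              1922859912 1910136652  12723260 100% /backup',
);

# Naive approach: split every line on whitespace.
my @header = split /\s+/, $lines[0];
my @row    = split /\s+/, $lines[1];

# The header yields 7 fields ("Mounted" and "on" are separate words),
# while the data row yields 6, so names and values cannot simply be zipped.
printf "header fields: %d, row fields: %d\n", scalar(@header), scalar(@row);

# A workaround is to hard-code the column names and cap the number of
# fields, so that a mount point containing spaces stays in one piece.
my @cols = qw(filesystem blocks used available use_pct mounted_on);
my %rec;
@rec{@cols} = split /\s+/, $lines[1], scalar @cols;
print "$rec{filesystem} is $rec{use_pct} full, mounted on $rec{mounted_on}\n";
```

Even in this easy case the parser needs knowledge that is not in the output itself (the real column boundaries), which is exactly the kind of per-command logic that structured output avoids.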

In an age where Perl’s popularity has waned and there is less emphasis on regex skills, the age of webdevs and devops, correctly parsing the output of CLI programs can be quite challenging for some. Many developers are more familiar with HTTP APIs and JSON or the like, where they do not have to parse anything and get the data they want directly as data structures (arrays, hashes, or a combination of both).

Some Unix commands do offer an option to produce more parse-friendly output, but this is rare.

With Perinci::CmdLine we get the best of both worlds. Your function (the backend for the CLI app) produces pure data structures. Formatting is left to the framework to figure out. When run as a CLI app, your users still get a nice table output that is easy on the eyes. But when they need to parse the output or feed it to another program, or access the function via an API, there is an option to produce the data structure (the --json switch in the CLI).

A very simple demonstration. Save the script below to list-files:

#!/usr/bin/env perl

use strict;
use warnings;
use Perinci::CmdLine::Any;

our %SPEC;

$SPEC{list_files} = {
    v => 1.1,
    args => {
        'verbose' => {
            cmdline_aliases => {v=>{}},
            schema => 'bool',
        },
        'all' => {
            cmdline_aliases => {a=>{}},
            schema => 'bool',
        },
    },
};
sub list_files {
    my %args = @_;
    my $verbose = $args{verbose};
    my $all     = $args{all};

    my @files;
    opendir my($dh), "." or return [500, "Can't open current directory: $!"];
    for (sort readdir($dh)) {
        next if !$all && /\A\./;
        if ($verbose) {
            my $type = (-l $_) ? "l" : (-d $_) ? "d" : (-f _) ? "f" : "?";
            push @files, {name=>$_, size=>(-s _), type=>$type};
        } else {
            push @files, $_;
        }
    }

    [200, "OK", \@files];
}

my $app = Perinci::CmdLine::Any->new(url => '/main/list_files');
delete $app->common_opts->{verbose};
$app->common_opts->{version}{getopt} = 'version|V';
$app->run;

% ./list-files
hello
list-files
list-files~
mycomp
mycomp2a
mycomp2b
mycomp2b+comp
pause
perl-App-hello

% ./list-files -v
+----------------+------+------+
| name           | size | type |
+----------------+------+------+
| hello          | 1131 | f    |
| list-files     | 988  | f    |
| list-files~    | 989  | f    |
| mycomp         | 902  | f    |
| mycomp2a       | 608  | f    |
| mycomp2b       | 686  | f    |
| mycomp2b+comp  | 1394 | f    |
| pause          | 4096 | d    |
| perl-App-hello | 4096 | d    |
+----------------+------+------+

% ./list-files --json
[200,"OK",["hello","list-files","list-files~","mycomp","mycomp2a","mycomp2b","mycomp2b+comp","pause","perl-App-hello"],{}]

% ./list-files -v --format json-pretty --naked-res
[
   {
      "name" : "hello",
      "size" : 1131,
      "type" : "f"
   },
   {
      "name" : "list-files",
      "size" : 988,
      "type" : "f"
   },
   {
      "name" : "list-files~",
      "size" : 989,
      "type" : "f"
   },
   {
      "name" : "mycomp",
      "size" : 902,
      "type" : "f"
   },
   {
      "name" : "mycomp2a",
      "size" : 608,
      "type" : "f"
   },
   {
      "name" : "mycomp2b",
      "size" : 686,
      "type" : "f"
   },
   {
      "name" : "mycomp2b+comp",
      "size" : 1394,
      "type" : "f"
   },
   {
      "name" : "pause",
      "size" : 4096,
      "type" : "d"
   },
   {
      "name" : "perl-App-hello",
      "size" : 4096,
      "type" : "d"
   }
]

And the function is available from Perl as well, since it’s just a regular Perl subroutine. You can put it in a module and publish the module on CPAN, and so on. There are also tools to publish your function as a PSGI/Plack-based HTTP API, which we will cover in a future blog post.
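
On the consuming side, a program that reads the --json output only needs a JSON decoder. The sketch below uses the core JSON::PP module and a shortened copy of the enveloped output shown earlier; in a real script the string would come from running the command.

```perl
use strict;
use warnings;
use JSON::PP;

# In a real script this string might come from: my $json = `./list-files --json`;
# Here it is a shortened copy of the output shown above.
my $json = '[200,"OK",["hello","list-files","pause"],{}]';

my ($status, $message, $files, $meta) = @{ decode_json($json) };
die "list-files failed: $status $message\n" unless $status == 200;

# $files is now an ordinary array reference; no regex parsing is needed.
printf "%d entries, first is %s\n", scalar(@$files), $files->[0];
```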

pericmd 038: Getopt::Long::Complete

Getopt::Long::Complete is a module which I created as a drop-in replacement for Getopt::Long. It lets you use the same tab completion features as Perinci::CmdLine without having to get into all the other concepts of Perinci::CmdLine (like Rinci metadata and Riap URLs, output formatting rules, or even subcommands). It’s perfect if your CLI application already uses Getopt::Long and you just want to add tab completion to it.

I personally use this module to write tab completers for other applications (non-Perl, non-Perinci::CmdLine-based). Some examples: App::ShellCompleter::cpanm (for Miyagawa’s cpanm), App::ShellCompleter::emacs (for the Emacs editor), and App::ShellCompleter::CpanUpload (for RJBS’ cpan-upload).

You might also remember Getopt::Long::Subcommand from a previous blog post (pericmd 024). That module likewise lets you use all the Complete::* modules without getting into the whole of Perinci::CmdLine.

An example of how you might use Getopt::Long::Complete can be seen here (reproduced below sans the POD). It is the source code of _cpanm, which you activate in bash using complete -C _cpanm cpanm.

#!perl
 
our $DATE = '2015-02-15'; # DATE
our $VERSION = '0.10'; # VERSION
 
# NO_PERINCI_CMDLINE_SCRIPT
# FRAGMENT id=shcompgen-hint completer=1 for=cpanm
 
use 5.010001;
use strict;
use warnings;
use Log::Any '$log';
 
use Complete::Util qw(complete_array_elem complete_file combine_answers);
use Getopt::Long::Complete qw(GetOptionsWithCompletion);
 
die "This script is for shell completion only\n"
    unless $ENV{COMP_LINE} || $ENV{COMMAND_LINE};
 
my $noop = sub {};
 
# complete with list of installed modules
my $comp_installed_mods = sub {
    require Complete::Module;
 
    my %args = @_;
 
    $log->tracef("Adding completion: installed modules");
    Complete::Module::complete_module(
        word => $args{word},
    );
};
 
# complete with installable stuff
my $comp_installable = sub {
    require Complete::Module;
 
    my %args = @_;
    my $word   = $args{word} // '';
    my $mirror = $args{mirror}; # XXX support multiple mirrors
 
    # if user already types something that looks like a path instead of module
    # name, like '../' or perhaps 'C:\' (windows) then don't bother to complete
    # with module name because it will just delay things without getting any
    # result.
    my $looks_like_completing_module =
        $word eq '' || $word =~ /\A(\w+)(::\w+)*/;
 
    my @answers;
 
    {
        $log->tracef("Adding completion: tarballs & dirs");
        my $answer = complete_file(
            filter => sub { /\.(zip|tar\.gz|tar\.bz2)$/i || (-d $_) },
            word   => $word,
        );
        $log->tracef("  answer: %s", {words=>$answer, path_sep=>'/'});
        push @answers, $answer;
    }
 
    if ($looks_like_completing_module) {
        $log->tracef("Adding completion: installed modules ".
                         "(e.g. when upgrading)");
        my $answer = Complete::Module::complete_module(
            word   => $word,
        );
        $log->tracef("  answer: %s", $answer);
        push @answers, $answer;
    }
 
    # currently we only complete from local CPAN (App::lcpan) if it's available.
    # for remote service, ideally we will need a remote service that quickly
    # returns list of matching PAUSE ids, package/module names, and dist names
    # (CPANDB, XPAN::Query, and MetaCPAN::Client are not ideal because the
    # response time is not conveniently quick enough). i probably will need to
    # setup such completion-oriented web service myself. stay tuned.
    {
        no warnings 'once';
 
        last unless $looks_like_completing_module;
        eval { require App::lcpan }; last if $@;
        $log->tracef("Adding completion: modules from local CPAN mirror");
 
        require Perinci::CmdLine::Util::Config;
 
        my %lcpanargs;
        my $res = Perinci::CmdLine::Util::Config::read_config(
            program_name => "lcpan",
        );
        unless ($res->[0] == 200) {
            $log->tracef("Can't get config for lcpan: %s", $res);
            last;
        }
        my $config = $res->[2];
 
        $res = Perinci::CmdLine::Util::Config::get_args_from_config(
            config => $config,
            args   => \%lcpanargs,
            subcommand_name => 'update-index',
            meta   => $App::lcpan::SPEC{update_local_cpan_index},
        );
        unless ($res->[0] == 200) {
            $log->tracef("Can't get args from config: %s", $res);
            last;
        }
        App::lcpan::_set_args_default(\%lcpanargs);
        my $mods = App::lcpan::list_local_cpan_modules(
            %lcpanargs,
            query => $word . '%',
        );
        #$log->tracef("all mods: %s", $mods);
        my $answer = [grep {
                if ($word =~ /::\z/) {
                    /\A\Q$word\E[^:]+(::)?\z/i
                } else {
                    /\A\Q$word\E[^:]*(::)?\z/i
                }
        } @$mods];
        $log->tracef("  answer: %s", $answer);
        push @answers, $answer;
    }
 
    # TODO module name can be suffixed with '@<version>'
 
    combine_answers(@answers);
};
 
my $comp_file = sub {
    my %args = @_;
 
    complete_file(
        word => $args{word},
        ci   => 1,
    );
};
 
# this is taken from App::cpanminus::script and should be updated from time to
# time.
GetOptionsWithCompletion(
    sub {
        my %args  = @_;
        my $type      = $args{type};
        my $word      = $args{word};
        if ($type eq 'arg') {
            $log->tracef("Completing arg");
            my $seen_opts = $args{seen_opts};
            if ($seen_opts->{'--uninstall'} || $seen_opts->{'--reinstall'}) {
                return $comp_installed_mods->(word=>$word);
            } else {
                return $comp_installable->(
                    word=>$word, mirror=>$seen_opts->{'--mirror'});
            }
        } elsif ($type eq 'optval') {
            my $ospec = $args{ospec};
            my $opt   = $args{opt};
            $log->tracef("Completing optval (opt=$opt)");
            if ($ospec eq 'l|local-lib=s' ||
                    $ospec eq 'L|local-lib-contained=s') {
                return complete_file(filter=>'d', word=>$word);
            } elsif ($ospec eq 'format=s') {
                return complete_array_elem(
                    array=>[qw/tree json yaml dists/], word=>$word);
            } elsif ($ospec eq 'cpanfile=s') {
                return complete_file(word=>$word);
            }
        }
        return [];
    },
    'f|force'   => $noop,
    'n|notest!' => $noop,
    'test-only' => $noop,
    'S|sudo!'   => $noop,
    'v|verbose' => $noop,
    'verify!'   => $noop,
    'q|quiet!'  => $noop,
    'h|help'    => $noop,
    'V|version' => $noop,
    'perl=s'          => $noop,
    'l|local-lib=s'   => $noop,
    'L|local-lib-contained=s' => $noop,
    'self-contained!' => $noop,
    'mirror=s@'       => $noop,
    'mirror-only!'    => $noop,
    'mirror-index=s'  => $noop,
    'cpanmetadb=s'    => $noop,
    'cascade-search!' => $noop,
    'prompt!'         => $noop,
    'installdeps'     => $noop,
    'skip-installed!' => $noop,
    'skip-satisfied!' => $noop,
    'reinstall'       => $noop,
    'interactive!'    => $noop,
    'i|install'       => $noop,
    'info'            => $noop,
    'look'            => $noop,
    'U|uninstall'     => $noop,
    'self-upgrade'    => $noop,
    'uninst-shadows!' => $noop,
    'lwp!'    => $noop,
    'wget!'   => $noop,
    'curl!'   => $noop,
    'auto-cleanup=s' => $noop,
    'man-pages!' => $noop,
    'scandeps'   => $noop,
    'showdeps'   => $noop,
    'format=s'   => $noop,
    'save-dists=s' => $noop,
    'skip-configure!' => $noop,
    'dev!'       => $noop,
    'metacpan!'  => $noop,
    'report-perl-version!' => $noop,
    'configure-timeout=i' => $noop,
    'build-timeout=i' => $noop,
    'test-timeout=i' => $noop,
    'with-develop' => $noop,
    'without-develop' => $noop,
    'with-feature=s' => $noop,
    'without-feature=s' => $noop,
    'with-all-features' => $noop,
    'pp|pureperl!' => $noop,
    "cpanfile=s" => $noop,
    #$self->install_type_handlers,
    #$self->build_args_handlers,
);
 
# ABSTRACT: Shell completer for cpanm
# PODNAME: _cpanm
 
__END__

In the linked source code you’ll see the completion routine passed in as the first argument to the GetOptionsWithCompletion() function (line 140). This is very much like the completion routine you set in the completion property of a function argument specification in Rinci metadata. The routine should accept a hash argument with the usual keys like word, and is expected to return an array or a hash.

One key that Getopt::Long::Complete always passes to the completion routine is type, which has the value “optval” (meaning we are completing the value for a command-line option) or “arg” (meaning we are completing the value of a command-line argument).

When type is “optval”, these keys are also passed: ospec (the Getopt::Long option specification that identifies the option, e.g. “format=s”) and opt (the name of the option, e.g. “--format”). There is also seen_opts, which is a hash containing all the options that have been specified on the command line.

When type is “arg”, this key is also passed: pos (an integer starting from 0 that tells us which argument we are completing the value for).
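
The dispatch on type can be sketched without loading Getopt::Long::Complete at all, by calling the routine by hand with the keys described above. The option names, candidate lists, and the filter_prefix helper are made up for illustration; in a real completer the routine would be passed as the first argument to GetOptionsWithCompletion().

```perl
use strict;
use warnings;

# Hypothetical helper: keep the candidates that start with the typed word.
sub filter_prefix {
    my ($word, @candidates) = @_;
    return [ grep { index($_, $word) == 0 } @candidates ];
}

# A completion routine in the shape described above.
my $completion = sub {
    my %args = @_;
    my $word = $args{word} // '';

    if ($args{type} eq 'optval') {
        # Completing an option value; ospec identifies which option.
        return filter_prefix($word, qw(json yaml tree))
            if $args{ospec} eq 'format=s';
    } elsif ($args{type} eq 'arg') {
        # Completing a positional argument; pos says which one (from 0).
        return filter_prefix($word, qw(install uninstall info))
            if $args{pos} == 0;
    }
    return [];
};

# Simulate the calls that Getopt::Long::Complete would make.
my $optval = $completion->(type => 'optval', ospec => 'format=s',
                           opt => '--format', word => 'j');
my $arg    = $completion->(type => 'arg', pos => 0, word => 'in');
print "optval: @$optval\n";
print "arg:    @$arg\n";
```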

Also you’ll see a new function being used from Complete::Util: combine_answers() (line 125). This function is used to combine two or more completion answers. Each answer can be an array or a hash. If all answers are arrays, the final result will still be an array, but if one of the input answers is a hash, the final result will be a hash.
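
The array-versus-hash rule can be illustrated with a simplified stand-in. The function combine_answers_sketch below is not the actual Complete::Util implementation; it assumes hash answers keep their candidates under a words key (as in the {words=>..., path_sep=>'/'} structure logged in the script above), and its handling of duplicates and of extra keys is my own simplification.

```perl
use strict;
use warnings;

# Simplified stand-in for the combining rule described above; this is not
# the actual Complete::Util code.
sub combine_answers_sketch {
    my @answers = grep { defined } @_;
    my (%seen, @words, %extra);
    my $any_hash = 0;

    for my $ans (@answers) {
        my $list = $ans;
        if (ref $ans eq 'HASH') {
            $any_hash = 1;
            $list = $ans->{words} // [];
            # Assumption: carry over other keys; the first answer to set one wins.
            for my $k (grep { $_ ne 'words' } keys %$ans) {
                $extra{$k} //= $ans->{$k};
            }
        }
        # Assumption: drop duplicate words while preserving order.
        push @words, grep { !$seen{$_}++ } @$list;
    }

    # All arrays in, array out; any hash in, hash out.
    return $any_hash ? { %extra, words => \@words } : \@words;
}

my $arrays = combine_answers_sketch([qw(a b)], [qw(b c)]);
my $mixed  = combine_answers_sketch([qw(a b)], { words => ['c'], path_sep => '/' });
print "arrays only: @$arrays\n";
print "mixed: words=@{ $mixed->{words} } path_sep=$mixed->{path_sep}\n";
```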