pericmd 039: Creating API-friendly CLI applications with parseable outputs

Traditional Unix commands present their results as nicely formatted text, often tabular. Examples:

% fdisk -l
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device     Boot  Start  End    Blocks      Id  System
/dev/sda1  *     1      191    1534176     83  Linux
/dev/sda2        192    2231   16386300    83  Linux
/dev/sda3        2232   3506   10241437+   83  Linux
/dev/sda4        3507   30401  216034087+  5   Extended
/dev/sda5        3507   3767   2096451     82  Linux swap / Solaris
/dev/sda6        3768   3832   522081      83  Linux
/dev/sda7        3833   30401  213415461   83  Linux

Disk /dev/sdb: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device     Boot  Start  End    Blocks      Id  System
/dev/sdb1  *     1      30401  244196001   83  Linux

% top -b
top - 13:34:07 up 23 days,  9:54, 17 users,  load average: 0.32, 0.35, 0.40
Tasks: 253 total,   1 running, 252 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.5 us,  2.6 sy, 15.5 ni, 76.4 id,  0.9 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  12240268 total, 12015528 used,   224740 free,  1993828 buffers
KiB Swap:        0 total,        0 used,        0 free,  2537996 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 6182 s1        20   0 2731m 1.4g  22m S  12.6 12.1 987:50.33 opera
 2173 s1        20   0 25196 1504 1020 R   6.3  0.0   0:00.01 top
 5500 root      20   0  440m 282m  32m S   6.3  2.4   3414:24 Xorg
 5634 s1        20   0 1117m  20m 7648 S   6.3  0.2   8:54.64 Thunar
 5646 s1        20   0  382m  12m 6520 S   6.3  0.1   0:40.34 xfdesktop
 6054 s1        20   0  912m  92m  23m S   6.3  0.8  65:20.28 konsole
23398 s2        20   0 1951m 656m  38m S   6.3  5.5  37:25.52 iceweasel
    1 root      20   0 10768  784  648 S   0.0  0.0   0:11.71 init
    2 root      20   0     0    0    0 S   0.0  0.0   0:01.01 kthreadd
...

% ls -l
total 100
-rw-r--r-- 1 s1 s1 14754 Feb 27 20:35 Changes
-rw-r--r-- 1 s1 s1  1126 Feb 27 20:35 dist.ini
drwxr-xr-x 4 s1 s1  4096 Jul 23  2014 lib/
drwxr-xr-x 4 s1 s1  4096 Feb 27 20:35 Perinci-CmdLine-Lite-0.88/
-rw-r--r-- 1 s1 s1 61148 Feb 27 20:35 Perinci-CmdLine-Lite-0.88.tar.gz
drwxr-xr-x 2 s1 s1  4096 Feb 27 21:57 t/
-rw-r--r-- 1 s1 s1    21 Aug 26  2014 weaver.ini

% df
Filesystem                                              1K-blocks       Used Available Use% Mounted on
rootfs                                                  115378728   79636360  35742368  70% /
udev                                                        10240          0     10240   0% /dev
tmpfs                                                     1224028        916   1223112   1% /run
/dev/disk/by-uuid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  115378728   79636360  35742368  70% /
tmpfs                                                        5120          0      5120   0% /run/lock
tmpfs                                                     2448040        708   2447332   1% /run/shm
/dev/sdb1                                              1922859912 1910136652  12723260 100% /backup
/dev/mapper/0                                            41284928   38735524    452252  99% /mnt
/dev/mapper/2                                            82569904   55356304  23019296  71% /backup/mnt

While easy on human eyes, these outputs are “hard” to parse. I put hard in quotes because it depends on the application: some need only Perl’s split(/\s+/) or cut, because their output is a variation of tab-separated values or CSV (though remember that parsing CSV correctly is hard), while others need a complex hand-crafted regex plus additional logic.
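To make the point concrete, here is a small sketch that runs split over two rows copied from the fdisk listing above. The helper names parse_naive and parse_capped are made up for this illustration. A plain split(/\s+/) breaks the "Linux swap / Solaris" value into several fields, while capping the field count keeps it whole.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.

# Naive approach: split on every run of whitespace.
sub parse_naive {
    my ($row) = @_;
    return split /\s+/, $row;
}

# Cap the number of fields at 6 so the last column keeps its inner spaces.
# This assumes the Boot column is empty, as in the two rows below.
sub parse_capped {
    my ($row) = @_;
    return split /\s+/, $row, 6;
}

# Two rows copied from the fdisk output above.
my @rows = (
    '/dev/sda2        192    2231   16386300    83  Linux',
    '/dev/sda5        3507   3767   2096451     82  Linux swap / Solaris',
);

for my $row (@rows) {
    my @naive  = parse_naive($row);
    my @capped = parse_capped($row);
    printf "%d fields naive, %d fields capped, system=%s\n",
        scalar(@naive), scalar(@capped), $capped[5];
}
```

Even the capped version is fragile: a row with "*" in the Boot column (such as /dev/sda1) has one more field, so the columns shift. This is the kind of additional logic that text parsing tends to accumulate.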

In an age where Perl’s popularity has waned and regex skills get less emphasis, the age of webdevs and devops, correctly parsing the output of CLI programs can be quite challenging for some. Many developers are more familiar with HTTP APIs and JSON or the like, where they do not have to parse anything and instead get the data they want as data structures (arrays, hashes, or a combination of both).

Some Unix commands do offer an option to produce more parse-friendly output, but this is rare.

With Perinci::CmdLine we get the best of both worlds. Your function (the backend for the CLI app) produces pure data structures, and formatting is left to the framework. When run as a CLI app, your users still get a nice table output that is easy on the eyes. But when they need to parse the output, feed it to another program, or access the function via an API, there is an option to produce the data structure instead (the --json switch in the CLI).
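The idea of separating data from presentation can be sketched in a few lines. This is not how Perinci::CmdLine is implemented; render is a hypothetical helper written for this illustration, and it handles only a flat list of strings. It takes a result in the [status, message, payload] form and prints it either for humans or for programs.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper, not part of Perinci::CmdLine: render one enveloped
# result either as plain text (one item per line) or as JSON.
sub render {
    my ($res, $format) = @_;
    my ($status, $message, $payload) = @$res;

    # Programs get the whole envelope, so they can check the status too.
    return JSON::PP->new->canonical->encode($res) . "\n" if $format eq 'json';

    # Humans get an error line or the bare items.
    return "ERROR $status: $message\n" if $status != 200;
    return join("", map { "$_\n" } @$payload);
}

# The backend function only has to produce data like this.
my $res = [200, "OK", ["hello", "list-files", "pause"]];

print render($res, 'text');
print render($res, 'json');
```

The function that produces $res never needs to know which format was requested, which is the property the framework relies on.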

Here is a very simple demonstration. Save the script below as list-files and make it executable:

#!/usr/bin/env perl

use strict;
use warnings;

use Perinci::CmdLine::Any;

our %SPEC;

$SPEC{list_files} = {
    v => 1.1,
    args => {
        'verbose' => {
            cmdline_aliases => {v=>{}},
            schema => 'bool',
        },
        'all' => {
            cmdline_aliases => {a=>{}},
            schema => 'bool',
        },
    },
};
sub list_files {
    my %args = @_;
    my $verbose = $args{verbose};
    my $all     = $args{all};

    my @files;
    opendir(my $dh, ".") or return [500, "Can't open current directory: $!"];
    for (sort readdir($dh)) {
        next if !$all && /\A\./;
        if ($verbose) {
            my $type = (-l $_) ? "l" : (-d $_) ? "d" : (-f _) ? "f" : "?";
            push @files, {name=>$_, size=>(-s _), type=>$type};
        } else {
            push @files, $_;
        }
    }

    [200, "OK", \@files];
}

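my $app = Perinci::CmdLine::Any->new(url => '/main/list_files');
# Remove the framework's common "verbose" option so that it cannot clash
# with the function's own 'verbose' argument.
delete $app->common_opts->{verbose};
# Give --version the short alias -V, leaving -v for 'verbose'.
$app->common_opts->{version}{getopt} = 'version|V';
$app->run;

% ./list-files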
hello
list-files
list-files~
mycomp
mycomp2a
mycomp2b
mycomp2b+comp
pause
perl-App-hello

% ./list-files -v
+----------------+------+------+
| name           | size | type |
+----------------+------+------+
| hello          | 1131 | f    |
| list-files     | 988  | f    |
| list-files~    | 989  | f    |
| mycomp         | 902  | f    |
| mycomp2a       | 608  | f    |
| mycomp2b       | 686  | f    |
| mycomp2b+comp  | 1394 | f    |
| pause          | 4096 | d    |
| perl-App-hello | 4096 | d    |
+----------------+------+------+

% ./list-files --json
[200,"OK",["hello","list-files","list-files~","mycomp","mycomp2a","mycomp2b","mycomp2b+comp","pause","perl-App-hello"],{}]

% ./list-files -v --format json-pretty --naked-res
[
   {
      "name" : "hello",
      "size" : 1131,
      "type" : "f"
   },
   {
      "name" : "list-files",
      "size" : 988,
      "type" : "f"
   },
   {
      "name" : "list-files~",
      "size" : 989,
      "type" : "f"
   },
   {
      "name" : "mycomp",
      "size" : 902,
      "type" : "f"
   },
   {
      "name" : "mycomp2a",
      "size" : 608,
      "type" : "f"
   },
   {
      "name" : "mycomp2b",
      "size" : 686,
      "type" : "f"
   },
   {
      "name" : "mycomp2b+comp",
      "size" : 1394,
      "type" : "f"
   },
   {
      "name" : "pause",
      "size" : 4096,
      "type" : "d"
   },
   {
      "name" : "perl-App-hello",
      "size" : 4096,
      "type" : "d"
   }
]
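On the consuming side, another program does not need any regex. The sketch below decodes the --json output shown above with the core JSON::PP module. The sample string is a shortened copy of that output, and unwrap_envelope is a hypothetical helper name. Treating anything other than status 200 as a failure is a simplification made for this example.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: decode an enveloped result and return its payload,
# dying if the status is not 200 (a simplification for this sketch).
sub unwrap_envelope {
    my ($json) = @_;
    my ($status, $message, $payload, $meta) = @{ decode_json($json) };
    die "command failed: $status $message\n" unless $status == 200;
    return $payload;
}

# Shortened copy of the `./list-files --json` output shown above.
# In a real script this could come from: my $output = `./list-files --json`;
my $output = '[200,"OK",["hello","list-files","pause"],{}]';

my $files = unwrap_envelope($output);
print "$_\n" for @$files;
```

With --naked-res, as in the last run above, the envelope is omitted and the decoded value is the payload itself.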

The function is available from Perl as well; it is just a regular Perl subroutine. You can put it in a module, publish the module on CPAN, and so on. There are also tools to publish your function as a PSGI/Plack-based HTTP API, which we will cover in a future blog post.
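As a rough sketch of calling it from Perl, the example below puts a trimmed-down variant of the function (without the verbose branch) into a package and calls it directly. The package name My::ListFiles is hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

package My::ListFiles;    # hypothetical module name

# Trimmed-down variant of list_files: returns names only, no verbose mode.
sub list_files {
    my %args = @_;
    opendir(my $dh, ".") or return [500, "Can't open current directory: $!"];
    my @files = grep { $args{all} || !/\A\./ } sort readdir($dh);
    closedir $dh;
    return [200, "OK", \@files];
}

package main;

# Call it like any other subroutine and unpack the enveloped result.
my $res = My::ListFiles::list_files(all => 1);
my ($status, $message, $files) = @$res;

if ($status == 200) {
    print "$_\n" for @$files;
} else {
    warn "list_files failed: $status $message\n";
}
```

The caller gets an array reference back and can loop over it or pass it on without parsing any text.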
