pericmd 002: Getopt::Long

Getopt::Long (cpanratings) is an established Perl module for parsing command-line options. A bit of history (via some googling): it has been included as part of the Perl distribution since perl-5.0.0 back in 1994. It was actually created as newgetopt.pl during the pre-module days of perl 4.x. The creator is Netherlands-based Johan Vromans (CPAN ID: JV), who also wrote the Perl Pocket Reference (one of the well-respected Perl books, if I remember). Amazingly, after 20+ years, Johan still maintains Getopt::Long to this very day. So, deep kudos for that.

Why the ::Long in the name, you might ask? Because before newgetopt.pl, there was already a Perl 4 library called getopt.pl, which became Getopt::Std in Perl 5. It can only handle single-letter options, so we will not consider using it in this series.

Here’s a typical CLI program written using Getopt::Long, taken from my scripts repository:

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

use Getopt::Long;

my %opts = (parts => ["/backup"], all=>0, percent_used=>95);

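# GetOptions() returns false when the command line contains errors
# (unknown option, missing value, wrong type), so check its result.
GetOptions(
    "part=s"             => $opts{parts},   # array ref: each --part appends
    "all|a"              => \$opts{all},
    "percent-used|pct=i" => \$opts{percent_used},
    "help|h|?"           => \$opts{help},
) or die "Error in command-line arguments, see --help\n";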

if ($opts{help}) {
    print <<_;
Usage: $0 [opts]
Options:
  --part=S  Add partition S to warn, eg. --part=/home --part=/
  --all     Warn all partitions (except tmpfs)
  --percent-used=N  Warn if percentage of usage exceeds N (default: 95)

_
    exit 0;
}

my $i;
for (`df -h`) {
    next unless $i++;
    chomp;
    my ($fs, $blocks, $used, $avail, $pctuse, $mp) = split /\s+/, $_;
    $pctuse =~ s/%//;
    #say "DEBUG: mp=$mp, pctuse=$pctuse";

    next if $fs =~ /^(tmpfs)$/;
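    # grep replaces the smartmatch operator (~~), which is experimental
    # and warns on newer perls.
    if (!$opts{all}) { next unless grep { $_ eq $mp } @{ $opts{parts} } }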
    next unless $pctuse >= $opts{percent_used};
    say "Disk usage in $mp has exceeded $pctuse% (only $avail left)!";
}

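As you can see, using the module is pretty straightforward: you pass a list of option specifications, each paired with a destination (usually a reference to a variable), to the GetOptions() function, which will search for the options in @ARGV. The function modifies @ARGV and strips all the options it processes, leaving only the non-option arguments. By default, an unknown option is reported as an error and makes GetOptions() return false; it is only left in @ARGV if you enable the pass_through configuration.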
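To make the effect on the argument list concrete, here is a minimal sketch. It uses GetOptionsFromArray(), which behaves like GetOptions() but parses an array you supply instead of @ARGV. The command line in @args is made up for illustration, and the option names are borrowed from the script above.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical command line, standing in for @ARGV.
my @args = ('--all', 'report.txt', '--percent-used', '80', 'extra');

my %opts = (all => 0, percent_used => 95);
my $ok = GetOptionsFromArray(
    \@args,
    'all|a'              => \$opts{all},
    'percent-used|pct=i' => \$opts{percent_used},
);

print "ok=", ($ok ? 1 : 0), "\n";
print "all=$opts{all} percent_used=$opts{percent_used}\n";

# Options and their values have been stripped; only the non-option
# arguments are left in @args.
print "remaining: @args\n";
```

Options and arguments may be mixed freely here because Getopt::Long permutes the argument list by default (unless the POSIXLY_CORRECT environment variable is set).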

The only thing you’ll need to familiarize yourself with is basically the options specification. Short of a formal grammar, you might want to glance at the syntax of the options spec via the regex used in Getopt::Long::Util, copy-pasted here for convenience:

    $optspec =~ qr/\A
               (?:--)?
               (?P<name>[A-Za-z0-9_][A-Za-z0-9_-]*)
               (?P<aliases> (?: \| (?:[^:|!+=:-][^:|!+=:]*) )*)?
               (?:
                   (?P<is_neg>!) |
                   (?P<is_inc>\+) |
                   (?:
                       =
                       (?P<type>[siof])
                       (?P<desttype>|[%@])?
                       (?:
                           \{
                           (?: (?P<min_vals>\d+), )?
                           (?P<max_vals>\d+)
                           \}
                       )?
                   ) |
                   (?:
                       :
                       (?P<opttype>[siof])
                       (?P<desttype>|[%@])
                   ) |
                   (?:
                       :
                       (?P<optnum>\d+)
                       (?P<desttype>|[%@])
                   ) |
                   (?:
                       :
                       (?P<optplus>\+)
                       (?P<desttype>|[%@])
                   )
               )?
               \z/x
                   or return undef;

So, an option is more or less a “word” which can be followed by zero or more aliases. An alias is usually a single letter (like “a” for “all”) or a shorter version of the option (like “percent-used” vs “pct”), but can also be some non-alphanumeric characters (like “?” as alias for “help”).
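A small sketch of how aliases behave, assuming the same two specs as in the script above. The parse() helper is hypothetical and only exists to run several made-up command lines through the same specification.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse one command line and return the results.
sub parse {
    my @args = @_;
    my %opts;
    GetOptionsFromArray(
        \@args,
        'percent-used|pct=i' => \$opts{percent_used},
        'help|h|?'           => \$opts{help},
    ) or die "Error in command-line arguments\n";
    return \%opts;
}

# The primary name and its alias store into the same destination.
my $long  = parse('--percent-used', '80');
my $short = parse('--pct=80');

# A non-alphanumeric alias works the same way.
my $help  = parse('-?');

print "long:  $long->{percent_used}\n";
print "short: $short->{percent_used}\n";
print "help:  ", ($help->{help} ? "yes" : "no"), "\n";
```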

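If an option requires a value, you can add “=” followed by one of these letters: s (for string), i (for integer), f (for float), o (for extended integer, which also accepts octal, hexadecimal, and binary notation; seldom used nowadays). Getopt::Long also allows you to make the value optional by using “:” instead of “=”, e.g. “debug:i” instead of “debug=i”, so the user can specify any of these: --debug, --debug=9 (or --debug 9). When the value is omitted, a numeric option is set to 0 and a string option to the empty string.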
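The difference between a mandatory (“=”) and an optional (“:”) value can be sketched like this. The option names debug and level and both helper subs are illustrative assumptions, not part of any real program.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: "debug:i" makes the integer value optional.
sub parse_debug {
    my @args = @_;
    my $debug;
    GetOptionsFromArray(\@args, 'debug:i' => \$debug) or return;
    return $debug;
}

# Hypothetical helper: "level=i" makes the integer value mandatory.
# Returns the parse status and any warnings Getopt::Long emitted.
sub parse_level {
    my @args = @_;
    my ($level, @warnings);
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my $ok = GetOptionsFromArray(\@args, 'level=i' => \$level);
    return ($ok, $level, \@warnings);
}

my $none     = parse_debug();               # option not given at all
my $bare     = parse_debug('--debug');      # given without a value
my $attached = parse_debug('--debug=9');
my $separate = parse_debug('--debug', '9');

my ($ok, $level, $warnings) = parse_level('--level');   # value missing

print "bare=$bare attached=$attached separate=$separate\n";
print "mandatory value missing: ", ($ok ? "accepted" : "rejected"), "\n";
```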

Getopt::Long even allows you to specify multiple values, e.g. “rgb-color=i{3}” so user can specify options like: --rgb-color 255 255 0. However, you can’t do something like this (specifying different type for each value): “person-age=s,i”. I seldom use this feature, though.
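As a rough sketch of the repeat specifier, with an array as the destination. The trailing image.png argument is made up, to show that exactly three values are consumed.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical command line: one option taking three integers,
# followed by an ordinary argument.
my @args = ('--rgb-color', '255', '255', '0', 'image.png');

my @rgb;
GetOptionsFromArray(\@args, 'rgb-color=i{3}' => \@rgb)
    or die "Error in command-line arguments\n";

print "rgb: @rgb\n";
print "remaining: @args\n";
```

Note that the Getopt::Long documentation describes options with multiple values as an experimental feature.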

There are other features of Getopt::Long, some very useful, some rarely used nowadays. We will cover these in the next posts because Perinci::CmdLine uses Getopt::Long for command-line options parsing.

Compared to other (newer) options processing modules on CPAN that reinvent the wheel, chances are Getopt::Long is still your best bet. It’s battle tested, comes with Perl 5 out of the box, and already covers most of the common needs for options processing. Most of the other options processing modules don’t even do the two things that we usually take for granted when using common Unix/GNU utilities: short option bundling and auto abbreviation.

Short option bundling is a feature that lets you bundle two or more single-letter options in a single command-line argument. For example, when using rsync I usually write: rsync -avz … instead of rsync -a -v -z … The short options must not take values, except for the last one. In Getopt::Long, bundling is not enabled by default; you turn it on with the bundling configuration option.
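Here is a minimal sketch of bundling with Getopt::Long. The option names merely mimic the rsync example and the paths are made up. Results are stored in a hash under each option's primary name.

```perl
use strict;
use warnings;
# Everything after ":config" in the import list is passed to the
# module's configuration; "bundling" is off unless requested.
use Getopt::Long qw(GetOptionsFromArray :config bundling);

# Hypothetical command line with three bundled single-letter options.
my @args = ('-avz', 'src/', 'dest/');

my %opt;
GetOptionsFromArray(\@args, \%opt, 'archive|a', 'verbose|v', 'compress|z')
    or die "Error in command-line arguments\n";

print join(", ", map { "$_=$opt{$_}" } sort keys %opt), "\n";
print "remaining: @args\n";
```

With bundling enabled, long option names have to be written with a double dash, since a single dash now introduces a bundle of short options.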

Auto abbreviation is a feature that lets you type only partial (the first letters of the) option names, as long as it is unambiguous. For example, if you have an option called --process-directories, you can just specify --process-dir or even --process, as long as it’s unambiguous with the other existing options. This is convenient.
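A sketch of both sides of auto abbreviation, assuming a second, hypothetical option --process-files so that --process on its own becomes ambiguous. The parse() helper is also made up; it captures the warning Getopt::Long emits instead of letting it go to STDERR.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: returns the parse status, the parsed options,
# and any warnings emitted while parsing.
sub parse {
    my @args = @_;
    my (%opt, @warnings);
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my $ok = GetOptionsFromArray(
        \@args, \%opt,
        'process-directories=s',
        'process-files=s',
    );
    return ($ok, \%opt, \@warnings);
}

# Unambiguous prefix: resolves to --process-directories.
my ($ok1, $opt1) = parse('--process-dir', '/tmp');

# Ambiguous prefix: matches both options, so parsing fails.
my ($ok2, $opt2, $warn2) = parse('--process', '/tmp');

print "unambiguous: $opt1->{'process-directories'}\n" if $ok1;
print "ambiguous:   ", ($ok2 ? "accepted" : "rejected"), "\n";
print "warning:     $_" for @$warn2;
```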

In the next post I’ll write about what’s wrong with Getopt::Long.

3 thoughts on “pericmd 002: Getopt::Long”

  1. I realise this post is old, but I just wanted to point out, for simple options, you don’t have to manually assign each one to a hash key, you can simply do…

    ```
    GetOptions(
        \my %opt,
        'string=s',
        'optional:s',
        'verbose',
        'help' => \&help,
    );
    ```

    Now running the app with ‘-s foo -o -v’ will set `$opt{string} = 'foo'` and `$opt{verbose} = 1`.

    `$opt{optional}` is then set to an empty string, which is ‘falsy’, but it can be tested with `if exists`, so you might do something like… Otherwise, if you use ‘-o bar’, it sets `$opt{optional} = 'bar'`, as above.

    Lastly, running with ‘-h’ will call help(), so no need to cram a HEREDOC into your GetOptions.
