Sun, 25 Jul 2010

Currying


Permanent link

NAME

"Perl 5 to 6" Lesson 28 - Currying

SYNOPSIS

  use v6;
  
  my &f := &substr.assuming('Hello, World');
  say f(0, 2);                # He
  say f(3, 2);                # lo
  say f(7);                   # World
  
  say <a b c>.map: * x 2;     # aabbcc
  say <a b c>.map: *.uc;      # ABC
  for ^10 {
      print <R G B>.[$_ % *]; # RGBRGBRGBR
  }

DESCRIPTION

Currying or partial application is the process of generating a function from another function or method by providing only some of the arguments. This is useful for saving typing, and when you want to pass a callback to another function.

Suppose you want a function that lets you extract substrings from "Hello, World" easily. The classical way of doing that is writing your own function:

  sub f(*@a) {
      substr('Hello, World', |@a)
  }

Currying with assuming

Perl 6 provides a method assuming on code objects, which applies the arguments passed to it to the invocant, and returns the partially applied function.

  my &f := &substr.assuming('Hello, World');

Now f(1, 2) is the same as substr('Hello, World', 1, 2).

assuming also works on operators, because operators are just subroutines with weird names. To get a subroutine that adds 2 to whatever number gets passed to it, you could write

  my &add_two := &infix:<+>.assuming(2);

But that's tedious to write, so there's another option.

Currying with the Whatever-Star

  my &add_two := * + 2;
  say add_two(4);         # 6

The asterisk, called Whatever, is a placeholder for an argument, so the whole expression returns a closure. Multiple Whatevers are allowed in a single expression, and create a closure that expects more arguments, by replacing each term * by a formal parameter. So * * 5 + * is equivalent to -> $a, $b { $a * 5 + $b }.

  my $c = * * 5 + *;
  say $c(10, 2);                # 52

Note that the second * is an infix operator, not a term, so it is not subject to Whatever-currying.

The process of lifting an expression with Whatever stars into a closure is driven by syntax, and done at compile time. This means that

  my $star = *;
  my $code = $star + 2

does not construct a closure, but instead dies with a message like

  Can't take numeric value for object of type Whatever

Whatever currying is more versatile than .assuming, because it allows to curry something else than the first argument very easily:

  say  ~(1, 3).map: 'hi' x *    # hi hihihi

This curries the second argument of the string repetition operator infix x, so it returns a closure that, when called with a numeric argument, produces the string hi as often as that argument specifies.

The invocant of a method call can also be Whatever star, so

  say <a b c>.map: *.uc;      # ABC

involves a closure that calls the uc method on its argument.

MOTIVATION

Perl 5 could be used for functional programming, which has been demonstrated in Mark Jason Dominus' book Higher Order Perl.

Perl 6 strives to make it even easier, and thus provides tools to make typical constructs in functional programming easily available. Currying and easy construction of closures is a key to functional programming, and makes it very easy to write transformation for your data, for example together with map or grep.

SEE ALSO

http://perlcabal.org/syn/S02.html#Built-In_Data_Types

http://hop.perl.plover.com/

http://en.wikipedia.org/wiki/Currying

[/perl-5-to-6] Permanent link

comments / trackbacks

Thu, 22 Jul 2010

Common Perl 6 data processing idioms


Permanent link

NAME

"Perl 5 to 6" Lesson 27 - Common Perl 6 data processing idioms

SYNOPSIS

  # create a hash from a list of keys and values:
  # solution 1: slices
  my %hash; %hash{@keys} = @values;
  # solution 2: meta operators
  my %hash = @keys Z=> @values;

  # create a hash from an array, with
  # true value for each array item:
  my %exists = @keys Z=> 1 xx *;

  # limit a value to a given range, here 0..10.
  my $x = -2;
  say 0 max $x min 10;

  # for debugging: dump the contents of a variable,
  # including its name, to STDERR
  note :$x.perl;

  # sort case-insensitively
  say @list.sort: *.lc;

  # mandatory attributes
  class Something {
      has $.required = die "Attribute 'required' is mandatory";
  }
  Something.new(required => 2); # no error
  Something.new()               # BOOM

DESCRIPTION

Learning the specification of a language is not enough to be productive with it. Rather you need to know how to solve specific problems. Common usage patterns, called idioms, helps you not having to re-invent the wheel every time you're faced with a problem.

So here a some common Perl 6 idioms, dealing with data structures.

Hashes

  # create a hash from a list of keys and values:
  # solution 1: slices
  my %hash; %hash{@keys} = @values;
  # solution 2: meta operators
  my %hash = @keys Z=> @values;

The first solution is the same you'd use in Perl 5: assignment to a slice. The second solution uses the zip operator Z, which joins to list like a zip fastener: 1, 2, 3 Z 10, 20, 30 is 1, 10, 2, 20, 3, 30. The Z=> is a meta operator, which combines zip with => (the Pair construction operator). So 1, 2, 3 Z=> 10, 20, 30 evaluates to 1 => 10, 2 => 20, 3 => 30. Assignment to a hash variable turns that into a Hash.

For existence checks, the values in a hash often doesn't matter, as long as they all evaluate to True in boolean context. In that case, a nice way to initialize the hash from a given array or list of keys is

  my %exists = @keys Z=> 1 xx *;

which uses a lazy, infinite list of 1s on the right-hand side, and relies on the fact that Z ends when the shorter list is exhausted.

Numbers

Sometimes you want to get a number from somewhere, but clip it into a predefined range (for example so that it can act as an array index).

In Perl 5 you often end up with things like $a = $b > $upper ? $upper : $b, and another conditional for the lower limit. With the max and min infix operators, that simplifies considerably to

  my $in-range = $lower max $x min $upper;

because $lower max $x returns the larger of the two numbers, and thus clipping to the lower end of the range.

Since min and max are infix operators, you can also clip infix:

 $x max= 0;
 $x min= 10;

Debugging

Perl 5 has Data::Dumper, Perl 6 objects have the .perl method. Both generate code that reproduces the original data structure as faithfully as possible.

:$var generates a Pair ("colonpair"), using the variable name as key (but with sigil stripped). So it's the same as var => $var. note() writes to the standard error stream, appending a newline. So note :$var.perl is quick way of obtaining the value of a variable for debugging; purposes, along with its name.

Sorting

Like in Perl 5, the sort built-in can take a function that compares two values, and then sorts according to that comparison. Unlike Perl 5, it's a bit smarter, and automatically does a transformation for you if the function takes only one argument.

In general, if you want to compare by a transformed value, in Perl 5 you can do:

    # WARNING: Perl 5 code ahead
    my @sorted = sort { transform($a) cmp transform($b) } @values;

    # or the so-called Schwartzian Transform:
    my @sorted = map { $_->[1] }
                 sort { $a->[0] cmp $b->[0] }
                 map { [transform($_), $_] }
                 @values

The former solution requires repetitive typing of the transformation, and executes it for each comparison. The second solution avoids that by storing the transformed value along with the original value, but it's quite a bit of code to write.

Perl 6 automates the second solution (and a bit more efficient than the naiive Schwartzian transform, by avoiding an array for each value) when the transformation function has arity one, ie accepts one argument only:

    my @sorted = sort &transform, @values;

Mandatory Attributes

The typical way to enforce the presence of an attribute is to check its presence in the constructor - or in all constructors, if there are many.

That works in Perl 6 too, but it's easier and safer to require the presence at the level of each attribute:

    has $.attr = die "'attr' is mandatory";

This exploits the default value mechanism. When a value is supplied, the code for generating the default value is never executed, and the die never triggers. If any constructor fails to set it, an exception is thrown.

MOTIVATION

N/A

[/perl-5-to-6] Permanent link

comments / trackbacks

Thu, 09 Jul 2009

Exceptions and control exceptions


Permanent link

NAME

"Perl 5 to 6" Lesson 26 - Exceptions and control exceptions

SYNOPSIS

    try {
        die "OH NOEZ";

        CATCH { 
            say "there was an error: $!";
        }
    }

DESCRIPTION

Exceptions are, contrary to their name, nothing exceptional. In fact they are part of the normal control flow of programs in Perl 6.

Exceptions are generated either by implicit errors (for example dividing by zero, calling a non-existing method, type check failures) or by explicitly calling die or other functions.

When an exception is thrown, the program searches for CATCH statements or try blocks in the caller frames, unwinding the stack all the way (that means it forcibly returns from all routines called so far). If no CATCH or try is found, the program terminates, and prints out a hopefully helpful error message. If one was found, the error message is stored in the special variable $!, and the CATCH block is executed (or in the case of a try without a CATCH block the try block returns undef).

So far exceptions might still sound exceptional, but error handling is integral part of each non-trivial application. But even more, normal return statements also throw exceptions!

They are called control exceptions, and can be caught with CONTROL blocks, or are implicitly caught at each routine declaration.

Consider this example:

    use v6;
    my $block = -> { return "block"; say "still here" };

    sub s {
        $block.();
        return "sub";
    }

    say s();

Here the return "block" throws a control exception, causing it to not only exit the current block (and thus not printing still here on the screen), but also exiting the subroutine, where it is caught by the sub s... declaration. The payload, here a string, is handed back as the return value, and the say in the last line prints it to the screen.

Embedding the call $block.() in a try { ... } block or adding a CONTROL { ... } block to the body of the routine causes it to catch the exception.

Contrary to what other programming languages do, the CATCH/CONTROL blocks are within the scope in which the error is caught (not on the outside), giving it full access to the lexical variables, which makes it easier to generate useful error message, and also prevents DESTROY blocks from being run before the error is handled.

Unthrown exceptions

Perl 6 embraces the idea of multi threading, and in particular automated parallelization. To make sure that not all threads suffer from the termination of a single thread, a kind of "soft" exception was invented.

When a function calls fail($obj), it returns a special value of undef, which contains the payload $obj (usually an error message) and the back trace (file name and line number). Processing that special undefined value without check if it's undefined causes a normal exception to be thrown.

    my @files = </etc/passwd /etc/shadow nonexisting>;
    my @handles = hyper map { open($_) }, @files;

In this example the hyper operator tells map to parallelize its actions as far as possible. When the opening of the nonexisting file fails, an ordinary die "No such file or directory" would also abort the execution of all other open operations. But since a failed open calls fail("No such file or directory" instead, it gives the caller the possibility to check the contents of @handles, and it still has access to the full error message.

If you don't like soft exceptions, you say use fatal; at the start of the program and cause all exceptions from fail() to be thrown immediately.

MOTIVATION

A good programming language needs exceptions to handle error conditions. Always checking return values for success is a plague and easily forgotten.

Since traditional exceptions can be poisonous for implicit parallelism, we needed a solution that combined the best of both worlds: not killing everything at once, and still not losing any information.

[/perl-5-to-6] Permanent link

comments / trackbacks

Wed, 27 May 2009

The Cross Meta Operator


Permanent link

NAME

"Perl 5 to 6" Lesson 25 - The Cross Meta Operator

SYNOPSIS

    for <a b> X 1..3 -> $a, $b {
        print "$a: $b   ";
    }
    # output: a: 1  a: 2  a: 3  b: 1  b: 2  b: 3

    .say for <a b c> X 1, 2;
    # output: a1\n a2\n b1\n b2\n c1\n c2\n
    # (with real newlines instead of \n)

DESCRIPTION

The cross operator X returns the Cartesian product of two or more lists, which means that it returns all possible tuples where the first item is an item of the first list, the second item is an item of second list etc.

If an operator follows the X, then this operator is applied to all tuple items, and the result is returned instead. So 1, 2 X+ 3, 6 will return the values 1+3, 1+6, 2+3, 2+6 (evaluated as 4, 7, 5, 8 of course).

MOTIVATION

It's quite common that one has to iterate over all possible combinations of two or more lists, and the cross operator can condense that into a single iteration, thus simplifying programs and using up one less indentation level.

The usage as a meta operator can sometimes eliminate the loops altogether.

SEE ALSO

http://perlcabal.org/syn/S03.html#Cross_operators,

[/perl-5-to-6] Permanent link

comments / trackbacks

Wed, 10 Dec 2008

The Reduction Meta Operator


Permanent link

NAME

"Perl 5 to 6" Lesson 24 - The Reduction Meta Operator

SYNOPSIS

    say [+] 1, 2, 3;    # 6
    say [+] ();         # 0
    say [~] <a b>;      # ab
    say [**] 2, 3, 4;   # 2417851639229258349412352

    [\+] 1, 2, 3, 4     # 1, 3, 6, 10
    [\**] 2, 3, 4       # 4, 81, 2417851639229258349412352

    if [<=] @list {
        say "ascending order";
    }

Description

The reduction meta operator [...] can enclose any associative infix operator, and turn it into a list operator. This happens as if the operator was just put between the items of the list, so [op] $i1, $i2, @rest returns the same result as if it was written as $i1 op $i2 op @rest[0] op @rest[1] ....

This is a very powerful construct that promotes the plus + operator into a sum function, ~ into a join (with empty separator) and so on. It is somewhat similar to the List.reduce function, and if you had some exposure to functional programming, you'll probably know about foldl and foldr (in Lisp or Haskell). Unlike those [...] respects the associativity of the enclosed operator, so [/] 1, 2, 3 is interpreted as (1 / 2) / 3 (left associative), [**] 1, 2, 3 is handled correctly as 1 ** (2**3) (right associative).

Like all other operators whitespaces are forbidden, so you while you can write [+], you can't say [ + ].

Since comparison operators can be chained, you can also write things like

    if    [==] @nums { say "all nums in @nums are the same" }
    elsif [<]  @nums { say "@nums is in strict ascending order" }
    elsif [<=] @nums { say "@nums is in ascending order"}

You can even reduce the assignment operator:

    my @a = 1..3;
    [=] @a, 4;          # same as @a[0] = @a[1] = @a[2] = 4;

Note that [...] always returns a scalar, so [,] @list is actually the same as [@list].

Getting partial results

There's a special form of this operator that uses a backslash like this: [\+]. It returns a list of the partial evaluation results. So [\+] 1..3 returns the list 1, 1+2, 1+2+3, which is of course 1, 3, 6.

    [\~] 'a' .. 'd'     # <a ab abc abcd>

Since right-associative operators evaluate from right to left, you also get the partial results that way:

    [\**] 1..3;         # 3, 2**3, 1**(2**3), which is 3, 8, 1

Multiple reduction operators can be combined:

    [~] [\**] 1..3;     # "381"

MOTIVATION

Programmers are lazy, and don't want to write a loop just to apply a binary operator to all elements of a list. List.reduce does something similar, but it's not as terse as the meta operator ([+] @list would be @list.reduce(&infix:<+>)), and takes care of the associativity of the operator.

If you're not convinced, play a bit with it (pugs mostly implements it), it's real fun.

SEE ALSO

http://perlcabal.org/syn/S03.html#Reduction_operators, http://www.perlmonks.org/?node_id=716497

[/perl-5-to-6] Permanent link

comments / trackbacks

Tue, 09 Dec 2008

Quoting and Parsing


Permanent link

NAME

"Perl 5 to 6" Lesson 23 - Quoting and Parsing

SYNOPSIS

    my @animals = <dog cat tiger>
    # or
    my @animals = qw/dog cat tiger/;
    # or 
    
    my $interface = q{eth0};
    my $ips = q :s :x /ifconfig $interface/;

    # -----------

    sub if {
        warn "if() calls a sub\n";
    }
    if();

DESCRIPTION

Quoting

Perl 6 has a powerful mechanism of quoting strings, you have exact control over what features you want in your string.

Perl 5 had single quotes, double quotes and qw(...) (single quotes, splitted on whitespaces) as well as the q(..) and qq(...) forms which are basically synonyms for single and double quotes.

Perl 6 in turn defines a quote operator named Q that can take various modifiers. The :b (backslash) modifier allows interpolation of backslash escape sequences like \n, the :s modifier allows interpolation of scalar variables, :c allows the interpolation of closures ("1 + 2 = { 1 + 2 }") and so on, :w splits on words as qw/.../ does.

You can arbitrarily combine those modifiers. For example you might wish a form of qw/../ that interpolates only scalars, but nothing else? No problem:

    my $stuff = "honey";
    my @list = Q :w :s/milk toast $stuff with\tfunny\nescapes/;
    say @list[*-1];                     # prints with\nfunny\nescapes

Here's a list of what modifiers are available, mostly stolen from S02 directly. All of these also have long names, which I omitted here.

    Features:
        :q          Interpolate \\, \q and \'
        :b          Other backslash escape sequences like \n, \t
    Operations:
        :x          Execute as shell command, return result
        :w          Split on whitespaces
        :ww         Split on whitespaces, with quote protection
    Variable interpolation
        :s          Interpolate scalars   ($stuff)
        :a          Interpolate arrays    (@stuff[])
        :h          Interpolate hashes    (%stuff{})
        :f          Interpolate functions (&stuff())
    Other
        :c          Interpolate closures  ({code})
        :qq         Interpolate with :s, :a, :h, :f, :c, :b
        :regex      parse as regex

There are some short forms which make life easier for you:

    q       Q:q
    qq      Q:qq
    m       Q:regex

You can also omit the first colon : if the quoting symbol is a short form, and write it as a singe word:

    symbol      short for
    qw          q:w
    Qw          Q:w
    qx          q:x
    Qc          Q:c
    # and so on.

However there is one form that does not work, and some Perl 5 programmers will miss it: you can't write qw(...) with the round parenthesis in Perl 6. It is interpreted as a call to sub qw.

Parsing

This is where parsing comes into play: Every construct of the form identifier(...) is parsed as sub call. Yes, every.

    if($x<3)

is parsed as a call to sub if. You can disambiguate with whitespaces:

    if ($x < 3) { say '<3' }

Or just omit the parens altogether:

    if $x < 3 { say '<3' }

This implies that Perl 6 has no keywords. Actually there are keywords like use or if, but they are only special if used in a special syntax.

MOTIVATION

Various combinations of the quoting modifiers are already used internally, for example q:w to parse <...>, and :regex for m/.../. It makes sense to expose these also to the user, who gains flexibility, and can very easily write macros that provide a shortcut for the exact quoting semantics he wants.

And when you limit the specialty of keywords, you have far less troubles with backwards compatibility if you want to change what you consider a "keyword".

SEE ALSO

http://perlcabal.org/syn/S02.html#Literals

[/perl-5-to-6] Permanent link

comments / trackbacks

Mon, 08 Dec 2008

The State of the implementations


Permanent link

NAME

"Perl 5 to 6" Lesson 22 - The State of the implementations

SYNOPSIS

    (none)

DESCRIPTION

Note: This lesson is long outdated, and preserved for historical interest only. The best way to stay informed about various Perl 6 compilers is to follow the blogs at http://planetsix.perl.org/.

Perl 6 is a language specification, and multiple compilers are being written that aim to implement Perl 6, and partially they already do.

Pugs

Pugs is a Perl 6 compiler written in Haskell. It was started by Audrey Tang, and she also did most of the work. In terms of implemented features it might still be the most advanced implementation today (May 2009).

To build and test pugs, you have to install GHC 6.10.1 first, and then run

    svn co http://svn.pugscode.org/pugs
    cd pugs
    perl Makefile.PL
    make
    make test

That will install some Haskell dependencies locally and then build pugs. For make test you might need to install some Perl 5 modules, which you can do with cpan Task::Smoke.

Pugs hasn't been developed during the last three years, except occasional clean-ups of the build system.

Since the specification is evolving and Pugs is not updated, it is slowly drifting into obsoleteness.

Pugs can parse most common constructs, implements object orientation, basic regexes, nearly(?) all control structures, basic user defined operators and macros, many builtins, contexts (except slice context), junctions, basic multi dispatch and the reduction meta operator - based on the syntax of three years past.

Rakudo

Rakudo is a parrot based compiler for Perl 6. The main architect is Patrick Michaud, many features were implemented by Jonathan Worthington.

It is hosted on github, you can find build instructions on http://rakudo.org/how-to-get-rakudo.

Rakudo development is very active, it's the most active Perl 6 compiler today. It passes a bit more than 17,000 tests from the official test suite (July 2009).

It implements most control structures, most syntaxes for number literals, interpolation of scalars and closures, chained operators, BEGIN- and END blocks, pointy blocks, named, optional and slurpy arguments, sophisticated multi dispatch, large parts of the object system, regexes and grammars, Junctions, generic types, parametric roles, typed arrays and hashes, importing and exporting of subroutines and basic meta operators.

If you want to experiment with Perl 6 today, Rakudo is the recommended choice.

Elf

Mitchell Charity started elf, a bootstrapping compiler written in Perl 6, with a grammar written in Ruby. Currently it has a Perl 5 backend, others are in planning.

It lives in the pugs repository, once you've checked it out you can go to misc/elf/ and run ./elf_f $filename. You'll need ruby-1.9 and some perl modules, about which elf will complain bitterly when they are not present.

elf is developed in bursts of activity followed by weeks of low activity, or even none at all.

It parses more than 70% of the test suite, but implements mostly features that are easy to emulate with Perl 5, and passes about 700 tests from the test suite.

KindaPerl6

Flavio Glock started KindaPerl6 (short kp6), a mostly bootstrapped Perl 6 compiler. Since the bootstrapped version is much too slow to be fun to develop with, it is now waiting for a faster backend.

Kp6 implements object orientation, grammars and a few distinct features like lazy gather/take. It also implements BEGIN blocks, which was one of the design goals.

v6.pm

v6 is a source filter for Perl 5. It was written by Flavio Glock, and supports basic Perl 6 plus grammars. It is fairly stable and fast, and is occasionally enhanced. It lives on the CPAN and in the pugs repository in perl5/*/.

SMOP

Smop stands for Simple Meta Object Programming and doesn't plan to implement all of Perl 6, it is designed as a backend (a little bit like parrot, but very different in both design and feature set). Unlike the other implementations it aims explicitly at implementing Perl 6's powerful meta object programming facilities, ie the ability to plug in different object systems.

It is implemented in C and various domain specific languages. It was designed and implemented by Daniel Ruoso, with help from Yuval Kogman (design) and Paweł Murias (implementation, DSLs). A grant from The Perl Foundation supports its development, and it currently approaches the stage where one could begin to emit code for it from another compiler.

It will then be used as a backend for either elf or kp6, and perhaps also for pugs.

STD.pm

Larry Wall wrote a grammar for Perl 6 in Perl 6. He also wrote a cheating script named gimme5, which translates that grammar to Perl 5. It can parse about every written and valid piece of Perl 6 that we know of, including the whole test suite (apart from a few failures now and then when Larry accidentally broke something).

STD.pm lives in the pugs repository, and can be run and tested with perl-5.10.0 installed in /usr/local/bin/perl and a few perl modules (like YAML::XS and Moose):

    cd src/perl6/
    make
    make testt      # warning: takes lot of time, 80 minutes or so
    ./tryfile $your_file

It correctly parses custom operators and warns about non-existent subs, undeclared variables and multiple declarations of the same variable as well as about some Perl 5isms.

MOTIVATION

Many people ask why we need so many different implementations, and if it wouldn't be better to focus on one instead.

There are basically three answers to that.

Firstly that's not how programming by volunteers work. People sometimes either want to start something with the tools they like, or they think that one aspect of Perl 6 is not sufficiently honoured by the design of the existing implementations. Then they start a new project.

The second possible answer is that the projects explore different areas of the vast Perl 6 language: SMOP explores meta object programming (from which Rakudo will also benefit), Rakudo and parrot care a lot about efficient language interoperability, grammars and platform independence, kp6 explored BEGIN blocks, and pugs was the first implementation to explore the syntax, and many parts of the language for the first time.

The third answer is that we don't want a single point of failure. If we had just one implementation, and had severe problems with one of them for unforeseeable reasons (technical, legal, personal, ...) we have possible fallbacks.

SEE ALSO

Pugs: http://www.pugscode.org/, http://pugs.blogs.com/pugs/2008/07/pugshs-is-back.html, http://pugs.blogspot.com, source: http://svn.pugscode.org/pugs.

Rakudo: http://rakudo.org/, http://www.parrot.org/,

Elf: http://perl.net.au/wiki/Elf source: see pugs, misc/elf/.

KindaPerl6: source: see pugs, v6/v6-KindaPerl6.

v6.pm: source: see pugs, perl5/.

STD.pm: source: see pugs, src/perl6/.

[/perl-5-to-6] Permanent link

comments / trackbacks

Sun, 07 Dec 2008

Subset Types


Permanent link

NAME

"Perl 5 to 6" Lesson 21 - Subset Types

SYNOPSIS

    subset Squares of Int where { .sqrt.Int**2 == $_ };

    multi sub square_root(Squares $x --> Int) {
        return $x.sqrt.Int;
    }
    multi sub square_root(Num $x --> Num) {
        return $x.sqrt;
    }

DESCRIPTION

Java programmers tend to think of a type as either a class or an interface (which is something like a crippled class), but that view is too limited for Perl 6. A type is more generally a constraint of what a values a container can constraint. The "classical" constraint is it is an object of a class X or of a class that inherits from X. Perl 6 also has constraints like the class or the object does role Y, or this piece of code returns true for our object. The latter is the most general one, and is called a subset type:

    subset Even of Int where { $_ % 2 == 0 }
    # Even can now be used like every other type name

    my Even $x = 2;
    my Even $y = 3; # type mismatch error

(Try it out, Rakudo implements subset types).

You can also use anonymous subtypes in signatures:

    sub foo (Int where { ... } $x) { ... }
    # or with the variable at the front:
    sub foo ($x of Int where { ... } ) { ... }

MOTIVATION

Allowing arbitrary type constraints in the form of code allows ultimate extensibility: if you don't like the current type system, you can just roll your own based on subset types.

It also makes libraries easier to extend: instead of dying on data that can't be handled, the subs and methods can simply declare their types in a way that "bad" data is rejected by the multi dispatcher. If somebody wants to handle data that the previous implementation rejected as "bad", he can simple add a multi sub with the same name that accepts the data. For example a math library that handles real numbers could be enhanced this way to also handle complex numbers.

[/perl-5-to-6] Permanent link

comments / trackbacks

Sat, 06 Dec 2008

A grammar for (pseudo) XML


Permanent link

NAME

"Perl 5 to 6" Lesson 20 - A grammar for (pseudo) XML

SYNOPSIS

    grammar XML {
        token TOP   { ^ <xml> $ };
        token xml   { <text> [ <tag> <text> ]* };
        token text {  <-[<>&]>* };
        rule tag   {
            '<'(\w+) <attributes>*
            [
                | '/>'                 # a single tag
                | '>'<xml>'</' $0 '>'  # an opening and a closing tag
            ]
        };
        token attributes { \w+ '="' <-["<>]>* '"' };
    };

DESCRIPTION

So far the focus of these articles has been the Perl 6 language, independently of what has been implemented so far. To show you that it's not a purely fantasy language, and to demonstrate the power of grammars, this lesson shows the development of a grammar that parses basic XML, and that runs with Rakudo.

Please follow the instructions on http://rakudo.org/how-to-get-rakudo/ to obtain and build Rakudo, and try it out yourself.

Our idea of XML

For our purposes XML is quite simple: it consists of plain text and nested tags that can optionally have attributes. So here are few tests for what we want to parse as valid "XML", and what not:

    my @tests = (
        [1, 'abc'                       ],      # 1
        [1, '<a></a>'                   ],      # 2
        [1, '..<ab>foo</ab>dd'          ],      # 3
        [1, '<a><b>c</b></a>'           ],      # 4
        [1, '<a href="foo"><b>c</b></a>'],      # 5
        [1, '<a empty="" ><b>c</b></a>' ],      # 6
        [1, '<a><b>c</b><c></c></a>'    ],      # 7
        [0, '<'                         ],      # 8
        [0, '<a>b</b>'                  ],      # 9
        [0, '<a>b</a'                   ],      # 10
        [0, '<a>b</a href="">'          ],      # 11
        [1, '<a/>'                      ],      # 12
        [1, '<a />'                     ],      # 13
    );

    my $count = 1;
    for @tests -> $t {
        my $s = $t[1];
        my $M = XML.parse($s);
        if !($M  xor $t[0]) {
            say "ok $count - '$s'";
        } else {
            say "not ok $count - '$s'";
        }
        $count++;
    }

This is a list of both "good" and "bad" XML, and a small test script that runs these tests by calling XML.parse($string). By convention the rule that matches what the grammar should match is named TOP.

(As you can see from test 1 we don't require a single root tag, but it would be trivial to add this restriction).

Developing the grammar

The essence of XML is surely the nesting of tags, so we'll focus on the second test first. Place this at the top of the test script:

    grammar XML {
        token TOP   { ^ <tag> $ }
        token tag   {
            '<' (\w+) '>'
            '</' $0   '>'
        }
    };

Now run the script:

    $ ./perl6 xml-01.pl
    not ok 1 - 'abc'
    ok 2 - '<a></a>'
    not ok 3 - '..<ab>foo</ab>dd'
    not ok 4 - '<a><b>c</b></a>'
    not ok 5 - '<a href="foo"><b>c</b></a>'
    not ok 6 - '<a empty="" ><b>c</b></a>'
    not ok 7 - '<a><b>c</b><c></c></a>'
    ok 8 - '<'
    ok 9 - '<a>b</b>'
    ok 10 - '<a>b</a'
    ok 11 - '<a>b</a href="">'
    not ok 12 - '<a/>'
    not ok 13 - '<a />'

So this simple rule parses one pair of start tag and end tag, and correctly rejects all four examples of invalid XML.

The first test should be easy to pass as well, so let's try this:

   grammar XML {
       token TOP   { ^ <xml> $ };
       token xml   { <text> | <tag> };
       token text  { <-[<>&]>*  };
       token tag   {
           '<' (\w+) '>'
           '</' $0   '>'
       }
    };

(Remember, <-[...]> is a negated character class.)

And run it:

    $ ./perl6 xml-03.pl
    ok 1 - 'abc'
    not ok 2 - '<a></a>'
    (rest unchanged)

Why in the seven hells did the second test stop working? The answer is that Rakudo doesn't do longest token matching yet (update 2013-01: it does now), but matches sequentially. <text> matches the empty string (and thus always), so <text> | <tag> never even tries to match <tag>. Reversing the order of the two alternations would help.

But we don't just want to match either plain text or a tag anyway, but random combinations of both of them:

    token xml   { <text> [ <tag> <text> ]*  };

([...] are non-capturing groups, like (?: ... ) is in Perl 5).

And low and behold, the first two tests both pass.

The third test, ..<ab>foo</ab>dd, has text between opening and closing tag, so we have to allow that next. But not only text is allowed between tags, but arbitrary XML, so let's just call <xml> there:

    token tag   {
        '<' (\w+) '>'
        <xml>
        '</' $0   '>'
    }

    ./perl6 xml-05.pl
    ok 1 - 'abc'
    ok 2 - '<a></a>'
    ok 3 - '..<ab>foo</ab>dd'
    ok 4 - '<a><b>c</b></a>'
    not ok 5 - '<a href="foo"><b>c</b></a>'
    (rest unchanged)

We can now focus on attributes (the href="foo" stuff):

    token tag   {
        '<' (\w+) <attribute>* '>'
        <xml>
        '</' $0   '>'
    };
    token attribute {
        \w+ '="' <-["<>]>* \"
    };

But this doesn't make any new tests pass. The reason is the blank between the tag name and the attribute. Instead of adding \s+ or \s* in many places we'll switch from token to rule, which implies the :sigspace modifier:

    rule tag   {
        '<'(\w+) <attribute>* '>'
        <xml>
        '</'$0'>'
    };
    token attribute {
        \w+ '="' <-["<>]>* \"
    };

Now all tests pass, except the last two:

    ok 1 - 'abc'
    ok 2 - '<a></a>'
    ok 3 - '..<ab>foo</ab>dd'
    ok 4 - '<a><b>c</b></a>'
    ok 5 - '<a href="foo"><b>c</b></a>'
    ok 6 - '<a empty="" ><b>c</b></a>'
    ok 7 - '<a><b>c</b><c></c></a>'
    ok 8 - '<'
    ok 9 - '<a>b</b>'
    ok 10 - '<a>b</a'
    ok 11 - '<a>b</a href="">'
    not ok 12 - '<a/>'
    not ok 13 - '<a />'

These contain un-nested tags that are closed with a single slash /. No problem to add that to rule tag:

    rule tag   {
        '<'(\w+) <attribute>* [
            | '/>'
            | '>' <xml> '</'$0'>'
        ]
    };

All tests pass, we're happy, our first grammar works well.

More hacking

Playing with grammars is much more fun that reading about playing, so here's what you could implement:

  • plain text can contain entities like &amp;
  • I don't know if XML tag names are allowed to begin with a number, but the current grammar allows that. You might look it up in the XML specification, and adapt the grammar if needed.
  • plain text can contain <![CDATA[ ... ]]> blocks, in which xml-like tags are ignored and < and the like don't need to be escaped
  • Real XML allows a preamble like <?xml version="0.9" encoding="utf-8"?> and requires one root tag which contains the rest (You'd have to change some of the existing test cases)
  • You could try to implement a pretty-printer for XML by recursively walking through the match object $/. (This is non-trivial; you might have to work around a few Rakudo bugs, and maybe also introduce some new captures).

(Please don't post solutions to this as comments in this blog; let others have the same fun as you had ;-).

Have fun hacking.

MOTIVATION

It's powerful and fun

SEE ALSO

Regexes are specified in great detail in S05: http://perlcabal.org/syn/S05.html.

More working examples for grammars can be found at https://github.com/moritz/json/ (check file lib/JSON/Tiny/Grammar.pm).

[/perl-5-to-6] Permanent link

comments / trackbacks

Sun, 30 Nov 2008

Regexes strike back


Permanent link

NAME

"Perl 5 to 6" Lesson 19 - Regexes strike back

SYNOPSIS

    # normal matching:
    if 'abc' ~~ m/../ {
        say $/;                 # ab
    }

    # match with implicit :sigspace modifier
    if 'ab cd ef'  ~~ mm/ (..) ** 2 / {
        say $1;                 # cd
    }

    # substitute with the :sigspace modifier
    my $x = "abc     defg";
    $x ~~ ss/c d/x y/;
    say $x;                     # abx     yefg

DESCRIPTION

Since the basics of regexes are already covered in lesson 07, here are some useful (but not very structured) additional facts about Regexes.

Matching

You don't need to write grammars to match regexes, the traditional form m/.../ still works, and has a new brother, the mm/.../ form, which implies the :sigspace modifier. Remember, that means that whitespaces in the regex are substituted by the <.ws> rule.

The default for the rule is to match \s+ if it is surrounded by two word-characters (ie those matching those \w), and \s* otherwise.

In substitutions the :samespace modifier takes care that whitespaces matched with the ws rule are preserved. Likewise the :samecase modifier, short :ii (since it's a variant of :i) preserve case.

    my $x = 'Abcd';
    $x ~~ s:ii/^../foo/;
    say $x;                     # Foocd
    $x = 'ABC'
    $x ~~ s:ii/^../foo/;
    say $x                      # FOO

This is very useful if you want to globally rename your module Foo, to Bar, but for example in environment variables it is written as all uppercase. With the :ii modifier the case is automatically preserved.

It copies case information on a character by character. But there's also a more intelligent version; when combined with the :sigspace (short :s) modifier, it tries to find a pattern in the case information of the source string. Recognized are .lc, .uc, .lc.ucfirst, .uc.lcfirst and .lc.capitaliz (Str.capitalize uppercases the first character of each word). If such a pattern is found, it is also applied to the substitution string.

    my $x = 'The Quick Brown Fox';
    $x ~~ s :s :ii /brown.*/perl 6 developer/;
    # $x is now 'The Quick Perl 6 Developer'

Alternations

Alternations are still formed with the single bar |, but it means something else than in Perl 5. Instead of sequentially matching the alternatives and taking the first match, it now matches all alternatives in parallel, and takes the longest one.

    'aaaa' ~~ m/ a | aaa | aa /;
    say $/                          # aaa

While this might seem like a trivial change, it has far reaching consequences, and is crucial for extensible grammars. Since Perl 6 is parsed using a Perl 6 grammar, it is responsible for the fact that in ++$a the ++ is parsed as a single token, not as two prefix:<+> tokens.

The old, sequential style is still available with ||:

    grammar Math::Expression {
        token value {
            | <number>
            | '(' 
              <expression> 
              [ ')' || { fail("Parenthesis not closed") } ]
        }

        ...
    }

The { ... } execute a closure, and calling fail in that closure makes the expression fail. That branch is guaranteed to be executed only if the previous (here the ')') fails, so it can be used to emit useful error messages while parsing.

There are other ways to write alternations, for example if you "interpolate" an array, it will match as an alternation of its values:

    $_ = '12 oranges';
    my @fruits = <apple orange banana kiwi>;
    if m:i:s/ (\d+) (@fruits)s? / {
        say "You've got $0 $1s, I've got { $0 + 2 } of them. You lost.";
    }

There is yet another construct that automatically matches the longest alternation: multi regexes. They can be either written as multi token name or with a proto:

    grammar Perl {
        ...
        proto token sigil { * }
        token sigil:sym<$> { <sym> }
        token sigil:sym<@> { <sym> }
        token sigil:sym<%> { <sym> }
        ...

       token variable { <sigil> <twigil>? <identifier> }
   }

This example shows multiple tokens called sigil, which are parameterized by sym. When the short name, ie sigil is used, all of these tokens are matched in an alternation. You may think that this is a very inconvenient way to write an alternation, but it has a huge advantage over writing '$'|'@'|'%': it is easily extensible:

    grammar AddASigil is Perl {
        token sigil:sym<!> { <sym> }
    }
    # wow, we have a Perl 6 grammar with an additional sigil!

Likewise you can override existing alternatives:

    grammar WeirdSigil is Perl {
        token sigil:sym<$> { '°' }
    }

In this grammar the sigil for scalar variables is °, so whenever the grammar looks for a sigil it searches for a ° instead of a $, but the compiler will still know that it was the regex sigil:sym<$> that matched it.

In the next lesson you'll see the development of a real, working grammar with Rakudo.

[/perl-5-to-6] Permanent link

comments / trackbacks

Sat, 29 Nov 2008

Scoping


Permanent link

NAME

"Perl 5 to 6" Lesson 18 - Scoping

SYNOPSIS

    for 1 .. 10 -> $a {
        # $a visible here
    }
    # $a not visible here

    while my $b = get_stuff() {
        # $b visible here
    }
    # $b still visible here

    my $c = 5;
    {
        my $c = $c;
        # $c is undef here
    }
    # $c is 5 here

    my $y;
    my $x = $y + 2 while $y = calc();
    # $x still visible

DESCRIPTION

Lexical Scoping

Scoping in Perl 6 is quite similar to that of Perl 5. A Block introduces a new lexical scope. A variable name is searched in the innermost lexical scope first, if it's not found it is then searched for in the next outer scope and so on. Just like in Perl 5 a my variable is a proper lexical variable, and an our declaration introduces a lexical alias for a package variable.

But there are subtle differences: variables are exactly visible in the rest of the block where they are declared, variables declared in block headers (for example in the condition of a while loop) are not limited to the block afterwards.

Also Perl 6 only ever looks up unqualified names (variables and subroutines) in lexical scopes.

If you want to limit the scope, you can use formal parameters to the block:

    if calc() -> $result {
        # you can use $result here
    }
    # $result not visible here

Variables are visible immediately after they are declared, not at the end of the statement as in Perl 5.

    my $x = .... ;
            ^^^^^
            $x visible here in Perl 6
            but not in Perl 5

Dynamic scoping

The local adjective is now called temp, and if it's not followed by an initialization the previous value of that variable is used (not undef).

There's also a new kind of dynamically scoped variable called a hypothetical variable. If the block is left with an exception, then the previous value of the variable is restored. If not, it is kept.

Context variables

Some variables that are global in Perl 5 ($!, $_) are context variables in Perl 6, that is they are passed between dynamic scopes.

This solves an old Problem in Perl 5. In Perl 5 an DESTROY sub can be called at a block exit, and accidentally change the value of a global variable, for example one of the error variables:

   # Broken Perl 5 code here:
   sub DESTROY { eval { 1 }; }
   
   eval {
       my $x = bless {};
       die "Death\n";
   };
   print $@ if $@;         # No output here

In Perl 6 this problem is avoided by not implicitly using global variables.

(In Perl 5.14 there is a workaround that protects $@ from being modified, thus averting the most harm from this particular example.)

Pseudo-packages

If a variable is hidden by another lexical variable of the same name, it can be accessed with the OUTER pseudo package

    my $x = 3;
    {
        my $x = 10;
        say $x;             # 10
        say $OUTER::x;      # 3
        say OUTER::<$x>     # 3
    }

Likewise a function can access variables from its caller with the CALLER and CONTEXT pseudo packages. The difference is that CALLER only accesses the scope of the immediate caller, CONTEXT works like UNIX environment variables (and should only be used internally by the compiler for handling $_, $! and the like). To access variables from the outer dynamic scope they must be declared with is context.

MOTIVATION

It is now common knowledge that global variables are really bad, and cause lots of problems. We also have the resources to implement better scoping mechanism. Therefore global variables are only used for inherently global data (like %*ENV or $*PID).

The block scoping rules haven been greatly simplified.

Here's a quote from Perl 5's perlsyn document; we don't want similar things in Perl 6:

 NOTE: The behaviour of a "my" statement modified with a statement
 modifier conditional or loop construct (e.g. "my $x if ...") is
 undefined.  The value of the "my" variable may be "undef", any
 previously assigned value, or possibly anything else.  Don't rely on
 it.  Future versions of perl might do something different from the
 version of perl you try it out on.  Here be dragons.

SEE ALSO

S04 discusses block scoping: http://perlcabal.org/syn/S04.html.

S02 lists all pseudo packages and explains context scoping: http://perlcabal.org/syn/S02.html#Names.

[/perl-5-to-6] Permanent link

comments / trackbacks

Fri, 28 Nov 2008

Unicode


Permanent link

NAME

"Perl 5 to 6" Lesson 17 - Unicode

SYNOPSIS

  (none)    

DESCRIPTION

Perl 5's Unicode model suffers from a big weakness: it uses the same type for binary and for text data. For example if your program reads 512 bytes from a network socket, it is certainly a byte string. However when (still in Perl 5) you call uc on that string, it will be treated as text. The recommended way is to decode that string first, but when a subroutine receives a string as an argument, it can never surely know if it had been encoded or not, ie if it is to be treated as a blob or as a text.

Perl 6 on the other hand offers the type buf, which is just a collection of bytes, and Str, which is a collection of logical characters.

Logical character is still a vague term. To be more precise a Str is an object that can be viewed at different levels: Byte, Codepoint (anything that the Unicode Consortium assigned a number to is a codepoint), Grapheme (things that visually appear as a character) and CharLingua (language defined characters).

For example the string with the hex bytes 61 cc 80 consists of three bytes (obviously), but can also be viewed as being consisting of two codepoints with the names LATIN SMALL LETTER A (U+0041) and COMBINING GRAVE ACCENT (U+0300), or as one grapheme that, if neither my blog software nor your browser kill it, looks like this: à.

So you can't simply ask for the length of a string, you have to ask for a specific length:

  $str.bytes;
  $str.codes;
  $str.graphs;

There's also method named chars, which returns the length in the current Unicode level (which can be set by a pragma like use bytes, and which defaults to graphemes).

In Perl 5 you sometimes had the problem of accidentally concatenating byte strings and text strings. If you should ever suffer from that problem in Perl 6, you can easily identify where it happens by overloading the concatenation operator:

  sub GLOBAL::infix:<~> is deep (Str $a, buf $b)|(buf $b, Str $a) {
      die "Can't concatenate text string «"
          ~ $a.encode("UTF-8")
            "» with byte string «$b»\n";
  }

Encoding and Decoding

The specification of the IO system is very basic and does not yet define any encoding and decoding layers, which is why this article has no useful SYNOPSIS section. I'm sure that there will be such a mechanism, and I could imagine it will look something like this:

  my $handle = open($filename, :r, :encoding<UTF-8>);

Regexes and Unicode

Regexes can take modifiers that specify their Unicode level, so m:codes/./ will match exactly one codepoint. In the absence of such modifiers the current Unicode level will be used.

Character classes like \w (match a word character) behave accordingly to the Unicode standard. There are modifiers that ignore case (:i) and accents (:a), and modifiers for the substitution operators that can carry case information to the substitution string (:samecase and :sameaccent, short :ii, :aa).

MOTIVATION

It is quite hard to correctly process strings with most tools and most programming languages these days. Suppose you have a web application in perl 5, and you want to break long words automatically so that they don't mess up your layout. When you use naive substr to do that, you might accidentally rip graphemes apart.

Perl 6 will be the first mainstream programming language with built in support for grapheme level string manipulation, which basically removes most Unicode worries, and which (in conjunction with regexes) makes Perl 6 one of the most powerful languages for string processing.

The separate data types for text and byte strings make debugging and introspection quite easy.

SEE ALSO

http://perlcabal.org/syn/S32/Str.html

[/perl-5-to-6] Permanent link

comments / trackbacks

Thu, 27 Nov 2008

Enums


Permanent link

NAME

"Perl 5 to 6" Lesson 16 - Enums

SYNOPSIS

  enum bit Bool <False True>;
  my $value = $arbitrary_value but True;
  if $value {
      say "Yes, it's true";       # will be printed
  }

  enum Day ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun');
  if custom_get_date().Day == Day::Sat | Day::Sun {
      say "Weekend";
  }

DESCRIPTION

Enums are versatile beasts. They are low-level classes that consist of an enumeration of constants, typically integers or strings (but can be arbitrary).

These constants can act as subtypes, methods or normal values. They can be attached to an object with the but operator, which "mixes" the enum into the value:

  my $x = $today but Day::Tue;

You can also use the type name of the Enum as a function, and supply the value as an argument:

  $x = $today but Day($weekday);

Afterwards that object has a method with the name of the enum type, here Day:

  say $x.Day;             # 1

The value of first constant is 0, the next 1 and so on, unless you explicitly provide another value with pair notation:

  Enum Hackers (:Larry<Perl>, :Guido<Python>, :Paul<Lisp>);

You can check if a specific value was mixed in by using the versatile smart match operator, or with .does:

  if $today ~~ Day::Fri {
      say "Thank Christ it's Friday"
  }
  if $today.does(Fri) { ... }

Note that you can specify the name of the value only (like Fri) if that's unambiguous, if it's ambiguous you have to provide the full name Day::Fri.

MOTIVATION

Enums replace both the "magic" that is involved with tainted variables in Perl 5 and the return "0 but True" hack (a special case for which no warning is emitted if used as a number). Plus they give a Bool type.

Enums also provide the power and flexibility of attaching arbitrary meta data for debugging or tracing.

SEE ALSO

http://perlcabal.org/syn/S12.html#Enumerations

[/perl-5-to-6] Permanent link

comments / trackbacks

Tue, 21 Oct 2008

Twigils


Permanent link

NAME

"Perl 5 to 6" Lesson 15 - Twigils

SYNOPSIS

  class Foo {
      has $.bar;
      has $!baz;
  }

  my @stuff = sort { $^b[1] <=> $^a[1]}, [1, 2], [0, 3], [4, 8];
  my $block = { say "This is the named 'foo' parameter: $:foo" };
  $block(:foo<bar>);

  say "This is file $?FILE on line $?LINE"

  say "A CGI script" if %*ENV.exists('DOCUMENT_ROOT');

DESCRIPTION

Some variables have a second sigil, called twigil. It basically means that the variable isn't "normal", but differs in some way, for example it could be differently scoped.

You've already seen that public and private object attributes have the . and ! twigil respectively; they are not normal variables, they are tied to self.

The ^ twigil removes a special case from perl 5. To be able to write

  # beware: perl 5 code
  sort { $a <=> $b } @array

the variables $a and $b are special cased by the strict pragma. In Perl 6, there's a concept named self-declared positional parameter, and these parameters have the ^ twigil. It means that they are positional parameters of the current block, without being listed in a signature. The variables are filled in lexicographic (alphabetic) order:

  my $block = { say "$^c $^a $^b" };
  $block(1, 2, 3);                # 3 1 2

So now you can write

  @list = sort { $^b <=> $^a }, @list;
  # or:
  @list = sort { $^foo <=> $^bar }, @list;

Without any special cases.

And to keep the symmetry between positional and named arguments, the : twigil does the same for named parameters, so these lines are roughly equivalent:

  my $block = { say $:stuff }
  my $sub   = sub (:$stuff) { say $stuff }

The ? twigil stands for variables and constants that are known at compile time, like $?LINE for the current line number (formerly __LINE__), and $?DATA is the file handle to the DATA section.

Contextual variables can be accessed with the * twigil, so $*IN and $*OUT can be overridden dynamically.

A pseudo twigil is <, which is used in a construct like $<capture>, where it is a shorthand for $/<capture>, which accesses the Match object after a regex match.

MOTIVATION

When you read Perl 5's perlvar document, you can see that it has far too many variables, most of them global, that affect your program in various ways.

The twigils try to bring some order in these special variables, and at the other hand they remove the need for special cases. In the case of object attributes they shorten self.var to $.var (or @.var or whatever).

So all in all the increased "punctuation noise" actually makes the programs much more consistent and readable.

[/perl-5-to-6] Permanent link

comments / trackbacks

Mon, 20 Oct 2008

The MAIN sub


Permanent link

NAME

"Perl 5 to 6" Lesson 14 - The MAIN sub

SYNOPSIS

  # file doit.pl

  #!/usr/bin/perl6
  sub MAIN($path, :$force, :$recursive, :$home = '~/') {
      # do stuff here
  }

  # command line
  $ ./doit.pl --force --home=/home/someoneelse file_to_process

DESCRIPTION

Calling subs and running a typical Unix program from the command line is visually very similar: you can have positional, optional and named arguments.

You can benefit from it, because Perl 6 can process the command line for you, and turn it into a sub call. Your script is normally executed (at which time it can munge the command line arguments stored in @*ARGS), and then the sub MAIN is called, if it exists.

If the sub can't be called because the command line arguments don't match the formal parameters of the MAIN sub, an automatically generated usage message is printed.

Command line options map to subroutine arguments like this:

  -name                   :name
  -name=value             :name<value>

  # remember, <...> is like qw(...)
  --hackers=Larry,Damian  :hackers<Larry Damian>  

  --good_language         :good_language
  --good_lang=Perl        :good_lang<Perl>
  --bad_lang PHP          :bad_lang<PHP>

  +stuff                  :!stuff
  +stuff=healty           :stuff<healthy> but False

The $x = $obj but False means that $x is a copy of $obj, but gives Bool::False in boolean context.

So for simple (and some not quite simple) cases you don't need an external command line processor, but you can just use sub MAIN for that.

MOTIVATION

The motivation behind this should be quite obvious: it makes simple things easier, similar things similar, and in many cases reduces command line processing to a single line of code: the signature of MAIN.

SEE ALSO

http://perlcabal.org/syn/S06.html#Declaring_a_MAIN_subroutine contains the specification.

[/perl-5-to-6] Permanent link

comments / trackbacks