Sat, 18 Oct 2008
Laziness
NAME
"Perl 5 to 6" Lesson 12 - Laziness
LAST UPDATED
2015-02-26
SYNOPSIS
my @integers = 0..*;
for @integers -> $i {
    say $i;
    last if $i % 17 == 0;
}

my @even := map { 2 * $_ }, 0..*;

my @stuff := gather {
    for 0 .. Inf {
        take 2 ** $_;
    }
}
DESCRIPTION
Perl programmers tend to be lazy. And so are their lists.
In this case, lazy means that the evaluation is delayed as much as possible. When you write something like @a := map BLOCK, @b, the block isn't executed at all. Only when you start to access items from @a does the map actually execute the block and fill @a as far as needed.
Note the use of binding instead of assignment: assigning to an array might force eager evaluation (unless the compiler knows the list is going to be infinite; the exact details of figuring this out are still subject to change), whereas binding never does.
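For example (a minimal sketch; the exact eager/lazy heuristics may differ between compiler versions), binding keeps even a map over an infinite range lazy:

my @doubled := map { 2 * $_ }, 0..*;   # binding: stays lazy, safe for an infinite source
say @doubled[4];                       # 8 - only the first five elements are ever computed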
Laziness allows you to deal with infinite lists: as long as you don't try to process all of their elements, they take up only as much space as the items that have already been evaluated need.
There are pitfalls, though: determining the length of a list or sorting it kills laziness - if the list is infinite, it will likely loop infinitely, or fail early if the infiniteness can be detected.
In general, conversions to a scalar (like List.join) are eager, i.e. non-lazy.
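To illustrate the pitfall (a sketch; @squares is just an illustrative name): slicing a finite prefix of an infinite list is safe, but scalar-producing operations on the whole list are not:

my @squares := map { $_ ** 2 }, 0..*;
say @squares[^5].join(', ');     # 0, 1, 4, 9, 16 - joining a finite slice is fine
# say @squares.elems;            # infinite: loops forever or fails, as described above
# say @squares.join(', ');       # same problem: join is eager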
Laziness prevents unnecessary computations, and can therefore boost performance while keeping code simple. Keep in mind that there is some overhead to switching between the producing and consuming code paths.
When you read a file line by line in Perl 5, you don't use for (<HANDLE>), because it reads the whole file into memory and only then starts iterating. With laziness that's not an issue:
my $file = open '/etc/passwd';
for $file.lines -> $line {
    say $line;
}
Since $file.lines is a lazy list, the lines are only physically read from disk as needed (besides buffering, of course).
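As a sketch of the payoff (the cutoff of three lines is arbitrary), you can stop iterating early and the rest of the file is never read:

my $file = open '/etc/passwd';
my $count = 0;
for $file.lines -> $line {
    say $line;
    last if ++$count == 3;   # stop early; the remaining lines are never fetched from disk
}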
gather/take
A very useful construct for creating lazy lists is gather { take }. It is used like this:
my @list := gather {
    while True {
        # some computations;
        take $result;
    }
}
gather BLOCK returns a lazy list. When items from @list are needed, the BLOCK is run until a take is executed. take is just like return, and all taken items are used to construct @list. When more items from @list are needed, execution of the block is resumed right after the take.
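A further sketch (not from the original lesson): because the block's execution is suspended and resumed, it can carry state between takes, which makes infinite sequences like the Fibonacci numbers easy to express:

my @fib := gather {
    my ($a, $b) = 0, 1;
    loop {
        take $a;
        ($a, $b) = $b, $a + $b;
    }
}
say @fib[^10];   # (0 1 1 2 3 5 8 13 21 34)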
gather/take is dynamically scoped, so it is possible to call take outside of the lexical scope of the gather block:
my @list = gather {
    for 1..10 {
        do_some_computation($_);
    }
}

sub do_some_computation($x) {
    take $x * ($x + 1);
}
Note that gather can act on a single statement instead of a block, too:
my @list = gather for 1..10 {
    do_some_computation($_);
}
Controlling Laziness
Laziness has its problems (when you try to learn Haskell, you'll notice how weird its IO system is, because Haskell is both lazy and free of side effects), and sometimes you don't want things to be lazy. In that case you can just prefix the expression with eager.
my @list = eager map { $block_with_side_effects }, @list;
On the other hand, only lists are lazy by default.
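As a sketch of why eager matters (log-and-double is a hypothetical helper; the point is when its output appears), side effects inside a lazy map only happen once elements are demanded, while eager runs them right away:

sub log-and-double($x) { say "processing $x"; 2 * $x }

my @lazy := map &log-and-double, 1..3;          # lazy: nothing printed yet
my @eager = eager map &log-and-double, 1..3;    # prints "processing 1" .. "processing 3" now

say @lazy[0];   # runs the block for the first element only now ("processing 1"), then prints 2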
MOTIVATION
In computer science, most problems can be described as a tree of possible combinations in which a solution is searched for. The key to efficient algorithms is not only to find an efficient way to search, but also to construct only the interesting parts of the tree.
With lazy lists you can recursively define this tree and search in it, and only those parts of the tree that you actually use are constructed.
In general, laziness makes programming easier because you don't have to know whether the result of a computation will be used at all - you just make it lazy, and if it's not used, the computation is never executed. If it is used, you lose nothing.