Sun, 08 Jan 2017

Perl 6 By Example: Testing Silent Cron

This blog post is part of my ongoing project to write a book about Perl 6.

If you're interested, either in this book project or any other Perl 6 book news, please sign up for the mailing list at the bottom of the article, or here. It will be low volume (less than an email per month, on average).

The previous blog post left us with a bare-bones silent-cron implementation, but without tests. I probably sound like a broken record for bringing this up time and again, but I really want some tests when I start refactoring or extending my programs. And this time, getting the tests in is a bit harder, so I think it's worth discussing how to do it.

Refactoring

As a short reminder, this is what the program looks like:

#!/usr/bin/env perl6

sub MAIN(*@cmd, :$timeout) {
    my $proc = Proc::Async.new(|@cmd);
    my $collector = Channel.new;
    for $proc.stdout, $proc.stderr -> $supply {
        $supply.tap: { $collector.send($_) }
    }
    my $promise = $proc.start;
    my $waitfor = $promise;
    $waitfor = Promise.anyof(Promise.in($timeout), $promise)
        if $timeout;
    $ = await $waitfor;

    $collector.close;
    my $output = $collector.list.join;

    if !$timeout || $promise.status ~~ Kept {
        my $exitcode = $promise.result.exitcode;
        if $exitcode != 0 {
            say "Program @cmd[] exited with code $exitcode";
            print "Output:\n", $output if $output;
        }
        exit $exitcode;
    }
    else {
        $proc.kill;
        say "Program @cmd[] did not finish after $timeout seconds";
        sleep 1 if $promise.status ~~ Planned;
        $proc.kill(9);
        $ = await $promise;
        exit 2;
    }
}

There's logic in there for executing external programs with a timeout, and then there's logic for dealing with the two possible outcomes. For both testability and future extensions, it makes sense to factor out the execution of external programs into a subroutine. The result of this code is not a single value: we're potentially interested in the output it produced, the exit code, and whether it ran into a timeout. We could write a subroutine that returns a list or a hash of these values, but here I chose to write a small class instead:

class ExecutionResult {
    has Int $.exitcode = -1;
    has Str $.output is required;
    has Bool $.timed-out = False;
    method is-success {
        !$.timed-out && $.exitcode == 0;
    }
}

We've seen classes before, but this one has a few new features. Attributes declared with the . twigil automatically get an accessor method, so

has Int $.exitcode;

is roughly the same as

has Int $!exitcode;
method exitcode() { $!exitcode }

So it allows a user of the class to access the value in the attribute from the outside. As a bonus, you can also initialize it from the standard constructor as a named argument, ExecutionResult.new( exitcode => 42 ). The exit code is not a required attribute, because we can't know the exit code of a program that has timed out. So with has Int $.exitcode = -1 we give it a default value that applies if the attribute hasn't been initialized.

The output is a required attribute, so we mark it as such with is required. That's a trait. Traits are pieces of code that modify the behavior of other things, here of an attribute. They crop up in several places, for example in subroutine signatures (is copy on a parameter), variable declarations and classes. If you try to call ExecutionResult.new() without specifying an output, you get an error like this:

The attribute '$!output' is required, but you did not provide a value for it.
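
To see these pieces working together, here is a small sketch that repeats the class from above and exercises the constructor, the generated accessors and the required attribute. The variable names and values are only for illustration.

```perl6
class ExecutionResult {
    has Int $.exitcode = -1;
    has Str $.output is required;
    has Bool $.timed-out = False;
    method is-success {
        !$.timed-out && $.exitcode == 0;
    }
}

# Named arguments to the default constructor fill the public attributes
my $ok = ExecutionResult.new(output => 'all good', exitcode => 0);
say $ok.exitcode;           # 0, read through the generated accessor
say $ok.is-success;         # True

# exitcode keeps its default of -1 when it isn't passed in
my $timed-out = ExecutionResult.new(output => '', :timed-out);
say $timed-out.exitcode;    # -1
say $timed-out.is-success;  # False

# Leaving out the required attribute throws; try stores the exception in $!
try ExecutionResult.new(exitcode => 1);
say $!.message if $!;
```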

Mocking and Testing

Now that we have a convenient way to return more than one value from a hypothetical subroutine, let's look at what this subroutine might look like:

sub run-with-timeout(@cmd, :$timeout) {
    my $proc = Proc::Async.new(|@cmd);
    my $collector = Channel.new;
    for $proc.stdout, $proc.stderr -> $supply {
        $supply.tap: { $collector.send($_) }
    }
    my $promise = $proc.start;
    my $waitfor = $promise;
    $waitfor = Promise.anyof(Promise.in($timeout), $promise)
        if $timeout;
    $ = await $waitfor;

    $collector.close;
    my $output = $collector.list.join;

    if !$timeout || $promise.status ~~ Kept {
        return ExecutionResult.new(
            :$output,
            :exitcode($promise.result.exitcode),
        );
    }
    else {
        $proc.kill;
        sleep 1 if $promise.status ~~ Planned;
        $proc.kill(9);
        $ = await $promise;
        return ExecutionResult.new(
            :$output,
            :timed-out,
        );
    }
}

The usage of Proc::Async has remained the same, but instead of printing a message when an error occurs, the routine now returns an ExecutionResult object.
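
The timeout logic itself can be tried out without any external program. The following is a minimal sketch, assuming a timer Promise as a stand-in for the process; the names $work and the durations are made up for the example.

```perl6
# Stand-in for the external program: a Promise kept after two seconds
my $work = Promise.in(2).then({ 'finished' });

# Promise.anyof is kept as soon as either of its arguments is kept or broken
my $timeout = 0.1;
my $waitfor = Promise.anyof(Promise.in($timeout), $work);
await $waitfor;

# The timer won the race, so the stand-in is still pending at this point
say $work.status ~~ Kept ?? "finished in time" !! "timed out";
```

This is the same check that run-with-timeout uses to decide which kind of ExecutionResult to return.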

This simplifies the MAIN sub quite a bit:

multi sub MAIN(*@cmd, :$timeout) {
    my $result = run-with-timeout(@cmd, :$timeout);
    unless $result.is-success {
        say "Program @cmd[] ",
            $result.timed-out ?? "ran into a timeout"
                              !! "exited with code $result.exitcode()";

        print "Output:\n", $result.output if $result.output;
    }
    exit $result.exitcode // 2;
}

A new syntactic feature here is the ternary operator, CONDITION ?? TRUE-BRANCH !! FALSE-BRANCH, which you might know from other programming languages such as C or Perl 5 as CONDITION ? TRUE-BRANCH : FALSE-BRANCH.
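
As a quick illustration, here is the operator on its own. The sub describe is a hypothetical helper that is not part of silent-cron.

```perl6
# hypothetical helper, only here to show the operator
sub describe(Int $exitcode) {
    $exitcode == 0 ?? "success" !! "failed with code $exitcode";
}

say describe(0);    # success
say describe(3);    # failed with code 3
```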

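Finally, the logical defined-or operator LEFT // RIGHT returns the LEFT side if it's defined, and if not, evaluates the RIGHT side and returns its value. It works like the || and or infix operators, except that those check the boolean value of the left side, not whether it is defined. Note that ExecutionResult gives exitcode a default of -1, so the attribute is normally defined, and the fallback to 2 only applies if it has been explicitly set to an undefined value.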

In Perl 6, we distinguish between defined and true values. By default, all instances are true and defined, and all type objects are false and undefined. Several built-in types override what they consider to be true. Numbers that equal 0 evaluate to False in a boolean context, as do empty strings and empty containers such as arrays, hashes and sets. On the other hand, only the built-in type Failure overrides definedness. You can override the truth value of a custom type by implementing a method Bool (which should return True or False), and the definedness with a method defined.
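
A short sketch may make the difference between the two concepts more tangible. The class Temperature is invented for this example and only serves to show a custom Bool method.

```perl6
my Int $unset;      # holds the type object Int: undefined and false
my $zero = 0;       # an instance: defined, but false

say $unset // 2;    # 2, because the left side is undefined
say $zero  // 2;    # 0, because 0 is defined
say $zero  || 2;    # 2, because 0 is false

# hypothetical class that overrides its truth value
class Temperature {
    has $.degrees;
    method Bool { $.degrees > 0 }
}

my $cold = Temperature.new(degrees => -5);
say $cold.defined;  # True
say ?$cold;         # False
```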

Now we could start testing the sub run-with-timeout by writing custom external commands with defined characteristics (output, run time, exit code), but that's rather fiddly to do in a reliable, cross-platform way. So instead I want to replace Proc::Async with a mock implementation, and give the sub a way to inject that:

sub run-with-timeout(@cmd, :$timeout, :$executer = Proc::Async) {
    my $proc = $executer.defined ?? $executer !! $executer.new(|@cmd);
    # rest as before
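
The .defined check is what lets the same named argument accept either a class or a ready-made object. Here is a minimal sketch of that pattern in isolation; Greeter and make-greeter are hypothetical names with no relation to Proc::Async.

```perl6
# hypothetical class standing in for the thing we want to inject
class Greeter {
    has $.greeting = 'hello';
}

sub make-greeter(:$greeter = Greeter) {
    # A type object is undefined, so we instantiate it;
    # an instance is defined and is used as it is
    $greeter.defined ?? $greeter !! $greeter.new;
}

say make-greeter().greeting;                                          # hello
say make-greeter(greeter => Greeter.new(greeting => 'hi')).greeting;  # hi
```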

Looking through sub run-with-timeout, we can make a quick list of methods that the stub Proc::Async implementation needs: stdout, stderr, start and kill. Both stdout and stderr need to return a Supply. The simplest thing that could possibly work is to return a Supply that will emit just a single value:

my class Mock::Proc::Async {
    has $.out = '';
    has $.err = '';
    method stdout {
        Supply.from-list($.out);
    }
    method stderr {
        Supply.from-list($.err);
    }

Supply.from-list returns a Supply that will emit all the arguments passed to it; in this case just a single string.
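
To get a feel for what such a Supply does, you can tap it directly. This is just a sketch with made-up values; @chunks and $collected are names chosen for the example.

```perl6
# A single string results in a Supply that emits exactly one value
my @chunks;
Supply.from-list('mocked output').tap: { @chunks.push($_) };
say @chunks.join;       # mocked output

# Several arguments are emitted one after the other
my $collected = Supply.from-list('a', 'b', 'c').list.join;
say $collected;         # abc
```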

The simplest possible implementation of kill just does nothing:

    method kill($?) {}

$? in a signature is an optional argument ($foo?) without a name.
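
A tiny sketch shows the effect. The sub ignore-signal is hypothetical and merely shares its signature with the mock's kill method.

```perl6
# hypothetical sub with the same signature as the mock's kill method
sub ignore-signal($?) { 'ignored' }

say ignore-signal();    # ignored
say ignore-signal(9);   # ignored
```

Both calls succeed, which is what the mock needs, since the code calls kill once without arguments and once with a signal number.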

Only one method remains that needs to be stubbed: start. It's supposed to return a Promise that, after a defined number of seconds, is kept with a Proc object or a mock thereof. Since the code only calls the exitcode method on it, writing a stub for it is easy:

has $.exitcode = 0;
has $.execution-time = 1;
method start {
    Promise.in($.execution-time).then({
        (class {
            has $.exitcode;
        }).new(:$.exitcode);
    });
}

Since we don't need the class for the mock Proc anywhere else, we don't even need to give it a name. class { ... } creates an anonymous class, and the .new call on it creates a new object from it.
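
Outside of the mock, the same construction can be tried on its own. The following sketch uses a short delay and an arbitrary exit code of 3; $fake-proc is a name picked for the example.

```perl6
# .then returns a new Promise that is kept with the block's return value
my $promise = Promise.in(0.1).then({
    (class {
        has $.exitcode;
    }).new(exitcode => 3);
});

# await hands back the object that the block created
my $fake-proc = await $promise;
say $fake-proc.exitcode;    # 3
```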

As mentioned before, a Proc with a non-zero exit code throws an exception when evaluated in void context, or sink context as we call it in Perl 6. We can emulate this behavior by extending the anonymous class a bit:

class {
    has $.exitcode;
    method sink() {
        die "mock Proc used in sink context";
    }
}

With all this preparation in place, we can finally write some tests:

multi sub MAIN('test') {
    use Test;

    my class Mock::Proc::Async {
        has $.exitcode = 0;
        has $.execution-time = 0;
        has $.out = '';
        has $.err = '';
        method kill($?) {}
        method stdout {
            Supply.from-list($.out);
        }
        method stderr {
            Supply.from-list($.err);
        }
        method start {
            Promise.in($.execution-time).then({
                (class {
                    has $.exitcode;
                    method sink() {
                        die "mock Proc used in sink context";
                    }
                }).new(:$.exitcode);
            });
        }
    }

    # no timeout, success
    my $result = run-with-timeout([],
        timeout => 2,
        executer => Mock::Proc::Async.new(
            out => 'mocked output',
        ),
    );
    isa-ok $result, ExecutionResult;
    is $result.exitcode, 0, 'exit code';
    is $result.output, 'mocked output', 'output';
    ok $result.is-success, 'success';

    # timeout
    $result = run-with-timeout([],
        timeout => 0.1,
        executer => Mock::Proc::Async.new(
            execution-time => 1,
            out => 'mocked output',
        ),
    );
    isa-ok $result, ExecutionResult;
    is $result.output, 'mocked output', 'output';
    ok $result.timed-out, 'timeout reported';
    nok $result.is-success, 'success';
}

This runs through two scenarios, one where a timeout is configured but not used (because the mocked external program exits first), and one where the timeout takes effect.

Improving Reliability and Timing

Relying on timing in tests is always unattractive. If the times are too short (or too close together), you risk sporadic test failures on slow or heavily loaded machines. If you space the timings more conservatively, the tests can become very slow.

There's a module (not distributed with Rakudo) to alleviate this pain: Test::Scheduler provides a thread scheduler with virtualized time, allowing you to write the tests like this:

use Test::Scheduler;
my $*SCHEDULER = Test::Scheduler.new;
my $result = start run-with-timeout([],
    timeout => 5,
    executer => Mock::Proc::Async.new(
        execution-time => 2,
        out => 'mocked output',
    ),
);
$*SCHEDULER.advance-by(5);
$result = $result.result;
isa-ok $result, ExecutionResult;
# more tests here

This installs the custom scheduler, and $*SCHEDULER.advance-by(5) instructs it to advance the virtual time by 5 seconds, without having to wait five actual seconds. At the time of writing (December 2016), Test::Scheduler is a rather new module, and has a bug that prevents the second test case from working this way.

Installing a Module

If you want to try out Test::Scheduler, you need to install it first. If you run Rakudo Star, it has already provided you with the panda module installer. You can use that to download and install the module for you:

$ panda install Test::Scheduler

If you don't have panda available, you can instead bootstrap zef, an alternative module installer:

$ git clone https://github.com/ugexe/zef.git
$ cd zef
$ perl6 -Ilib bin/zef install .

and then use zef to install the module:

$ zef install Test::Scheduler

Summary

In this installment, we've seen attributes with accessors, the ternary operator and anonymous classes. We've also discussed testing threaded code, and how a third-party module can help. Finally we had a very small glimpse at the two module installers, panda and zef.
