Sun, 04 Dec 2016

Perl 6 By Example: Formatting a Sudoku Puzzle



This blog post is part of my ongoing project to write a book about Perl 6.

If you're interested, please sign up for the mailing list at the bottom of the article, or here. It will be low volume (less than an email per month, on average).


As a gentle introduction to Perl 6, let's consider a small task that I recently encountered while pursuing one of my hobbies.

Sudoku is a number-placement puzzle played on a grid of 9x9 cells, subdivided into blocks of 3x3. Some of the cells are filled out with numbers from 1 to 9, some are empty. The objective of the game is to fill out the empty cells so that in each row, column and 3x3 block, each digit from 1 to 9 occurs exactly once.

An efficient storage format for a Sudoku is simply a string of 81 characters, with 0 for empty cells and the digits 1 to 9 for pre-filled cells. The task I want to solve is to bring this into a friendlier format.

The input could be:

000000075000080094000500600010000200000900057006003040001000023080000006063240000

On to our first Perl 6 program:

# file sudoku.p6
use v6;
my $sudoku = '000000075000080094000500600010000200000900057006003040001000023080000006063240000';
for 0..8 -> $line-number {
    say substr $sudoku, $line-number * 9, 9;
}

You can run it like this:

$ perl6 sudoku.p6
000000075
000080094
000500600
010000200
000900057
006003040
001000023
080000006
063240000

There's not much magic in there, but let's go through the code one line at a time.

The first line, starting with a #, is a comment that extends to the end of the line.

use v6;

This line is not strictly necessary, but good practice anyway. It declares the Perl version you are using, here v6, so any version of the Perl 6 language. We could be more specific and say use v6.c; to require exactly the version discussed here. If you ever accidentally run a Perl 6 program through Perl 5, you'll be glad you included this line, because it'll tell you:

$ perl sudoku.p6
Perl v6.0.0 required--this is only v5.22.1, stopped at sudoku.p6 line 1.
BEGIN failed--compilation aborted at sudoku.p6 line 1.

instead of the much more cryptic

syntax error at sudoku.p6 line 4, near "for 0"
Execution of sudoku.p6 aborted due to compilation errors.

The first interesting line is

my $sudoku = '00000007500...';

my declares a lexical variable. It is visible from the point of the declaration to the end of the current scope, which means either to the end of the current block delimited by curly braces, or to the end of the file if it's outside any block, as it is in this example.

Variables start with a sigil, here a '$'. Sigils are what gave Perl the reputation of being line noise, but there is signal in the noise. The $ looks like an S, which stands for scalar. If you know some math, you know that a scalar is just a single value, as opposed to a vector or even a matrix.

The variable doesn't start its life empty, because there's an initialization right next to it. The value it starts with is a string literal, as indicated by the quotes.

Note that there is no need to declare the type of the variable beyond the very vague "it's a scalar" implied by the sigil. If we wanted, we could add a type constraint:

my Str $sudoku = '00000007500...';

But when quickly prototyping, I tend to forgo type constraints, because I often don't know yet how exactly the code will work out.

The actual logic happens in the next lines, by iterating over the line numbers 0 to 8:

for 0..8 -> $line-number {
    ...
}

The for loop has the general structure for ITERABLE BLOCK. Here the iterable is a range, and the block is a pointy block. The block starts with ->, which introduces a signature. The signature tells the compiler what arguments the block expects, here a single scalar called $line-number.

Perl 6 allows you to use a dash - or a single quote ' to join multiple simple identifiers into a larger identifier. That means you can use them inside an identifier as long as the following character is a letter or an underscore.

Again, type constraints are optional. If you choose to include them, it would be for 0..8 -> Int $line-number { ... }.

$line-number is again a lexical variable, and visible inside the block that comes after the signature. Blocks are delimited by curly braces.

say substr $sudoku, $line-number * 9, 9;

Both say and substr are functions provided by the Perl 6 standard library. substr($string, $start, $chars) extracts a substring of (up to) $chars characters length from $string, starting from index $start. Oh, and indexes are zero-based in Perl 6.

say then prints this substring, followed by a line break.

As you can see from the example, function invocations don't need parentheses, though you can add them if you want:

say substr($sudoku, $line-number * 9, 9);

or even

say(substr($sudoku, $line-number * 9, 9));
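To get a feel for substr on its own, here is a small sketch with a made-up string ($text is an assumption for illustration, not part of the Sudoku program). It shows the zero-based indexing, the three call styles, and what "up to" $chars characters means near the end of the string:

```perl6
use v6;

# $text is a hypothetical example string, not part of the Sudoku program
my $text = 'abcdefgh';

say substr $text, 0, 3;      # abc  (index 0 is the first character)
say substr($text, 3, 3);     # def
say(substr($text, 6, 3));    # gh   (only two characters are left)
```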

Making the Sudoku playable

As the output of our script stands now, you can't play the resulting Sudoku even if you printed it, because all those pesky zeros get in your way of actually entering the numbers you carefully deduce while solving the puzzle.

So, let's substitute each 0 with a blank:

# file sudoku.p6
use v6;

my $sudoku = '000000075000080094000500600010000200000900057006003040001000023080000006063240000';
$sudoku = $sudoku.trans('0' => ' ');

for 0..8 -> $line-number {
    say substr $sudoku, $line-number * 9, 9;
}

trans is a method of the Str class. Its argument is a Pair. The boring way to create a Pair would be Pair.new('0', ' '), but since it's so commonly used, there is a shortcut in the form of the fat arrow, =>. The method trans replaces each occurrence of the pair's key with the pair's value, and returns the resulting string.

Speaking of shortcuts, you can also shorten $sudoku = $sudoku.trans(...) to $sudoku.=trans(...). This is a general pattern that turns methods that return a result into mutators.
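As a minimal sketch of both shortcuts, using a single made-up row ($row is an assumption for illustration) instead of the whole puzzle:

```perl6
use v6;

# $row is a hypothetical single line of a Sudoku, used only for illustration
my $row = '010000200';

# the fat arrow and Pair.new produce the same replacement
say $row.trans(Pair.new('0', ' ')) eq $row.trans('0' => ' ');   # True

# trans returns a new string; .= assigns that result back to $row
$row.=trans('0' => ' ');
say "[$row]";    # [ 1    2  ]
```

The square brackets in the last line are only there to make the leading and trailing blanks visible.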

With the new string substitution, the result is playable, but ugly:

$ perl6 sudoku.p6
       75
    8  94
   5  6  
 1    2  
   9   57
  6  3 4 
  1    23
 8      6
 6324    

A bit of ASCII art makes it bearable:

+---+---+---+
|   | 1 |   |
|   |   |79 |
| 9 |   | 4 |
+---+---+---+
|   |  4|  5|
|   |   | 2 |
|3  | 29|18 |
+---+---+---+
|  4| 87|2  |
|  7|  2|95 |
| 5 |  3|  8|
+---+---+---+

To get the vertical dividing lines, we need to sub-divide the lines into smaller chunks. And since we already have one occurrence of dividing a string into smaller strings of a fixed size, it's time to encapsulate it into a function:

sub chunks(Str $s, Int $chars) {
    gather for 0 .. $s.chars / $chars - 1 -> $idx {
        take substr($s, $idx * $chars, $chars);
    }
}

for chunks($sudoku, 9) -> $line {
    say chunks($line, 3).join('|');
}

The output is:

$ perl6 sudoku.p6
   |   | 75
   | 8 | 94
   |5  |6  
 1 |   |2  
   |9  | 57
  6|  3| 4 
  1|   | 23
 8 |   |  6
 63|24 |   

But how did it work? Well, sub (SIGNATURE) BLOCK declares a subroutine, short sub. Here I declare it to take two arguments, and since I tend to confuse the order of arguments to functions I call, I've added type constraints that make it very likely that Perl 6 catches the error for me.
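To illustrate what the type constraints buy you, here is a sketch that calls chunks once correctly and once with the arguments swapped. The variables $text and $size are made up for the example, and try is used only to keep the program running after the failed call:

```perl6
use v6;

sub chunks(Str $s, Int $chars) {
    gather for 0 .. $s.chars / $chars - 1 -> $idx {
        take substr($s, $idx * $chars, $chars);
    }
}

# hypothetical values, held in untyped variables so the check happens at run time
my $text = 'abcdef';
my $size = 3;

# correct order of arguments
say chunks($text, $size).join('|');                 # abc|def

# swapped order: the signature rejects the call, and try turns the
# resulting exception into an undefined value
my $result = try chunks($size, $text);
say $result.defined ?? 'accepted' !! 'rejected';    # rejected
```

Without the Str and Int constraints, the swapped call would be accepted and the mistake would surface later in a less obvious way.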

gather and take work together to create a list. gather is the entry point, and each execution of take adds one element to the list. So

gather {
    take 1;
    take 2;
}

would return the list 1, 2. Here gather acts as a statement prefix, which means it collects all takes from within the for loop.

A subroutine returns the value from the last expression, which here is the gather for ... thing discussed above.
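Here is the same pattern in a smaller setting. The sub squares is a hypothetical example, not part of the Sudoku program; it shows gather as a statement prefix in front of a for loop, with the gathered list serving as the return value of the sub:

```perl6
use v6;

# squares is a made-up example sub: every take inside the loop
# contributes one element to the list that gather builds
sub squares(Int $max) {
    gather for 1..$max -> $n {
        take $n * $n;
    }
}

# the gather expression is the last expression in the sub,
# so the list it builds is what the sub returns
say squares(4).join(', ');    # 1, 4, 9, 16
```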

Coming back to the program, the for-loop now looks like this:

for chunks($sudoku, 9) -> $line {
    say chunks($line, 3).join('|');
}

So first the program chops up the full Sudoku string into lines of nine characters, and then for each line, again into a list of three strings of three characters length. The join method turns it back into a string, but with pipe symbols inserted between the chunks.

There are still vertical bars missing at the start and end of the line, which can easily be hard-coded by changing the last line:

    say '|', chunks($line, 3).join('|'), '|';

Now the output is

|   |   | 75|
|   | 8 | 94|
|   |5  |6  |
| 1 |   |2  |
|   |9  | 57|
|  6|  3| 4 |
|  1|   | 23|
| 8 |   |  6|
| 63|24 |   |

Only the horizontal lines are missing, which aren't too hard to add:

my $separator = '+---+---+---+';
my $index = 0;
for chunks($sudoku, 9) -> $line {
    if $index++ %% 3 {
        say $separator;
    }
    say '|', chunks($line, 3).join('|'), '|';
}
say $separator;

Et voila:

+---+---+---+
|   |   | 75|
|   | 8 | 94|
|   |5  |6  |
+---+---+---+
| 1 |   |2  |
|   |9  | 57|
|  6|  3| 4 |
+---+---+---+
|  1|   | 23|
| 8 |   |  6|
| 63|24 |   |
+---+---+---+

There are two new aspects here. The first is the if conditional, which structurally very much resembles the for loop. The second is the divisibility operator, %%. From other programming languages you probably know % for modulo, but since $number % $divisor == 0 is such a common pattern, $number %% $divisor is Perl 6's shortcut for it.
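A few one-liners show the difference between the two operators, followed by the test from the loop above applied to all nine line indexes:

```perl6
use v6;

say 9 % 3;      # 0      (the remainder)
say 10 % 3;     # 1
say 9 %% 3;     # True   (is the remainder zero?)
say 10 %% 3;    # False

# the separator condition is true for the indexes 0, 3 and 6
for 0..8 -> $index {
    if $index %% 3 {
        say $index;
    }
}
```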

Shortcuts, Constants, and more Shortcuts

Perl 6 is modeled after human languages, which have some kind of compression scheme built in, where commonly used words tend to be short, and common constructs have shortcuts.

As such, there are lots of ways to write the code more succinctly. The first is basically cheating, because the sub chunks can be replaced by a built-in method in the Str class, comb:

# file sudoku.p6
use v6;

my $sudoku = '000000075000080094000500600010000200000900057006003040001000023080000006063240000';
$sudoku = $sudoku.trans: '0' => ' ';

my $separator = '+---+---+---+';
my $index = 0;
for $sudoku.comb(9) -> $line {
    if $index++ %% 3 {
        say $separator;
    }
    say '|', $line.comb(3).join('|'), '|';
}
say $separator;
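To see comb with a numeric argument in isolation, here is a short sketch with a made-up string ($text is an assumption for illustration):

```perl6
use v6;

# $text is a hypothetical example string
my $text = 'abcdefgh';

say $text.comb(3).join('|');    # abc|def|gh
say $text.comb(3).elems;        # 3
```

Note that comb keeps the shorter piece at the end, whereas the hand-written chunks sub from earlier only produces full-length chunks. For an 81-character Sudoku string the two behave the same.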

The if conditional can be applied as a statement postfix:

say $separator if $index++ %% 3;

Except for the initialization, the variable $index is used only once, so there's no need to give it a name. Yes, Perl 6 has anonymous variables:

my $separator = '+---+---+---+';
for $sudoku.comb(9) -> $line {
    say $separator if $++ %% 3;
    say '|', $line.comb(3).join('|'), '|';
}
say $separator;
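The following sketch isolates the anonymous variable. The list of letters is made up for the example; the point is that $ keeps its value from one iteration to the next, and that the first $++ yields 0:

```perl6
use v6;

# $ is an anonymous variable that remembers its value between iterations;
# $++ returns the current count and then increments it
for 'a', 'b', 'c' -> $letter {
    say $++, ': ', $letter;
}
# 0: a
# 1: b
# 2: c
```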

Since $separator is a constant, we can declare it as one:

constant $separator = '+---+---+---+';

If you want to reduce the line noise factor, you can also forgo the sigil, so constant separator = '...'.

Finally, there is another syntax for method calls with arguments: instead of $obj.method(args) you can say $obj.method: args, which brings us to the idiomatic form of the small Sudoku formatter:

# file sudoku.p6
use v6;

my $sudoku = '000000075000080094000500600010000200000900057006003040001000023080000006063240000';
$sudoku = $sudoku.trans: '0' => ' ';

constant separator = '+---+---+---+';
for $sudoku.comb(9) -> $line {
    say separator if $++ %% 3;
    say '|', $line.comb(3).join('|'), '|';
}
say separator;

IO and other Tragedies

A practical script doesn't contain its input as a hard-coded string literal, but reads it from the command line, standard input or a file.

If you want to read the Sudoku from the command line, you can declare a subroutine called MAIN, which gets all command line arguments passed in:

# file sudoku.p6
use v6;

constant separator = '+---+---+---+';

sub MAIN($sudoku) {
    my $substituted = $sudoku.trans: '0' => ' ';

    for $substituted.comb(9) -> $line {
        say separator if $++ %% 3;
        say '|', $line.comb(3).join('|'), '|';
    }
    say separator;
}

This is how it's called:

$ perl6-m sudoku.p6 000000075000080094000500600010000200000900057006003040001000023080000006063240000
+---+---+---+
|   |   | 75|
|   | 8 | 94|
|   |5  |6  |
+---+---+---+
| 1 |   |2  |
|   |9  | 57|
|  6|  3| 4 |
+---+---+---+
|  1|   | 23|
| 8 |   |  6|
| 63|24 |   |
+---+---+---+

And you even get a usage message for free if you use it wrongly, for example by omitting the argument:

$ perl6-m sudoku.p6 
Usage:
  sudoku.p6 <sudoku> 

You might have noticed that the last example uses a separate variable for the substituted Sudoku string. This is because function parameters (aka variables declared in a signature) are read-only by default. Instead of creating a new variable, I could have also written sub MAIN($sudoku is copy) { ... }.
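A minimal sketch of the is copy variant follows. It uses a hypothetical helper called blank-zeros instead of MAIN, so that it runs without command line arguments, and a single made-up row as input:

```perl6
use v6;

# blank-zeros is a hypothetical helper, used here instead of MAIN
# so the snippet runs without command line arguments
sub blank-zeros($sudoku is copy) {
    # modifying the parameter is allowed because of "is copy";
    # with a plain read-only parameter this line would fail
    $sudoku.=trans('0' => ' ');
    $sudoku;
}

my $input = '010000200';
say '[', blank-zeros($input), ']';    # [ 1    2  ]
say $input;                           # 010000200 (the caller's variable is untouched)
```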

Classic UNIX programs such as cat and wc follow the convention of reading their input from file names given on the command line, or from the standard input if no file names are given on the command line.

If you want your program to follow this convention, lines() provides a stream of lines from either of these sources:

# file sudoku.p6
use v6;

constant separator = '+---+---+---+';

for lines() -> $sudoku {
    my $substituted = $sudoku.trans: '0' => ' ';

    for $substituted.comb(9) -> $line {
        say separator if $++ %% 3;
        say '|', $line.comb(3).join('|'), '|';
    }
    say separator;
}

Get Creative!

You won't learn a programming language from reading a blog; you have to actually use it and tinker with it. If you want to expand on the examples discussed earlier, I'd encourage you to try to produce Sudokus in different output formats.

SVG offers a good ratio of result to effort. This is the rough skeleton of an SVG file for a Sudoku:

<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg width="304" height="304" version="1.1"
xmlns="http://www.w3.org/2000/svg">
    <line x1="0" x2="300" y1="33.3333" y2="33.3333" style="stroke:grey" />
    <line x1="0" x2="300" y1="66.6667" y2="66.6667" style="stroke:grey" />
    <line x1="0" x2="303" y1="100" y2="100" style="stroke:black;stroke-width:2" />
    <line x1="0" x2="300" y1="133.333" y2="133.333" style="stroke:grey" />
    <!-- more horizontal lines here -->

    <line y1="0" y2="300" x1="33.3333" x2="33.3333" style="stroke:grey" />
    <!-- more vertical lines here -->


    <text x="43.7333" y="124.5"> 1 </text>
    <text x="43.7333" y="257.833"> 8 </text>
    <!-- more cells go here -->
    <rect width="304" height="304" style="fill:none;stroke-width:1;stroke:black;stroke-width:6"/>
</svg>

If you have a Firefox or Chrome browser, you can use it to open the SVG file.
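As a possible starting point, here is a rough sketch that generates only the inner horizontal lines of the skeleton. The sub name grid-lines and its $size parameter are assumptions made for this example, and the coordinates are simplified compared to the hand-written file above:

```perl6
use v6;

# grid-lines and $size are hypothetical names chosen for this sketch
sub grid-lines(Int $size = 300) {
    gather for 1..8 -> $i {
        my $pos = $i * $size / 9;
        # every third line separates two blocks, so it is drawn thicker
        my $style = $i %% 3
            ?? 'stroke:black;stroke-width:2'
            !! 'stroke:grey';
        take qq[<line x1="0" x2="$size" y1="$pos" y2="$pos" style="$style" />];
    }
}

# print one <line> element per line of output
for grid-lines() -> $line {
    say $line;
}
```

The vertical lines, the digits and the surrounding svg element are left for you to add.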

If you are adventurous, you could also write a Perl 6 program that renders the Sudoku as a PostScript (PS) or Encapsulated PostScript (EPS) document. These are also text-based formats.

Subscribe to the Perl 6 book mailing list
