Categories

Ads

Your advertisement could be here -- Contact me!.

Wed, 10 Dec 2014

New Perl 6 community server now live, accepting signups


Permanent link

The new Perl 6 community server is now alive and kicking.

As planned, I've set up KVM virtualization, and so far there are two guest systems. hack.p6c.org is meant for general Perl 6 development activity (which also includes irssi/weechat sessions), and is equipped with 20GB RAM to handle multiple concurrent rakudo-jvm compilations :-). It runs a pretty bare-bones Debian Jessie.

www.p6c.org is the web server where I plan to host perl6.org and related (sub-)domains. It's not as beefy as hack, but sufficiently large to compile and run Rakudo, in preparation for future Perl 6-based web hosting. Currently I'm running a copy of several perl6.org subdomains on it (with the domain name p6c instead of perl6 for test purposes); the plan is to switch the perl6.org DNS over once all of the websites have been copied/migrated.

If you have a Perl 6 related use for a shell account or for serving websites, please request an account by email (moritz@faui2k3.org) or IRC (moritz on freenode and magnet), including:

  1. Your desired username
  2. What you want to do on the machine(s) (not necessary for #perl6 regulars)
  3. Which of the machine(s) you need access to
  4. Optionally an openssh public key
  5. Whether you'd be willing to help a bit with sysadmin tasks (mostly apt-get update && apt-get dist-upgrade, restarting hung services, killing huge processes)
  6. Software you need installed (it's OK to not know this up-front)

Note that feather.perl6.nl will shut down soon (no fixed date yet, but "end of 2014" is expected), so if you rely on feather now, you should consider migrating to the new server.

The code of conduct is pretty simple:

  1. Be reasonable in your resource usage.
  2. Use technical means to limit your resource usage so that it doesn't accidentally explode (ulimit comes to mind).
  3. Limit yourself to legal and Perl 6-related use cases (no warez).
  4. Help your fellow hackers.

The standard disclaimer applies:

  • Expect no privacy. There will potentially be many root users, who could all read your files and memory.
  • There are no promises of continued service or even support. Your account can be terminated without notice.
  • Place of jurisdiction in Nürnberg, Germany. You have to comply with German law while using the server. (Note that this puts pretty high standards on privacy for any user data you collect, including from web applications). It's your duty to inform yourself about the applicable laws. Illegal activities will be reported to the authorities.

With all that said, happy hacking!.

[/perl-6] Permanent link

comments / trackbacks

Tue, 02 Dec 2014

A new Perl 6 community server - update


Permanent link

In my previous post I announced my plans for a new Perl 6 community server (successor to feather.perl6.nl), and now I'd like to share some updates.

Thanks to the generosity of the Perl 6 community, the server has been ordered and paid. I am now in the process of contacting those donors who haven't paid yet, leaving them the choice to re-purpose their pledge to ongoing costs (traffic, public IPv4 addresses, domain(s), SSL certs if necessary) and maintenance, or withdraw their pledges.

Some details of the hardware we'll get:

  • CPU: Intel® Xeon® Haswell-EP Series Processor E5-2620 v3, 2.40 GHz, 6-Core Socket 2011-3, 15MB Cache
  • RAM: 4x8GB DDR4 DDR4 PC2133 Reg. ECC 2R
  • HD: 2x 2TB SATA3-HD

The vendor has told me that all parts have arrived, and will be assembled today or tomorrow.

Currently I lean towards using KVM to create three virtual hosts: one for websites (*.perl6.org, perlcabal.syn), one for general hacking and IRC activity, and one for high-risk stuff (evalbots, try.rakudo.org, ...).

I've secured the domain p6c.org (for "perl 6 community"), and the IPv4 range 213.95.82.52 - 213.95.82.62 and the IPv6 net 2001:780:101:ff00::/64.

So the infrastructure is in place, now I'm waiting for the delivery of the hardware.

[/perl-6] Permanent link

comments / trackbacks

Wed, 05 Nov 2014

A new Perl 6 community server - call for funding


Permanent link

So far, many Perl 6 developers have used feather as a generic development server. Juerd, who has genereously provided this server for us for free for many years, has announced that it will be shut down at the end of the year.

My daytime job is at a b2b IT outsourcing and hosting company called noris network, and they have agreed to sponsor the hosting/housing of a 1U 19" server in one of their state-of-the-art data centers in Nürnberg, Germany.

What's missing is the actual hardware. Some folks in the community have already agreed to participate in funding the hardware, though I have few concrete pledges.

So here is the call to action: If you want to help the Perl 6 community with a one-time donation towards a new community server, please send me an e-mail to moritz at faui2k3 dot org, specifying the amount you're willing do pledge, and whether you want to stay private as a donor. I accept money transfer by paypal and wire transfer (SWIFT). Direct hardware donations are also welcome. (Though actual money will be deferred until the final decision what hardware to buy, and thus the total amount required).

How much do we need?

Decent, used 1U servers seem to start at about 250€, though 350€ would get us a lot more bang (mostly RAM and hard disk space). And in general, the more the merrier. (Cheaper offers exist, for example on ebay, but usually they are without hard disks, so the need for extra drives makes them more expensive in total).

With more money, even beefier hardware and/or spare parts and/or a maintainance contract and/new hardware would be an option.

What do we need it for?

The main tasks for the server are:

  • Hosting websites like perl6.org and the synopses
  • Hosting infrastructure like the panda metadata server
  • Be available for smoke runs of the compilers, star distributions and module ecosystem.
  • Be available as a general development machine for people who don't have linux available and/or not enough resources to build some Perl 6 compilers on their own machines comfortably.
  • A place for IRC sessions for community memebers
  • A backup location for community services like the IRC logs, the camelia IRC eval bot etc. Those resources are currently hosted elswewhere, though having another option for hosting would be very valuable.
  • A webspace for people who want to host Perl 6-related material.
  • It is explicitly not meant as a general hosting platform, nor as a mail server.

    Configuration

    If the hardware we get is beefy enough, I'd like to virtualize the server into two to three components. One for hosting the perl6.org and related websites that should be rather stable, and one for the rest of the system. If resources allow it, and depending on feedback I get, maybe a third virtual system for high-risk stuff like evalbot.

    As operating system I'll install Debian Jessie (the current testing), simply because I'll end up maintaing the system, and it's the system I'm most familiar with.

[/perl-6] Permanent link

comments / trackbacks

Sat, 15 Feb 2014

The Fun of Running a Public Web Service, and Session Storage


Permanent link

One of my websites, Sudokugarden, recently surged in traffic, from about 30k visitors per month to more than 100k visitors per month. Here's the tale of what that meant for the server side.

As a bit of background, I built the website in 2007, when I knew a lot less about the web and programming. It runs on a host that I share with a few friends; I don't have root access on that machine, though when the admin is available, I can generally ask him to install stuff for me.

Most parts of the websites are built as static HTML files, with Server Side Includes. Parts of those SSIs are Perl CGI scripts. The most popular part though, which allows you to solve Sudoku in the browser and keeps hiscores, is written as a collection of Perl scripts, backed by a mysql database.

When at peak times the site had more than 10k visitors a day, lots of visitors would get a nasty mysql: Cannot connect: Too many open connections error. The admin wasn't available for bumping the connection limit, so I looked for other solutions.

My first action was to check the logs for spammers and crawlers that might hammered the page, and I found and banned some; but the bulk of the traffic looked completely legitimate, and the problem persisted.

Looking at the seven year old code, I realized that most pages didn't actually need a database connection, if only I could remove the session storage from the database. And, in fact, I could. I used CGI::Session, which has pluggable backend. Switching to a file-based session backend was just a matter of changing the connection string and adding a directory for session storage. Luckily the code was clean enough that this only affected a single subroutine. Everything was fine.

For a while.

Then, about a month later, the host ran out of free disk space. Since it is used for other stuff too (like email, and web hosting for other users) it took me a while to make the connection to the file-based session storage. What happened was 3 million session files on a ext3 file system with a block size of 4 kilobyte. A session is only about 400 byte, but since a file uses up a multiple of the block size, the session storage amounted to 12 gigabyte of used-up disk space, which was all that was left on that machine.

Deleting those sessions turned out to be a problem; I could only log in as my own user, which doesn't have write access to the session files (which are owned by www-data, the Apache user). The solution was to upload a CGI script that deleted the session, but of course that wasn't possible at first, because the disk was full. In the end I had to delete several gigabyte of data from my home directory before I could upload anything again. (Processes running as root were still writing to reserved-to-root portions of the file system, which is why I had to delete so much data before I was able to write again).

Even when I was able to upload the deletion script, it took quite some time to actually delete the session files; mostly because the directory was too large, and deleting files on ext3 is slow. When the files were gone, the empty session directory still used up 200MB of disk space, because the directory index doesn't shrink on file deletion.

Clearly a better solution to session storage was needed. But first I investigated where all those sessions came from, and banned a few spamming IPs. I also changed the code to only create sessions when somebody logs in, not give every visitor a session from the start.

My next attempt was to write the sessions to an SQLite database. It uses about 400 bytes per session (plus a fixed overhead for the db file itself), so it uses only a tenth of storage space that the file-based storage used. The SQLite database has no connection limit, though the old-ish version that was installed on the server doesn't seem to have very fine-grained locking either; within a few days I could errors that the session database was locked.

So I added another layer of workaround: creating a separate session database per leading IP octet. So now there are up to 255 separate session database (plus a 256th for all IPv6 addresses; a decision that will have to be revised when IPv6 usage rises). After a few days of operation, it seems that this setup works well enough. But suspicious as I am, I'll continue monitoring both disk usage and errors from Apache.

So, what happens if this solution fails to work out? I can see basically two approaches: move the site to a server that's fully under my control, and use redis or memcached for session storage; or implement sessions with signed cookies that are stored purely on the client side.

[/misc] Permanent link

comments / trackbacks

Sat, 19 Oct 2013

A small regex optimization for NQP and Rakudo


Permanent link

Recently I read the course material of the Rakudo and NQP Internals Workshop, and had an idea for a small optimization for the regex engine. Yesterday night I implemented it, and I'd like to walk you through the process.

As a bit of background, the regex engine that Rakudo uses is actually implemented in NQP, and used by NQP too. The code I am about to discuss all lives in the NQP repository, but Rakudo profits from it too.

In addition one should note that the regex engine is mostly used for parsing grammar, a process which involves nearly no scanning. Scanning is the process where the regex engine first tries to match the regex at the start of the string, and if it fails there, moves to the second character in the string, tries again etc. until it succeeds.

But regexes that users write often involve scanning, and so my idea was to speed up regexes that scan, and where the first thing in the regex is a literal. In this case it makes sense to find possible start positions with a fast string search algorithm, for example the Boyer-Moore algorithm. The virtual machine backends for NQP already implement that as the index opcode, which can be invoked as start = index haystack, needle, startpos, where the string haystack is searched for the substring needle, starting from position startpos.

From reading the course material I knew I had to search for a regex type called scan, so that's what I did:

$ git grep --word scan
3rdparty/libtommath/bn_error.c:   /* scan the lookup table for the given message
3rdparty/libtommath/bn_mp_cnt_lsb.c:   /* scan lower digits until non-zero */
3rdparty/libtommath/bn_mp_cnt_lsb.c:   /* now scan this digit until a 1 is found
3rdparty/libtommath/bn_mp_prime_next_prime.c:                   /* scan upwards 
3rdparty/libtommath/changes.txt:       -- Started the Depends framework, wrote d
src/QRegex/P5Regex/Actions.nqp:                     QAST::Regex.new( :rxtype<sca
src/QRegex/P6Regex/Actions.nqp:                     QAST::Regex.new( :rxtype<sca
src/vm/jvm/QAST/Compiler.nqp:    method scan($node) {
src/vm/moar/QAST/QASTRegexCompilerMAST.nqp:    method scan($node) {
Binary file src/vm/moar/stage0/NQPP6QRegexMoar.moarvm matches
Binary file src/vm/moar/stage0/QASTMoar.moarvm matches
src/vm/parrot/QAST/Compiler.nqp:    method scan($node) {
src/vm/parrot/stage0/P6QRegex-s0.pir:    $P5025 = $P5024."new"("scan" :named("rx
src/vm/parrot/stage0/QAST-s0.pir:.sub "scan" :subid("cuid_135_1381944260.6802") 
src/vm/parrot/stage0/QAST-s0.pir:    push $P5004, "scan"

The binary files and .pir files are generated code included just for bootstrapping, and not interesting for us. The files in 3rdparty/libtommath are there for bigint handling, thus not interesting for us either. The rest are good matches: src/QRegex/P6Regex/Actions.nqp is responsible for compiling Perl 6 regexes to an abstract syntax tree (AST), and src/vm/parrot/QAST/Compiler.nqp compiles that AST down to PIR, the assembly language that the Parrot Virtual Machine understands.

So, looking at src/QRegex/P6Regex/Actions.nqp the place that mentions scan looked like this:

    $block<orig_qast> := $qast;
    $qast := QAST::Regex.new( :rxtype<concat>,
                 QAST::Regex.new( :rxtype<scan> ),
                 $qast,
                 ($anon
                      ?? QAST::Regex.new( :rxtype<pass> )
                      !! (nqp::substr(%*RX<name>, 0, 12) ne '!!LATENAME!!'
                            ?? QAST::Regex.new( :rxtype<pass>, :name(%*RX<name>) )
                            !! QAST::Regex.new( :rxtype<pass>,
                                   QAST::Var.new(
                                       :name(nqp::substr(%*RX<name>, 12)),
                                       :scope('lexical')
                                   ) 
                               )
                          )));

So to make the regex scan, the AST (in $qast) is wrapped in QAST::Regex.new(:rxtype<concat>,QAST::Regex.new( :rxtype<scan> ), $qast, ...), plus some stuff I don't care about.

To make the optimization work, the scan node needs to know what to scan for, if the first thing in the regex is indeed a constant string, aka literal. If it is, $qast is either directly of rxtype literal, or a concat node where the first child is a literal. As a patch, it looks like this:

--- a/src/QRegex/P6Regex/Actions.nqp
+++ b/src/QRegex/P6Regex/Actions.nqp
@@ -667,9 +667,21 @@ class QRegex::P6Regex::Actions is HLL::Actions {
     self.store_regex_nfa($code_obj, $block, QRegex::NFA.new.addnode($qast))
     self.alt_nfas($code_obj, $block, $qast);
 
+    my $scan := QAST::Regex.new( :rxtype<scan> );
+    {
+        my $q := $qast;
+        if $q.rxtype eq 'concat' && $q[0] {
+            $q := $q[0]
+        }
+        if $q.rxtype eq 'literal' {
+            nqp::push($scan, $q[0]);
+            $scan.subtype($q.subtype);
+        }
+    }
+
     $block<orig_qast> := $qast;
     $qast := QAST::Regex.new( :rxtype<concat>,
-                 QAST::Regex.new( :rxtype<scan> ),
+                 $scan,
                  $qast,

Since concat nodes have always been empty so far, the code generators don't look at their child nodes, and adding one with nqp::push($scan, $q[0]); won't break anything on backends that don't support this optimization yet (which after just this patch were all of them). Running make test confirmed that.

My original patch did not contain the line $scan.subtype($q.subtype);, and later on some unit tests started to fail, because regex matches can be case insensitive, but the index op works only case sensitive. For case insensitive matches, the $q.subtype of the literal regex node would be ignorecase, so that information needs to be carried on to the code generation backend.

Once that part was in place, and some debug nqp::say() statements confirmed that it indeed worked, it was time to look at the code generation. For the parrot backend, it looked like this:

    method scan($node) {
        my $ops := self.post_new('Ops', :result(%*REG<cur>));
        my $prefix := self.unique('rxscan');
        my $looplabel := self.post_new('Label', :name($prefix ~ '_loop'));
        my $scanlabel := self.post_new('Label', :name($prefix ~ '_scan'));
        my $donelabel := self.post_new('Label', :name($prefix ~ '_done'));
        $ops.push_pirop('repr_get_attr_int', '$I11', 'self', %*REG<curclass>, '"$!from"');
        $ops.push_pirop('ne', '$I11', -1, $donelabel);
        $ops.push_pirop('goto', $scanlabel);
        $ops.push($looplabel);
        $ops.push_pirop('inc', %*REG<pos>);
        $ops.push_pirop('gt', %*REG<pos>, %*REG<eos>, %*REG<fail>);
        $ops.push_pirop('repr_bind_attr_int', %*REG<cur>, %*REG<curclass>, '"$!from"', %*REG<pos>);
        $ops.push($scanlabel);
        self.regex_mark($ops, $looplabel, %*REG<pos>, 0);
        $ops.push($donelabel);
        $ops;
    }

While a bit intimidating at first, staring at it for a while quickly made clear what kind of code it emits. First three labels are generated, to which the code can jump with goto $label: One as a jump target for the loop that increments the cursor position ($looplabel), one for doing the regex match at that position ($scanlabel), and $donelabel for jumping to when the whole thing has finished.

Inside the loop there is an increment (inc) of the register the holds the current position (%*REG<pos>), that position is compared to the end-of-string position (%*REG<eos>), and if is larger, the cursor is marked as failed.

So the idea is to advance the position by one, and then instead of doing the regex match immediately, call the index op to find the next position where the regex might succeed:

--- a/src/vm/parrot/QAST/Compiler.nqp
+++ b/src/vm/parrot/QAST/Compiler.nqp
@@ -1564,7 +1564,13 @@ class QAST::Compiler is HLL::Compiler {
         $ops.push_pirop('goto', $scanlabel);
         $ops.push($looplabel);
         $ops.push_pirop('inc', %*REG<pos>);
-        $ops.push_pirop('gt', %*REG<pos>, %*REG<eos>, %*REG<fail>);
+        if nqp::elems($node.list) && $node.subtype ne 'ignorecase' {
+            $ops.push_pirop('index', %*REG<pos>, %*REG<tgt>, self.rxescape($node[0]), %*REG<pos>);
+            $ops.push_pirop('eq', %*REG<pos>, -1, %*REG<fail>);
+        }
+        else {
+            $ops.push_pirop('gt', %*REG<pos>, %*REG<eos>, %*REG<fail>);
+        }
         $ops.push_pirop('repr_bind_attr_int', %*REG<cur>, %*REG<curclass>, '"$!from"', %*REG<pos>);
         $ops.push($scanlabel);
         self.regex_mark($ops, $looplabel, %*REG<pos>, 0);

The index op returns -1 on failure, so the condition for a cursor fail are slightly different than before.

And as mentioned earlier, the optimization can only be safely done for matches that don't ignore case. Maybe with some additional effort that could be remedied, but it's not as simple as case-folding the target string, because some case folding operations can change the string length (for example ß becomes SS while uppercasing).

After successfully testing the patch, I came up with a small, artifical benchmark designed to show a difference in performance for this particular case. And indeed, it sped it up from 647 ± 28 µs to 161 ± 18 µs, which is roughly a factor of four.

You can see the whole thing as two commits on github.

What remains to do is implementing the same optimization on the JVM and MoarVM backends, and of course other optimizations. For example the Perl 5 regex engine keeps track of minimal and maximal string lengths for each subregex, and can anchor a regex like /a?b?longliteral/ to 0..2 characters before a match of longliteral, and generally use that meta information to fail faster.

But for now I am mostly encouraged that doing a worthwhile optimization was possible in a single evening without any black magic, or too intimate knowledge of the code generation.

Update: the code generation for MoarVM now also uses the index op. The logic is the same as for the parrot backend, the only difference is that the literal needs to be loaded into a register (whose name fresh_s returns) before index_s can use it.

[/perl-6] Permanent link

comments / trackbacks

Wed, 14 Aug 2013

YAPC Europe 2013 Day 3


Permanent link

The second day of YAPC Europe climaxed in the river boat cruise, Kiev's version of the traditional conference dinner. It was a largish boat traveling on the Dnipro river, with food, drinks and lots of Perl folks. Not having fixed tables, and having to get up to fetch food and drinks led to a lot of circulation, and thus meeting many more people than at traditionally dinners. I loved it.

Day 3 started with a video message from next year's YAPC Europe organizers, advertising for the upcoming conference and talking a bit about the oppurtunities that Sofia offers. Tempting :-).

Monitoring with Perl and Unix::Statgrab was more about the metrics that are available for monitoring, and less about doing stuff with Perl. I was a bit disappointed.

The "Future Perl Versioning" Discussion was a very civilized discussion, with solid arguments. Whether anybody changed their minds remain to be seen.

Carl Mäsak gave two great talks: one on reactive programming, and one on regular expressions. I learned quite a bit in the first one, and simply enjoyed the second one.

After the lunch (tasty again), I attended Jonathan Worthington's third talk, MoarVM: a metamodel-focused runtime for NQP and Rakudo. Again this was a great talk, based on great work done by Jonathan and others during the last 12 months or so. MoarVM is a virtual machine designed for Perl 6's needs, as we understand them now (as opposed to parrot, which was designed towards Perl 6 as it was understood around 2003 or so, which is considerably different).

How to speak manager was both amusing and offered a nice perspective on interactions between managers and programmers. Some of this advice assumed a non-tech-savy manager, and thus didn't quite apply to my current work situation, but was still interesting.

I must confess I don't remember too much of the rest of the talks that evening. I blame five days of traveling, hackathon and conference taking their toll on me.

The third session of lightning talks was again an interesting mix, containing interesting technical tidbits, the usual "we are hiring" slogans, some touching and thoughtful moments, and finally a song by Piers Cawley. He had written the lyrics in the previous 18 hours (including sleep), to (afaict) a traditional irish song. Standing up in front of ~300 people and singing a song that you haven't really had time to practise takes a huge amount of courage, and I admire Piers both for his courage and his great performance. I hope it was recorded, and makes it way to the public soon.

Finally the organizers spoke some closing words, and received their well-deserved share of applause.

As you might have guess from this and the previous blog posts, I enjoyed this year's YAPC Europe very much, and found it well worth attending, and well organized. I'd like to give my heart-felt thanks to everybody who helped to make it happen, and to my employer for sending me there.

This being only my second YAPC, I can't make any far-reaching comparisons, but compared to YAPC::EU 2010 in Pisa I had an easier time making acquaintances. I cannot tell what the big difference was, but the buffet-style dinners at the pre-conference meeting and the river boat cruise certainly helped to increase the circulation and thus the number of people I talked to.

[/perl-6] Permanent link

comments / trackbacks

Tue, 13 Aug 2013

YAPC Europe 2013 Day 2


Permanent link

The second day of YAPC Europe was enjoyable and informative.

I learned about ZeroMQ, which is a bit like sockets on steriods. Interesting stuff. Sadly Design decisions on p2 didn't quite qualify as interesting.

Matt's PSGI archive is a project to rewrite Matt's infamous script archive in modern Perl. Very promising, and a bit entertaining too.

Lunch was very tasty, more so than the usual mass catering. Kudos to the organizers!

After lunch, jnthn talked about concurrency, parallelism and asynchrony in Perl 6. It was a great talk, backed by great work on the compiler and runtime. Jonathans talk are always to be recommended.

I think I didn't screw up my own talk too badly, at least the timing worked fine. I just forgot to show the last slide. No real harm done.

I also enjoyed mst's State of the Velociraptor, which was a summary of what went on in the Perl world in the last year. (Much better than the YAPC::EU 2010 talk with the same title).

The Lightning talks were as enjoyable as those from the previous day. So all fine!

Next up is the river cruise, I hope to blog about that later on.

[/perl-6] Permanent link

comments / trackbacks

Mon, 12 Aug 2013

First day at YAPC::Europe 2013 in Kiev


Permanent link

Today was the first "real" day of YAPC Europe 2013 in Kiev. In the same sense that it was the first real day, we had quite a nice "unreal" conference day yesterday, with a day-long Perl 6 hackathon, and in the evening a pre-conference meeting a Sovjet-style restaurant with tasty food and beverages.

The talks started with a few words of welcome, and then the announcement that the YAPC Europe next year will be in Sofia, Bulgaria, with the small side note that there were actually three cities competing for that honour. Congratulations to Sofia!

Larry's traditional keynote was quite emotional, and he had to fight tears a few times. Having had cancer and related surgeries in the past year, he still does his perceived duty to the Perl community, which I greatly appreciate.

Afterwards Dave Cross talked about 25 years of Perl in 25 minutes, which was a nice walk through some significant developments in the Perl world, though a bit hasty. Maybe picking fewer events and spending a bit more time on the selected few would give a smoother experience.

Another excellent talk that ran out of time was on Redis. Having experimented a wee bit with Redis in the past month, this was a real eye-opener on the wealth of features we might have used for a project at work, but in the end we didn't. Maybe we will eventually revise that decision.

Ribasushi talked about how hard benchmarking really is, and while I was (in principle) aware of that fact that it's hard to get right, there were still several significant factors that I overlooked (like the CPU's tendency to scale frequency in response to thermal and power-management considerations). I also learned that I should use Dumbbench instead of the Benchmark.pm core module. Sadly it didn't install for me (Capture::Tiny tests failing on Mac OS X).

The Perl 6 is dead, long live Perl 5 talk was much less inflammatory than the title would suggest (maybe due to Larry touching on the subject briefly during the keynote). It was mostly about how Perl 5 is used in the presenter's company, which was mildly interesting.

After tasty free lunch I attended jnthn's talk on Rakudo on the JVM, which was (as is typical for jnthn's talk) both entertaining and taught me something, even though I had followed the project quite a bit.

Thomas Klausner's Bread::Board by example made me want to refactor the OTRS internals very badly, because it is full of the anti-patterns that Bread::Board can solve in a much better way. I think that the OTRS code base is big enough to warrant the usage of Bread::Board.

I enjoyed Denis' talk on Method::Signatures, and was delighted to see that most syntax is directly copied from Perl 6 signature syntax. Talk about Perl 6 sucking creativity out of Perl 5 development.

The conference ended with a session of lighning talks, something which I always enjoy. Many lightning talks had a slightly funny tone or undertone, while still talking about interesting stuff.

Finally there was the "kick-off party", beverages and snacks sponsored by booking.com. There (and really the whole day, and yesterday too) I not only had conversations with my "old" Perl 6 friends, but also talked with many interesting people I never met before, or only met online before.

So all in all it was a nice experience, both from the social side, and from quality and contents of the talks. Venue and food are good, and the wifi too, except when it stops working for a few minutes.

I'm looking forward to two more days of conference!

(Updated: Fixed Thomas' last name)

[/perl-6] Permanent link

comments / trackbacks

Wed, 17 Apr 2013

The REPL trick


Permanent link

A recent discussion on IRC prompted me to share a small but neat trick with you.

If there are things you want to do quite often in the Rakudo REPL (the interactive "Read-Evaluate-Print Loop"), it makes sense to create a shortcut for them. And creating shortcuts for often-used stuff is what programming languages excel at, so you do it right in Perl module:

use v6;
module REPLHelper;

sub p(Mu \x) is export {
    x.^mro.map: *.^name;
}

I have placed mine in $HOME/.perl6/repl.

And then you make sure it's loaded automatically:

$ alias p6repl="perl6 -I$HOME/.perl6/repl/ -MREPLHelper"
$ p6repl
> p Int
Int Cool Any Mu
>

Now you have a neat one-letter function which tells you the parents of an object or a type, in method resolution order. And a way to add more shortcuts when you need them.

[/perl-6] Permanent link

comments / trackbacks

Sun, 31 Mar 2013

Rakudo's Abstract Syntax Tree


Permanent link

After or while a compiler parses a program, the compiler usually translates the source code into a tree format called Abstract Syntax Tree, or AST for short.

The optimizer works on this program representation, and then the code generation stage turns it into a format that the platform underneath it can understand. Actually I wanted to write about the optimizer, but noticed that understanding the AST is crucial to understanding the optimizer, so let's talk about the AST first.

The Rakudo Perl 6 Compiler uses an AST format called QAST. QAST nodes derive from the common superclass QAST::Node, which sets up the basic structure of all QAST classes. Each QAST node has a list of child nodes, possibly a hash map for unstructured annotations, an attribute (confusingly) named node for storing the lower-level parse tree (which is used to extract line numbers and context), and a bit of extra infrastructure.

The most important node classes are the following:

QAST::Stmts
A list of statements. Each child of the node is considered a separate statement.
QAST::Op
A single operation that usually maps to a primitive operation of the underlying platform, like adding two integers, or calling a routine.
QAST::IVal, QAST::NVal, QAST::SVal
Those hold integer, float ("numeric") and string constants respectively.
QAST::WVal
Holds a reference to a more complex object (for example a class) which is serialized separately.
QAST::Block
A list of statements that introduces a separate lexical scope.
QAST::Var
A variable
QAST::Want
A node that can evaluate to different child nodes, depending on the context it is compiled it.

To give you a bit of a feel of how those node types interact, I want to give a few examples of Perl 6 examples, and what AST they could produce. (It turns out that Perl 6 is quite a complex language under the hood, and usually produces a more complicated AST than the obvious one; I'll ignore that for now, in order to introduce you to the basics.)

Ops and Constants

The expression 23 + 42 could, in the simplest case, produce this AST:

QAST::Op.new(
    :op('add'),
    QAST::IVal.new(:value(23)),
    QAST::IVal.new(:value(42)),
);

Here an QAST::Op encodes a primitive operation, an addition of two numbers. The :op argument specifies which operation to use. The child nodes are two constants, both of type QAST::IVal, which hold the operands of the low-level operation add.

Now the low-level add operation is not polymorphic, it always adds two floating-point values, and the result is a floating-point value again. Since the arguments are integers and not floating point values, they are automatically converted to float first. That's not the desired semantics for Perl 6; actually the operator + is implemented as a subroutine of name &infix:<+>, so the real generated code is closer to

QAST::Op.new(
    :op('call'),
    :name('&infix:<+>'),    # name of the subroutine to call
    QAST::IVal.new(:value(23)),
    QAST::IVal.new(:value(42)),
);

Variables and Blocks

Using a variable is as simple as writing QAST::Var.new(:name('name-of-the-variable')), but it must be declared first. This is done with QAST::Var.new(:name('name-of-the-variable'), :decl('var'), :scope('lexical')).

But there is a slight caveat: in Perl 6 a variable is always scoped to a block. So while you can't ordinarily mention a variable prior to its declaration, there are indirect ways to achieve that (lookup by name, and eval(), to name just two).

So in Rakudo there is a convention to create QAST::Block nodes with two QAST::Stmts children. The first holds all the declarations, and the second all the actual code. That way all the declaration always come before the rest of the code.

So my $x = 42; say $x compiles to roughly this:

QAST::Block.new(
    QAST::Stmts.new(
        QAST::Var.new(:name('$x'), :decl('var'), :scope('lexical')),
    ),
    QAST::Stmts.new(
        QAST::Op.new(
            :op('p6store'),
            QAST::Var.new(:name('$x')),
            QAST::IVal.new(:value(42)),
        ),
        QAST::Op.new(
            :op('call'),
            :name('&say'),
            QAST::Var.new(:name('$x')),
        ),
    ),
);

Polymorphism and QAST::Want

Perl 6 distinguishes between native types and reference types. Native types are closer to the machine, and their type name is always lower case in Perl 6.

Integer literals are polymorphic in that they can be either a native int or a "boxed" reference type Int.

To model this in the AST, QAST::Want nodes can contain multiple child nodes. The compile-time context decides which of those is acutally used.

So the integer literal 42 actually produces not just a simple QAST::IVal node but rather this:

QAST::Want.new(
    QAST::WVal(Int.new(42)),
    'Ii',
    QAST::Ival(42),
)

(Note that Int.new(42) is just a nice notation to indicate a boxed integer object; it doesn't quite work like this in the code that translate Perl 6 source code into ASTs).

The first child of a QAST::Want node is the one used by default, if no other alternative matches. The comes a list where the elements with odd indexes are format specifications (here Ii for integers) and the elements at even-side indexes are the AST to use in that case.

An interesting format specification is 'v' for void context, which is always chosen when the return value from the current expression isn't used at all. In Perl 6 this is used to eagerly evaluate lazy lists that are used in void context, and for several optimizations.

[/perl-6] Permanent link

comments / trackbacks

Tue, 12 Feb 2013

Pattern Matching and Unpacking


Permanent link

When talking about pattern matching in the context of Perl 6, people usually think about regex or grammars. Those are indeed very powerful tools for pattern matching, but not the only one.

Another powerful tool for pattern matching and for unpacking data structures uses signatures.

Signatures are "just" argument lists:

sub repeat(Str $s, Int $count) {
    #     ^^^^^^^^^^^^^^^^^^^^  the signature
    # $s and $count are the parameters
    return $s x $count
}

Nearly all modern programming languages have signatures, so you might say: nothing special, move along. But there are two features that make them more useful than signatures in other languages.

The first is multi dispatch, which allows you to write several routines with the name, but with different signatures. While extremely powerful and helpful, I don't want to dwell on them. Look at Chapter 6 of the "Using Perl 6" book for more details.

The second feature is sub-signatures. It allows you to write a signature for a sigle parameter.

Which sounds pretty boring at first, but for example it allows you to do declarative validation of data structures. Perl 6 has no built-in type for an array where each slot must be of a specific but different type. But you can still check for that in a sub-signature

sub f(@array [Int, Str]) {
    say @array.join: ', ';
}
f [42, 'str'];      # 42, str
f [42, 23];         # Nominal type check failed for parameter '';
                    # expected Str but got Int instead in sub-signature
                    # of parameter @array

Here we have a parameter called @array, and it is followed by a square brackets, which introduce a sub-signature for an array. When calling the function, the array is checked against the signature (Int, Str), and so if the array doesn't contain of exactly one Int and one Str in this order, a type error is thrown.

The same mechanism can be used not only for validation, but also for unpacking, which means extracting some parts of the data structure. This simply works by using variables in the inner signature:

sub head(*@ [$head, *@]) {
    $head;
}
sub tail(*@ [$, *@tail]) {
    @tail;
}
say head <a b c >;      # a
say tail <a b c >;      # b c

Here the outer parameter is anonymous (the @), though it's entirely possible to use variables for both the inner and the outer parameter.

The anonymous parameter can even be omitted, and you can write sub tail( [$, *@tail] ) directly.

Sub-signatures are not limited to arrays. For working on arbitrary objects, you surround them with parenthesis instead of brackets, and use named parameters inside:

multi key-type ($ (Numeric :$key, *%)) { "Number" }
multi key-type ($ (Str     :$key, *%)) { "String" }
for (42 => 'a', 'b' => 42) -> $pair {
    say key-type $pair;
}
# Output:
# Number
# String

This works because the => constructs a Pair, which has a key and a value attribute. The named parameter :$key in the sub-signature extracts the attribute key.

You can build quite impressive things with this feature, for example red-black tree balancing based on multi dispatch and signature unpacking. (More verbose explanation of the code.) Most use cases aren't this impressive, but still it is very useful to have occasionally. Like for this small evaluator.

[/perl-6] Permanent link

comments / trackbacks

Mon, 31 Dec 2012

iPod nano 5g on linux -- works!


Permanent link

For Christmas I got an iPod nano (5th generation). Since I use only Linux on my home computers, I searched the Internet for how well it is supported by Linux-based tools. The results looked bleak, but they were mostly from 2009.

Now (December 2012) on my Debian/Wheezy system, it just worked.

The iPod nano 5g presents itself as an ordinary USB storage device, which you can mount without problems. However simply copying files on it won't make the iPod show those files in the play lists, because there is some meta data stored on the device that must be updated too.

There are several user-space programs that allow you to import and export music from and to the iPod, and update those meta data files as necessary. The first one I tried, gtkpod 2.1.2, worked fine.

Other user-space programs reputed to work with the iPod are rhythmbox and amarok (which both not only organize but also play music).

Although I don't think anything really depends on some particular versions here (except that you need a new enough version of gtkpod), here is what I used:

  • Architecture: amd64
  • Linux: 3.2.0-4-amd64 #1 SMP Debian 3.2.35-2
  • Userland: Debian GNU/Linux "Wheezy" (currently "testing")
  • gtkpod: 2.1.2-1

[/misc] Permanent link

comments / trackbacks

Thu, 23 Aug 2012

Correctness in Computer Programs and Mathematical Proofs


Permanent link

While reading On Proof and Progress in Mathematics by Fields Medal winner Bill Thurston (recently deceased I was sorry to hear), I came across this gem:

The standard of correctness and completeness necessary to get a computer program to work at all is a couple of orders of magnitude higher than the mathematical community’s standard of valid proofs. Nonetheless, large computer programs, even when they have been very carefully written and very carefully tested, always seem to have bugs.

I noticed that mathematicians are often sloppy about the scope of their symbols. Sometimes they use the same symbol for two different meanings, and you have to guess from context which on is meant.

This kind of sloppiness generally doesn't have an impact on the validity of the ideas that are communicated, as long as it's still understandable to the reader.

I guess on reason is that most mathematical publications still stick to one-letter symbol names, and there aren't that many letters in the alphabets that are generally accepted for usage (Latin, Greek, a few letters from Hebrew). And in the programming world we snort derisively at FORTRAN 77 that limited variable names to a length of 6 characters.

[/misc] Permanent link

comments / trackbacks

Sun, 19 Aug 2012

Quo Vadis Perl?


Permanent link

The last two days we had a gathering in town named Perl (yes, a place with that name exists). It's a lovely little town next to the borders to France and Luxembourg, and our meeting was titled "Perl Reunification Summit".

Sadly I only managed to arrive in Perl on Friday late in the night, so I missed the first day. Still it was totally worth it.

We tried to answer the question of how to make the Perl 5 and the Perl 6 community converge on a social level. While we haven't found the one true answer to that, we did find that discussing the future together, both on a technical and on a social level, already brought us closer together.

It was quite a touching moment when Merijn "Tux" Brand explained that he was skeptic of Perl 6 before the summit, and now sees it as the future.

We also concluded that copying API design is a good way to converge on a technical level. For example Perl 6's IO subsystem is in desperate need of a cohesive design. However none of the Perl 6 specification and the Rakudo development team has much experience in that area, and copying from successful Perl 5 modules is a viable approach here. Path::Class and IO::All (excluding the crazy parts) were mentioned as targets worth looking at.

There is now also an IRC channel to continue our discussions -- join #p6p5 on irc.perl.org if you are interested.

We also discussed ways to bring parallel programming to both perls. I missed most of the discussion, but did hear that one approach is to make easier to send other processes some serialized objects, and thus distribute work among several cores.

Patrick Michaud gave a short ad-hoc presentation on implicit parallelism in Perl 6. There are several constructs where the language allows parallel execution, for example for Hyper operators, junctions and feeds (think of feeds as UNIX pipes, but ones that allow passing of objects and not just strings). Rakudo doesn't implement any of them in parallel right now, because the Parrot Virtual Machine does not provide the necessary primitives yet.

Besides the "official" program, everybody used the time in meat space to discuss their favorite projects with everybody else. For example I took some time to discuss the future of doc.perl6.org with Patrick and Gabor Szabgab, and the relation to perl6maven with the latter. The Rakudo team (which was nearly completely present) also discussed several topics, and I was happy to talk about the relation between Rakudo and Parrot with Reini Urban.

Prior to the summit my expectations were quite vague. That's why it's hard for me to tell if we achieved what we and the organizers wanted. Time will tell, and we want to summarize the result in six to nine months. But I am certain that many participants have changed some of their views in positive ways, and left the summit with a warm, fuzzy feeling.

I am very grateful to have been invited to such a meeting, and enjoyed it greatly. Our host and organizers, Liz and Wendy, took care of all of our needs -- travel, food, drinks, space, wifi, accommodation, more food, entertainment, food for thought, you name it. Thank you very much!

Update: Follow the #p6p5 hash tag on twitter if you want to read more, I'm sure other participants will blog too.

Other blogs posts on this topic: PRS2012 – Perl5-Perl6 Reunification Summit by mdk and post-yapc by theorbtwo

[/perl-6] Permanent link

comments / trackbacks

Tue, 17 Jul 2012

Stop The Rewrites!


Permanent link

What follows is a rant. If you're not in the mood to read a rant right now, please stop and come back in an hour or two.

The Internet is full of people who know better than you how to manage your open source project, even if they only know some bits and pieces about it. News at 11.

But there is one particular instance of that advice that I hear often applied to Rakudo Perl 6: Stop the rewrites.

To be honest, I can fully understand the sentiment behind that advice. People see that it has taken us several years to get where we are now, and in their opinion, that's too long. And now we shouldn't waste our time with rewrites, but get the darn thing running already!

But Software development simply doesn't work that way. Especially not if your target is moving, as is Perl 6. (Ok, Perl 6 isn't moving that much anymore, but there are still areas we don't understand very well, so our current understanding of Perl 6 is a moving target).

At some point or another, you realize that with your current design, you can only pile workaround on top of workaround, and hope that the whole thing never collapses.

Picture of
a Jenga tower
Image courtesy of sermoa

Those people who spread the good advice to never do any major rewrites again, they never address what you should do when you face such a situation. Build the tower of workarounds even higher, and pray to Cthulhu that you can build it robust enough to support a whole stack of third-party modules?

Curiously this piece of advice occasionally comes from people who otherwise know a thing or two about software development methodology.

I should also add that since the famous "nom" switchover, which admittedly caused lots of fallout, we had three major rewrites of subsystems (longest-token matching of alternative, bounded serialization and qbootstrap), All three of which caused no new test failures, and two of which caused no fallout from the module ecosystem at all. In return, we have much faster startup (factor 3 to 4 faster) and a much more correct regex engine.

[/perl-6] Permanent link

comments / trackbacks