Fri, 17 Jul 2009

Rakudo architectural overview



The other day Su-Shee asked on IRC if I could tell her which components of Rakudo are written in which programming language. So here it is:

[Figure: Rakudo flow chart]

The source code enters at the top of the machine named Rakudo and is transformed in several stages. The first two, the parser and the action methods, are part of Rakudo itself and are hosted in the Rakudo repository. They are written in two different subsets of Perl 6: regexes (the parser) and "Not Quite Perl 6", or NQP for short (the action methods).
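To make the split between parser and action methods a bit more concrete, here is a minimal sketch in Python (not Perl 6 or NQP). It assumes a toy language of integers, + and *. The names Actions, Parser and parse are invented for this illustration and are not part of Rakudo. The parser only recognizes structure; every time a rule matches, it hands the pieces to an action method, which decides what tree node to build.

```python
import re

# One token per match: either a run of digits or any single character.
TOKEN = re.compile(r"\s*(?:(\d+)|(.))")

def tokenize(src):
    """Split source text into ('int', value) and ('op', char) tokens."""
    tokens = []
    for number, op in TOKEN.findall(src):
        if number:
            tokens.append(("int", int(number)))
        elif op.strip():  # skip stray whitespace
            tokens.append(("op", op))
    return tokens

class Actions:
    """Hypothetical action methods: turn rule matches into tree nodes."""
    def integer(self, value):
        return ("int", value)

    def binary(self, name, left, right):
        return ("call", name, left, right)

class Parser:
    """Recursive-descent parser that delegates tree building to actions."""
    def __init__(self, tokens, actions):
        self.tokens = tokens
        self.pos = 0
        self.actions = actions

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)

    def expression(self):
        node = self.term()
        while self.peek() == ("op", "+"):
            self.pos += 1
            node = self.actions.binary("add", node, self.term())
        return node

    def term(self):
        node = self.atom()
        while self.peek() == ("op", "*"):
            self.pos += 1
            node = self.actions.binary("mul", node, self.atom())
        return node

    def atom(self):
        kind, value = self.peek()
        if kind != "int":
            raise SyntaxError(f"expected integer at token {self.pos}")
        self.pos += 1
        return self.actions.integer(value)

def parse(src):
    """Parse src and return the tree built by the action methods."""
    parser = Parser(tokenize(src), Actions())
    tree = parser.expression()
    if parser.pos != len(parser.tokens):
        raise SyntaxError("trailing input")
    return tree

print(parse("1 + 2 * 3"))
```

Swapping in a different Actions class would produce a different tree from the same parser, which is roughly why the two pieces can be written in different languages.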

The next two stages (the PAST and POST compilers) are part of the so-called "Parrot Compiler Toolkit", or PCT for short. Both PAST and POST are structural representations of the program, with PAST being more high-level than POST. Both compilers are written in PIR, the parrot assembly language, and are distributed along with parrot. They are also used by many other parrot-based languages.

The POST compiler emits PIR, which IMCC transforms into byte code. IMCC is parrot's PIR compiler, written in C and statically linked into parrot. The byte code (PBC) can then be stored to disk, or executed in memory by a so-called run core or run loop, which is in some sense the heart of parrot - or one of the hearts, because there are several different ones available (one for just-in-time compilation, or JIT, one for debugging, and so on).
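The later stages can be sketched in the same spirit. The following Python toy is only an analogy and uses none of parrot's real APIs: lower stands in for the PAST-to-POST step, emit produces a made-up textual assembly (it is not valid PIR), assemble plays the role of IMCC, and run is a very small run loop. All function names, the register names r0, r1, ... and the tuple-based tree format are assumptions made for this example.

```python
def lower(node, ops):
    """Flatten a high-level tree into register operations.

    Appends to ops and returns the name of the register holding the result.
    """
    if node[0] == "int":
        reg = f"r{len(ops)}"
        ops.append(("set", reg, node[1]))
        return reg
    _, name, left, right = node
    a = lower(left, ops)
    b = lower(right, ops)
    reg = f"r{len(ops)}"
    ops.append((name, reg, a, b))
    return reg

def emit(ops):
    """Render the flat operations as text, one instruction per line."""
    lines = []
    for op in ops:
        if op[0] == "set":
            lines.append(f"{op[1]} = {op[2]}")
        else:
            lines.append(f"{op[1]} = {op[0]} {op[2]}, {op[3]}")
    return "\n".join(lines)

def assemble(text):
    """Turn the textual form back into tuples, our stand-in for byte code."""
    program = []
    for line in text.splitlines():
        target, expr = line.split(" = ")
        parts = expr.replace(",", "").split()
        if len(parts) == 1:
            program.append(("set", target, int(parts[0])))
        else:
            program.append((parts[0], target, parts[1], parts[2]))
    return program

def run(program):
    """A minimal run loop: execute each instruction, return the last result."""
    registers = {}
    result = None
    for opcode, target, *args in program:
        if opcode == "set":
            registers[target] = args[0]
        elif opcode == "add":
            registers[target] = registers[args[0]] + registers[args[1]]
        elif opcode == "mul":
            registers[target] = registers[args[0]] * registers[args[1]]
        else:
            raise ValueError(f"unknown opcode {opcode}")
        result = registers[target]
    return result

# High-level tree for 1 + 2 * 3, written out by hand.
tree = ("call", "add", ("int", 1), ("call", "mul", ("int", 2), ("int", 3)))
ops = []
lower(tree, ops)
text = emit(ops)
print(text)
print(run(assemble(text)))
```

Because the assembled program is just data, it could be written to disk and loaded later, or handed to a different run function (say, one that prints every instruction for debugging) without touching the earlier stages.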

There are also some supporting custom types and operations in Rakudo, called dynamic PMCs and dynamic ops, which are written in C, as well as helper functions written in other languages (namely NQP and PIR). Those do not show up in the flow chart.

The part of Rakudo described so far is the stage one compiler. In the build process it is compiled first, and then it compiles the setting library down to PBC. "Setting library" is a fancy term for the built-in functions that are written in Perl 6. The result of this compilation (together with a low-level runtime library in PIR) is linked together with the stage one compiler and parrot; the result is the perl6 executable.

Glossary

PGE
Parrot Grammar Engine, parrot's grammar engine for Perl 6 regexes and grammars.
NQP
Not Quite Perl 6, a small subset of Perl 6 that is used for tree transformations in compilers.
PAST
Parrot Abstract Syntax Tree, an in-memory representation of structures common to many programming languages (like variable declarations, branches, loops, subroutine calls).
POST
Parrot Opcode Syntax Tree, an in-memory low level representation of programs.
PCT
Parrot Compiler Toolkit, a collection of tools and compilers useful for writing other compilers.
PIR
Parrot Intermediate Representation, the most commonly used form of parrot assembly (which is still high-level enough to be written by humans).
IMCC
InterMediate Code Compiler, the part of parrot that compiles PIR into byte code.
PBC
Parrot Byte Code, the binary form to which all parrot programs are compiled in the end.
