This section introduces a self-paced exercise to extend the language with structs. The subsections only describe the requirements and the theoretical background needed for each step; figuring out the implementation details is up to the reader. Imagine that this is a real-world scenario of working as a compiler engineer and getting a task to extend the language based on certain criteria.
One possible solution to this exercise is already part of the compiler and can be found on GitHub.
The motivation behind structs is to pack values together and make it more convenient to pass them around simultaneously, as well as to make the source code more expressive.
fn rgbToGray(r: number, g: number, b: number): number {
return (r + g + b) / 3;
}
Maintaining 3 different variables to store each individual color channel value can become overwhelming quickly and also affects readability and maintainability. Expressing the same logic with structs makes it more readable and maintainable.
struct RGBColor {
r: number,
g: number,
b: number,
}
fn rgbToGray(c: RGBColor): number {
return (c.r + c.g + c.b) / 3;
}
Whether extending the compiler from the lexer to code generation seems overwhelming at first or looks like a walk in the park, doing a small task to get used to working on the codebase is always a good idea.
In the current version of the language, parameters are passed by value, can only be of number type, and are always immutable. With struct parameters however, there can be use cases when only one or two fields of the struct need to be changed within the body of the function. With immutable parameters, the only way to achieve this is to duplicate the struct into a local variable and then change the fields of this variable.
fn removeGreenChannel(c: RGBColor): RGBColor {
var greenRemoved = c;
greenRemoved.g = 0;
return greenRemoved;
}
Design Note
The question might arise why structs are passed by value, as in C or Rust, instead of passing only a reference to the struct, as in Kotlin or Swift.
Every object has a lifetime, so every piece of memory that was once allocated must be freed eventually. In the worst-case scenario the memory is freed when the program exits, unless the program is designed to run indefinitely, like a web server. In that case, if unused memory is not freed somehow, the program can quickly run out of memory.
With stack-allocated variables managing lifetimes is straightforward. When the function returns and the stack space is destroyed, the object is considered to be freed.
fn returnStruct(): S { let s = S { ... }; return s; }
Imagine that structs are moved around by the reference pointing to them. That means, in this example, a reference to s is returned from the function. However, if s is on the stack, it is freed after the function returns. Where does the reference point then? It points to the piece of memory that used to store s but is now freed, so the reference is dangling, pointing to an unknown piece of memory. Moving structs around by value avoids this issue automatically.
But what about those languages that have been mentioned to work with references? How can they do that? To understand this, 2 questions need to be explored first: how the memory is allocated and how it is freed.
The first one is straightforward: in these languages the memory for the struct is allocated on the heap upon instantiation. Allocating it on the stack would result in the dangling reference problem discussed previously.
For the second question, different languages propose different solutions. In C, heap-allocated objects are freed manually by calling the free() function. However, this requires the developer to keep track of which block of memory needs to be freed and which has already been freed, so there are lots of possibilities to make a mistake at some point. To reduce the risk of memory corruption, other languages such as Kotlin or Swift use solutions that automatically keep track of which memory blocks need to be freed and which need to be kept alive.
Kotlin runs on the JVM, which uses tracing garbage collection (GC), where a utility known as the garbage collector keeps track of which pieces of memory can be accessed through the available references at a specific point of the program, starting from a so-called root object. When a reference goes out of scope and no other accessible references point to the chunk of memory in question, it is considered unreachable; otherwise, it is reachable. The garbage collector periodically scans all the allocated memory blocks and frees those that are unreachable.
fun wrapper(a: A) {
    var b = B()
    b = B()
    a.b = b
}

fun main() {
    val a = A()
    wrapper(a)
}

1.) an A object is created: RootObj references A@7
2.) wrapper is called and a B object is created: b references B@2
3.) the same reference is pointed to another B object: b now references B@3, so B@2 becomes unreachable
4.) a reference in A@7 is pointed to B@3: a.b now also references B@3
5.) wrapper returns and the reference to B@3 goes out of scope: B@3 remains reachable through A@7
6.) execution halts and the GC marks the reachable nodes: RootObj, A@7 and B@3 are marked
7.) any unmarked node is freed: B@2 is freed
8.) the execution continues
Swift uses a different technique called automatic reference counting (ARC). In this case, the objects themselves have a counter attached to them, which tracks how many reachable references point to the object. When a reference goes out of scope, the counter is decremented, and if the reference count reaches 0, the object is freed.
func wrapper(a: A) {
    let b = a
}

func callee() {
    let a = A()
    wrapper(a: a)
}

callee()

1.) callee is entered and an A class is instantiated: A@5's reference count is 1
2.) wrapper is entered and a local reference is created: the count is incremented to 2
3.) wrapper returns and the local reference goes out of scope: the count is decremented to 1
4.) callee returns and the reference count reaches zero
5.) the object is freed
Both GC and ARC execute additional code at runtime, so they impose some performance overhead and require additional logic to be emitted by the compiler during code generation. And of course, they are also a lot more complex than these oversimplified examples make them seem.
Because the struct is passed by value, no modification to it would be visible to the caller anyway, so copying it into an additional local variable doesn't make sense. To resolve this inconvenience, it's time to introduce support for mutable parameters, which are marked with the var keyword. Using var to denote parameter mutability keeps the language consistent with local variable declarations, where var is also used to express mutability.
fn removeGreenChannel(var c: RGBColor): RGBColor {
c.g = 0;
return c;
}
Design Note
The var parameter in its current form also used to be part of Kotlin and Swift in their early days, but in both languages the designers eventually decided to remove it. On the other hand, Rust still has the same concept, except that mut is written instead of var.
fn rust_mutable_param(mut c: RGBColor) { ... }
The key difference between the mentioned languages is how object parameters are passed to functions. In Rust the instance of the struct is passed by value, but in the case of Kotlin and Swift classes, it's the reference to the instance that gets passed by value, like passing a pointer to the instance in C/C++. Swift is not so straightforward though, as it also supports structs, not just classes, and structs are passed by value like in Rust.
Anyway, in the case of Swift, var parameters caused confusion because developers expected that by making the parameter mutable, the mutation would be visible outside the function, but Swift had the inout keyword for that, and var only allowed the mutation to be visible within the context of the function. The full proposal for removing var parameters is available on GitHub.
Apparently in Kotlin, var caused the same confusion that it did in Swift. In Kotlin it's always the reference to the object that is passed by value, and by making the reference mutable and assigning a new value to it, one might think that it isn't the local copy of the reference being pointed at a new object, but the object pointed to by the reference being replaced with a new one, so the new value would be visible outside the function. For more details see the blog post about the change.
In both Rust and your language, the instances of structs are passed by value, not by references pointing to them, so the mutability of the parameter doesn't suggest that the change is visible in the caller too.
To implement this simple but convenient change, proceed with the following few steps.
Update the parser so that it accepts the following paramDecl grammar extension.
<paramDecl>
::= 'var'? <identifier> ':' <type>
Don't forget to store whether the parameter is mutable or not in the AST, similar to VarDecl, as the semantic analyzer needs this information to ensure that only the allowed operations are performed on the value.
In the semantic analyzer, ensure that the parameter declaration is resolved properly and modifying an immutable parameter still results in an error, but changing a mutable parameter is allowed. The parameter of the built-in println should remain immutable.
fn paramSema(x: number, var y: number): void {
y = 0;
x = 1;
^ error: 'x' cannot be mutated
}
The mutation of the value is checked in the one and only dataflow analysis pass, which expects the LHS of an assignment to be a ResolvedVarDecl; however, this is no longer the case. To make modifying this pass and later the code generation easier, lift the isMutable field up from ResolvedVarDecl to the ResolvedDecl base class.
struct ResolvedDecl {
...
bool isMutable;
...
};
This property should automatically be set to false for every resolved declaration except for variables and parameters, for which its value should be deduced from the AST. Note that this change also automatically forwards the mutability of the parameter to Codegen, where it can be used for more efficient LLVM IR generation.
Remember to update the dataflow analysis pass, so that lattices also store the now updated ResolvedDecl instead of the previous ResolvedVarDecl.
using Lattice = std::map<const ResolvedDecl *, State>;
During LLVM IR generation, in the body of the function, a local variable is introduced for each parameter, and their values are copied into these variables. When the value of the parameter is read, LLVM loads the corresponding local variable and uses its value for further computation.
define void @varParam(double %n) {
entry:
%n1 = alloca double, align 8
store double %n, double* %n1, align 8
%0 = load double, double* %n1, align 8
call void @println(double %0)
ret void
}
Notice how the function in the example is called varParam(). This is not a coincidence, because in truth this local variable only needs to be introduced if the parameter is mutable, as its sole purpose is to keep the changes to the parameter visible inside the given function only.
The language is designed in such a way that the semantic analyzer can ensure that every piece of code modifying an immutable parameter is rejected before code generation. This allows Codegen to take advantage of this information and generate more optimized LLVM IR by omitting the local variable creation for immutable parameters.
define void @nonVarParam(double %n) {
entry:
call void @println(double %n)
ret void
}
Design Note
When taking a look at how Clang generates code for C++ functions, it can be observed that the local variable is always present, even in cases when the parameter is of const type.
There are mainly two reasons for this. The first is that constness can be cast away, so the parameter can still be mutated, and the second is the & operator.
void immutableParameter(const int x) { *const_cast<int *>(&x) = 2; }
The & operator gets the address of the parameter, but to do that, it must first allocate memory for it, as returning the memory address of a value in a register is not possible.
The same operator is also present in Rust, but the Rust compiler only allocates the variable if it is used. In a function where the parameter is not borrowed, it's not copied into a local variable either.
Update Codegen so that a local variable is no longer introduced for immutable parameters. Also, ensure that when a ResolvedDeclRefExpr points to such a parameter, its value is no longer loaded. The generated LLVM IR for mutable parameters should remain unchanged.
The first bigger step towards supporting structs in the language is correctly recognizing their syntax, which means teaching it to the parser. This section describes how the syntax was designed and explains every step to take to end up with the parser correctly recognizing it.
Notice how the example in the Motivation section already spoiled all the new tokens that the lexer must recognize. These are the struct keyword and the . when a field is accessed. Let's update the lexer so that it produces TokenKind::KwStruct and TokenKind::Dot when it sees them.
The introduction of structs changes the top level of the source file from a list of function declarations to a list of function and struct declarations.
<sourceFile>
::= (<structDecl> | <functionDecl>)* EOF
To be able to see the changes and visualize the AST dump as soon as possible, make sure the parser returns a list of Decl instead of the current list of FunctionDecl as the AST. This makes the -ast-dump flag work and print out the full AST for a better debugging experience.
std::pair<std::vector<std::unique_ptr<Decl>>, bool> parseSourceFile();
The AST should preserve the declaration order in the source code, so if a struct is declared before a function in the source code, it should also precede it in the AST.
Note that after changing the type of the returned AST in the parser, the driver will not compile, because the next step in the compilation pipeline still expects a list of FunctionDecl. To make the compiler still produce an executable from every already supported source code, introduce a temporary adapter in the driver that extracts the function declarations from the AST and only passes those further down the pipeline. This adapter will be removed at the beginning of the next section.
The first new element to recognize is the struct declaration, which, as mentioned, appears on the top level and only on the top level, so nested struct declarations are not allowed. It starts with the struct keyword followed by a {} enclosed list of zero or more field declarations.
struct S { ... }
A field declaration is a : separated pair of an identifier and a type. In the body of a struct these declarations are separated by commas, with an optional trailing comma at the end.
struct S {
field: type,
field: type,
field: type,
}
Because the trailing comma is optional, the struct can also be declared conveniently on a single line.
struct S { field: type, field: type, field: type }
This syntax leads to the following extension of the grammar.
<structDecl>
::= 'struct' <identifier> <fieldList>
<fieldList>
::= '{' (<fieldDecl> (',' <fieldDecl>)* ','?)? '}'
<fieldDecl>
::= <identifier> ':' <type>
Update the parser so that it recognizes this grammar extension and produces the following AST nodes. Note that the constructors have been omitted from the example, but they need to be present in the source code.
struct FieldDecl : public Decl {
Type type;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level) << "FieldDecl: "
<< identifier << ':' << type.name
<< '\n';
}
};
struct StructDecl : public Decl {
std::vector<std::unique_ptr<FieldDecl>> fields;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level) << "StructDecl: "
<< identifier << '\n';
for (auto &&field : fields)
field->dump(level + 1);
}
};
The syntax of instantiating a struct is almost identical to declaring it, except for the missing struct keyword in front of the identifier, and that in the {} enclosed list of fields, the : is not followed by a type but by an expression that sets the value of that field, thus ensuring that every field is initialized and making it clear which field has been initialized to what value.
S {
field: expr,
field: expr,
field: expr,
}
The struct instantiation can also be written on a single line conveniently because of the optional trailing comma at the end of the field initializer list.
S { field: expr, field: expr, field: expr }
Instantiating a struct is considered a primary expression, so the extended grammar looks like the one below.
<primaryExpression>
::= ...
| <structInstantiation>
| <declRefExpr>
| ...
<structInstantiation>
::= <identifier> <fieldInitList>
<fieldInitList>
::= '{' (<fieldInit> (',' <fieldInit>)* ','?)? '}'
<fieldInit>
::= <identifier> ':' <expr>
Notice how structInstantiation is listed before declRefExpr. Each of these expressions starts with an identifier, but a structInstantiation has priority over a declRefExpr, which ensures that if an expression can be parsed as a struct instantiation, it is always parsed as one, instead of parsing a declRefExpr and erroring out on the { token.
Error { ... };
^
error: expected ';' at the end of expression
Update the parser so that it correctly recognizes the new primary expression and produces the following AST nodes. The constructors are once again omitted.
struct FieldInitStmt : public Stmt {
std::string identifier;
std::unique_ptr<Expr> initializer;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level)
<< "FieldInitStmt: " << identifier
<< '\n';
initializer->dump(level + 1);
}
};
struct StructInstantiationExpr : public Expr {
std::string identifier;
std::vector<std::unique_ptr<FieldInitStmt>>
fieldInitializers;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level)
<< "StructInstantiationExpr: "
<< identifier << '\n';
for (auto &&field : fieldInitializers)
field->dump(level + 1);
}
};
Notice how the struct instantiation syntax introduces an ambiguity in the condition of an if or while statement.
if condition { ... }
Because of the grammar modification, this would be parsed as an if followed by a structInstantiation with a missing body.
structInstantiation
┌───────┴───────┐
if condition { ... } ...
^ error: expected 'if' body
To resolve this ambiguity, a structInstantiation is not allowed to be placed inside a condition, and the parser should ensure that the above example is always parsed as a declRefExpr followed by a block.
IfStmt
DeclRefExpr: condition
Block
If for any reason a struct needs to be instantiated in a condition, it can be done by wrapping it in a grouping expression, which makes it clear where the condition ends and the body of the statement starts.
if (identifier {}) {}
Implement logic similar to the Rust parser's, which makes it possible to apply certain restrictions while an AST node is parsed.
When an if or while condition is parsed, apply a restriction that tells the parser to ignore struct instantiations. Upon entering a grouping expression, every restriction that is currently applied should be cleared until the wrapped expression is consumed.
Statement: if (condition {}) {}
| parseStmt() [NoRestrictions]
| parseIfStmt()
| parseExpr() [StructNotAllowed]
| ...
| parseGroupingExpr()
| parseExpr() [NoRestrictions]
| ...
| parseExpr()
| parseGroupingExpr() [StructNotAllowed]
| ...
| parseExpr()
| parseBlock() [NoRestrictions]
| ...
After declaring and instantiating structs, the final syntactical extension makes it possible to access their fields.
Since a struct can have fields that are also structs, any arbitrarily deep field can be read or written. A struct can also be returned from a function and reading or assigning the fields of the returned struct is also allowed, which means all the following expressions should be accepted.
s.f1.f2.f3
foo().f1.f2
On the other hand, a struct cannot have function fields or methods, so calling an accessed field is not possible.
not.possible()
The restriction that a call expression, if present, must be the first expression in the chain can be enforced in the parser by modifying postfixExpression as shown below.
<postfixExpression>
::= <primaryExpression> <argumentList>? <memberExpr>*
<memberExpr>
::= '.' <identifier>
Update the parser so that it correctly consumes member expressions and creates the following node for them. Like in the previous examples, the constructors are still omitted.
struct MemberExpr : public Expr {
std::unique_ptr<Expr> base;
std::string member;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level) << "MemberExpr: ."
<< member << '\n';
base->dump(level + 1);
}
};
The previous point mentioned that a MemberExpr can also be used to assign a value to a field; however, an assignment statement is still restricted so that it only accepts a DeclRefExpr on its left-hand side, thus getting the following code rejected.
s.f = 1;
Making a field assignable while still maintaining a strict restriction on the LHS of the assignment can be achieved with another simple grammar modification.
<assignment>
::= (<declRefExpr> | <memberExpr>) '=' <expr> ';'
To model the grammar change in the AST, introduce an AssignableExpr placeholder node and make it the base class of DeclRefExpr and MemberExpr. The Assignment node should also store this node instead of a DeclRefExpr.
struct AssignableExpr : public Expr { ... };
struct DeclRefExpr : public AssignableExpr { ... };
struct MemberExpr : public AssignableExpr { ... };
struct Assignment : public Stmt {
std::unique_ptr<AssignableExpr> assignee;
...
};
Update the parser so that it also accepts a MemberExpr on the LHS of an assignment. Remember that the semantic analyzer still expects the LHS to be a DeclRefExpr, which will be fixed in the next section. Until then, introduce a temporary workaround in Sema::resolveAssignment() to keep the compiler functioning.
Tweak the error recovery mechanism so that synchronization happens on the struct keyword too. The synchronizeOn() utility should also be updated to accept a list of tokens instead of a single token only, as on the top level both struct and fn are synchronization points, and synchronization should happen on the keyword that is encountered first.
fail struct ... fn ...
┌─────┐ ┌────────┐ ┌─────┐ ┌────┐ ┌─────┐
│ ... │ │ struct │ │ ... │ │ fn │ │ ... │
└─────┘ └────────┘ └─────┘ └────┘ └─────┘
^ ^ ^
| └ synchronize here └ don't skip to here
└ fail here
If a failure happens inside a struct instantiation, the parser should manually synchronize on the next } token instead of waiting for an automatic synchronization on the block level, as the parser might think the } is the end of the statement; however, that is not always the case.
S { f1: fail, f2: 0 }
┌─────┐ ┌───┐ ┌────┐ ┌───┐ ┌───┐ ┌───┐
│ ... │ │ , │ │ f2 │ │ : │ │ 0 │ │ } │
└─────┘ └───┘ └────┘ └───┘ └───┘ └───┘
^ ^
└ fail here └ sync here
After structs are recognized by the compiler, the next step is to ensure that they are used correctly. This section explains how to extend the semantic analyzer to detect the invalid use of structs and ensure there are no issues in the program.
The parser now returns a list of Decl as the AST, which is passed to the semantic analyzer through an adapter in the driver. Remove this adapter from the driver and update Sema so that it takes a list of Decl and returns a list of ResolvedDecl as the new resolved tree.
class Sema {
...
public:
explicit Sema(std::vector<std::unique_ptr<Decl>> ast)
: ast(std::move(ast)) {}
std::vector<std::unique_ptr<ResolvedDecl>> resolveAST();
};
Ensure that the -cfg-dump flag still works as expected. Only functions can have a control flow graph, so the driver should only run CFGBuilder on a ResolvedFunctionDecl.
Also, Codegen still expects a list of ResolvedFunctionDecl as the resolved tree. To keep it working, introduce another temporary adapter, similar to the just removed one, between Sema and Codegen.
Extend Type with an additional structType. The kind field should store a new Struct kind and the name field should store the identifier of the struct.
static Type structType(const std::string &id) { return {Kind::Struct, id}; }
Two struct types are considered equal if their identifiers are equal. Ensure in the semantic analyzer that types are compared by their name field instead of their kind field, where applicable.
Structs are declared on the top level and are location independent, which means that a struct doesn't have to be declared before it can be used. Due to this property, they are also resolved in multiple passes just like functions.
They also need to be resolved before function declarations because a function can have a struct parameter, or it can return a struct and in these cases, the compiler needs to make sure that the struct is declared in the source code.
fn returnS(): S { ... }
^ 'S' must be found when resolving
the function declaration
struct S { ... }
Implement the first pass of struct resolution that results in the following nodes.
struct ResolvedFieldDecl : public ResolvedDecl {
unsigned index;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level)
<< "ResolvedFieldDecl: @("
<< this << ") " << identifier
<< '\n';
}
};
struct ResolvedStructDecl : public ResolvedDecl {
std::vector<std::unique_ptr<ResolvedFieldDecl>>
fields;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level)
<< "ResolvedStructDecl: @("
<< this << ") " << identifier
<< ':' << '\n';
for (auto &&field : fields)
field->dump(level + 1);
}
};
Notice how ResolvedFieldDecl also stores its own index within the struct. This will be important during code generation.
This pass should only check that each field declaration is unique and insert the structs into the symbol table. Don't resolve the types of the field declarations yet.
struct S {
f: number,
f: number,
^ error: field 'f' is already declared
}
To make code generation easier, ensure that in the resolved tree every ResolvedStructDecl precedes any ResolvedFunctionDecl. In other words, the resolved tree should have the following structure.
ResolvedStructDecl
...
ResolvedStructDecl
ResolvedFunctionDecl
...
ResolvedFunctionDecl
To resolve a Type::Kind::Custom type to a Type::Kind::Struct type, the semantic analyzer needs to check whether the identifier of the custom type references a known ResolvedStructDecl.
The current lookup logic only supports looking up a ResolvedDecl based on its identifier, which can lead to cases when the wrong declaration is found.
let S = 1;
let x: S;
^ here it needs to be checked if this is an
existing struct, but looking up 'S' by
its identifier returns the variable above
Update the symbol lookup logic, so that the type of the searched declaration can also be specified and instead of returning the first declaration with the same name, it returns the first declaration with a matching type as well.
template <typename T>
std::pair<T *, int> Sema::lookupDecl(const std::string id) { ... }
Use the updated lookup logic to resolve a Type::Kind::Custom. This change should ensure that the following snippet is compiled properly.
struct S {}
fn main(): void {
let S = 1;
let x: S;
^ struct
println(S);
^ local variable
}
With the modified declaration lookup logic, the bodies of the struct declarations can also be resolved. The second pass of StructDecl resolution should happen after the first pass of function resolution.
Structs are stored on the stack, passed to and returned from functions by value. To allocate a struct, its size must be known at compile time. Because a struct is a list of values grouped together, its size is equivalent to the sum of the sizes of its fields.
struct Color { Stack:
r: number, r: size(number) ┐
g: number, g: size(number) ├ (+) size(Color)
b: number, b: size(number) ┘
}
A struct cannot have a field of its own type, as in that case, to compute the size of the struct, the compiler would already have to know it, but to know it, it would have to compute it first.
struct S {
self: S size(S) = size(self) = size(S) = ...
}
The semantic analyzer also has to catch the case, when the struct contains itself through one or more struct fields.
struct S { x: S2 }
struct S2 { y: S }
size(S) = size(x) = size(S2) = size(y) = size(S) = ...
A struct cannot have void fields either, as there is no way to create a value of void type.
struct VoidField {
f: void,
^ error: field cannot be void
}
Implement logic that resolves the fields of the structs and detects void fields and self-containing structs.
struct S { s2: S2 }
^ error: struct 'S' contains itself
struct S2 { s3: S3 }
^ error: struct 'S2' contains itself
struct S3 { s: S }
^ error: struct 'S3' contains itself
If a field type cannot be resolved, it should raise an error.
struct S {
e: Error
^ error: unable to resolve 'Error' type of field
}
A StructInstantiationExpr, as its name suggests, is only allowed to instantiate a struct. Instantiating anything else should result in an error.
let S = 1;
S { ... };
^ error: 'S' is not a struct type
Undefined { ... };
^ error: 'Undefined' is not a struct type
Upon instantiating a struct, every field must be initialized, but only declared fields can be initialized. Also, the same field cannot be initialized more than once.
struct Color {
r: number,
g: number,
b: number,
}
fn main(): void {
Color {
^ error: field 'b' is not initialized
r: 255,
r: 255,
^ error: field 'r' is already initialized
g: 255,
a: 0,
^ error: 'Color' has no field named 'a'
};
}
The type of the initializer must match the type of the field.
struct Wrapper { c: Color }
fn main(): void {
Wrapper {
^ error: field 'c' is not initialized
c: 0
^ error: 'number' cannot be used to
initialize a field of type 'Color'
};
}
Last but not least, a ResolvedStructDecl cannot be referenced on its own. In such cases, it's safe to assume the developer forgot to instantiate it.
let s = Color;
^ error: expected an instance of 'Color'
Update the semantic analyzer so that all of the mentioned use cases are recognized and their correctness is checked. The resolved nodes to produce, with their constructors omitted, can be found below.
struct ResolvedFieldInitStmt : public ResolvedStmt {
const ResolvedFieldDecl *field;
std::unique_ptr<ResolvedExpr> initializer;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level)
<< "ResolvedFieldInitStmt: @("
<< field << ") " << field->identifier
<< '\n';
initializer->dump(level + 1);
}
};
struct ResolvedStructInstantiationExpr : public ResolvedExpr {
const ResolvedStructDecl *structDecl;
std::vector<std::unique_ptr<ResolvedFieldInitStmt>>
fieldInitializers;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level)
<< "ResolvedStructInstantiationExpr: @("
<< structDecl << ')' << '\n';
for (auto &&fieldInit : fieldInitializers)
fieldInit->dump(level + 1);
}
};
The order of the field initializers should be the same as they are written in the source code. Also, don't forget to run the constant expression evaluator on the initializer expressions.
ResolvedStructInstantiationExpr: @(...)
ResolvedFieldInitStmt: @(...) r
ResolvedNumberLiteral: '1'
| value: 1
ResolvedFieldInitStmt: @(...) g
ResolvedNumberLiteral: '2'
| value: 2
ResolvedFieldInitStmt: @(...) b
ResolvedNumberLiteral: '3'
| value: 3
The semantics of a MemberExpr are straightforward: it is only allowed to access the existing fields of a struct instance.
fn main(): void {
let n = 0;
n.x;
^ error: cannot access field of 'number'
let c = Color { r: 1, g: 2, b: 3 };
c.y;
^ error: 'Color' has no field called 'y'
}
To make the semantic analyzer fully support the new AST nodes, make sure this expression is also resolved to the node below.
struct ResolvedMemberExpr : public ResolvedAssignableExpr {
std::unique_ptr<ResolvedExpr> base;
const ResolvedFieldDecl *field;
...
void dump(size_t level = 0) const override {
std::cerr << indent(level)
<< "ResolvedMemberExpr: @("
<< field << ") "
<< field->identifier << '\n';
base->dump(level + 1);
}
};
In the AST there is a restriction on what kind of expressions can be found on the LHS of an assignment. Propagate this restriction to the resolved tree too by using the corresponding resolved tree nodes and updating Sema so that these nodes are produced correctly.
struct ResolvedAssignableExpr : public ResolvedExpr { ... };
struct ResolvedDeclRefExpr : public ResolvedAssignableExpr { ... };
struct ResolvedMemberExpr : public ResolvedAssignableExpr { ... };
struct ResolvedAssignment : public ResolvedStmt {
std::unique_ptr<ResolvedAssignableExpr> assignee;
...
};
Similar to how this change modified the API of the AST in the parser, it modifies the API of the resolved tree here, which affects Codegen directly. To keep the resolved tree compatible with the current code generation logic, introduce a workaround similar to the one added to Sema in the previous section, but this time in Codegen::generateAssignment().
Because a ResolvedMemberExpr can appear on the LHS of an assignment, the statement can now have side effects on both sides. To fix an order of evaluation, the RHS of the assignment is evaluated before the LHS. Make sure this is also reflected in the resolution process: resolve the RHS before the LHS.
Design Note
The evaluation order of assignments is also something that differs from language to language. Kotlin evaluates the LHS first, while in C++ it's always the RHS first. Finally, there is C, which doesn't specify an order; it can be either of the two possibilities.
Still, explicitly specifying an evaluation order lets the developers know in what order they should expect side effects if there are any.
The ResolvedMemberExpr on the LHS should also be part of the CFG, but only this node; the ResolvedDeclRefExpr still shouldn't be added. The member expression can be applied to any primary expression, so make sure to add the ResolvedStructInstantiationExpr to the CFG too.
fn foo(): void {
S{ x: 2 }.x = 1;
let n: number;
n = 3;
}
foo:
[2 (entry)]
preds:
succs: 1
[1]
preds: 2
succs: 0
ResolvedNumberLiteral: 1
ResolvedNumberLiteral: 2
ResolvedStructInstantiationExpr: S{ x: 2 }
ResolvedMemberExpr: S{ x: 2 }.x
ResolvedAssignment: S{ x: 2 }.x = 1
ResolvedDeclStmt: n
ResolvedNumberLiteral: 3
ResolvedAssignment: n = 3
[0 (exit)]
preds: 1
succs:
During variable initialization analysis, a struct is only considered initialized if all of its members have been initialized in the same statement. Memberwise initialization is not supported.
fn structInitAnalysis(): void {
let initialized = S { x: 1 };
initialized.x;
let lateInit: S;
lateInit = S { x: 1 };
lateInit.x;
let uninit: S;
uninit.x = 1;
^ error: 'uninit' is not initialized
}
The mutability of a field is defined by the mutability of the base declaration. Temporary structs are always mutable.
fn structMutability(): void {
S { x: 1 }.x = 1;
var mutable = S { x: 1 };
mutable.x = 2;
let immutable = S { x: 1 };
immutable.x = 1;
^ error: 'immutable' cannot be mutated
}
At this point, structs can be parsed and validated by the semantic analyzer. This section introduces how to work with them in LLVM IR and turn all the previous changes into a platform-specific executable.
Before getting started with code generation, make sure to update Codegen to accept a list of ResolvedDecl as the resolved tree instead of the current list of ResolvedFunctionDecl, and remove the temporary adapter from the driver that was introduced in the previous section.
The first step towards seeing the new syntactical changes in action is to represent each resolved struct as a struct type in LLVM IR.
LLVM supports two kinds of structs, identified and literal ones. A literal struct comes in handy for languages that have built-in support for working with tuple types, or when implementing platform-specific optimizations. See the design note on that at the end of this section.
fn returnTuple(): (number, number) { ... }
The snippet above could be represented by the following piece of LLVM IR, where { double, double } is a literal struct of two double members.
define { double, double } @returnTuple() { ... }
Identified structs on the other hand are useful to represent a concrete pack of values, like user-defined structs or classes.
struct Color { r: number, g: number, b: number }
The Color struct in LLVM IR would be an identified struct of three double fields.
%struct.Color = type { double, double, double }
It's important to note that while literal structs are structurally unique, meaning that every { double, double } in the IR refers to the same type, identified structs are always distinct, so even if the IR contains %struct.a = type { double, double } and %struct.b = type { double, double }, %struct.a is not the same type as %struct.b.
An identified struct can also be opaque, which tells LLVM that the struct exists, but it is not known what fields it has, like a forward declaration in C or C++.
In this step every ResolvedStructDecl should be represented with an opaque identified llvm::StructType. Every time a struct type is referenced, it's this struct type that should be returned; don't generate a new llvm::StructType multiple times for the same struct.
%struct.Color = type { double, double, double }
define void @allocateColor() {
entry:
%c = alloca %struct.Color, align 8
ret void
}
Similar to what happens in the semantic analyzer, first a type for every struct declaration needs to be generated, because they can be referenced by other field or function declarations. Remember that the resolved tree is guaranteed to list every ResolvedStructDecl before any other declaration, so there is no need to implement any complex logic in this step.
After every struct declaration has been given its own opaque type in the IR, each field can be represented with an existing type.
Turn the currently opaque structs into complete struct declarations by generating and setting their bodies.
%struct.S = type opaque
%struct.S2 = type opaque
The above structs with their bodies correctly set might look like the ones below.
%struct.S = type { %struct.S2 }
%struct.S2 = type { double, double }
With struct types being supported, it's possible to generate code for struct instantiations. In LLVM the way to instantiate structs is to allocate space for them first and then initialize each member separately.
Color { r: initR(), g: initG(), b: initB() }
When generating code for a struct instantiation expression, the space that stores the struct must be allocated explicitly; only then can the fields of this explicitly allocated struct be initialized to the given values. Field initialization happens in declaration order, but the side effects must be visible in instantiation order.
To make the previous specification easier to understand, take a look at this example that shows the steps of how a struct instantiation would be generated in LLVM IR.
Declaration:
struct Color { r: number, g: number, b: number }
Instantiation:
Color { g: initG(), r: initR(), b: initB() }
Generated logic:
let tmp: Color;
let gInit = initG();
let rInit = initR();
let bInit = initB();
tmp.r = rInit;
tmp.g = gInit;
tmp.b = bInit;
Allocating a new variable is straightforward; the resulting IR will look like this.
%Color.tmp = alloca %struct.Color, align 8
Accessing a field of a struct instance, however, was not necessary until now. In LLVM IR, the field of a struct is accessed by its index rather than by its name, which is also the reason for storing the index of the field in a ResolvedFieldDecl. The field is accessed with the GetElementPointer instruction. This instruction can be confusing at first, so it's recommended to let the llvm::IRBuilder::CreateStructGEP() helper create it. This helper takes the type of the struct, a pointer to the beginning of the struct, and the index of the element that needs to be accessed.
Design Note
To learn more about the GEP instruction, the LLVM documentation has a dedicated page, which explains it in full detail and why it was designed this way.
Initializing tmp.r means getting the pointer to the r field and assigning the initial value to it.
%0 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
store double 1.000000e+00, double* %0, align 8
The full LLVM IR for instantiating the Color struct at the top of this point can be found below.
%Color.tmp = alloca %struct.Color, align 8
%0 = call double @initR()
%1 = call double @initG()
%2 = call double @initB()
%3 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
store double %0, double* %3, align 8
%4 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 1
store double %1, double* %4, align 8
%5 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 2
store double %2, double* %5, align 8
Note that the example struct deliberately has only number fields. Assigning values to struct fields works differently, which will be discussed later.
In the previous point, the fields of a struct have already been accessed, so it's the perfect time for lowering a ResolvedMemberExpr. To read the field of a struct, a ResolvedDeclRefExpr that references the struct needs to be evaluated at some point. While numbers are loaded into a register when referenced, structs are not loaded into registers, as they might not fit, so it must be ensured that a ResolvedDeclRefExpr referencing a struct always results in a pointer to that struct without being followed by a load instruction.
%Color.tmp = alloca %struct.Color, align 8
...
%3 = ..., %struct.Color* %Color.tmp, ...
A member access results in a GetElementPointer instruction on the result of the base expression; however, there is a difference in how it is followed up depending on whether the field is being written or read.
When the value of a numeric field is being read, it has to be loaded from memory.
%3 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
%4 = load double, double* %3, align 8
On the other hand, if the field is being assigned, the store instruction expects a pointer operand, so the field must not be loaded.
%3 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
store double 1.000000e+00, double* %3, align 8
At this point, creating a temporary struct and immediately reading one of its fields should work correctly, but it would also be convenient to store structs in variables, initialize struct-typed fields, or assign a different value to a struct variable.
All of the above use cases are handled by the same logic. As mentioned in a previous step, the difference between structs and numbers is that structs are not loaded into registers, so the store instruction cannot be used to mutate their values. Any time a struct is assigned, the value of each field must be manually assigned to the same field of the destination struct.
Statement:
let s = Color { r: 1, g: 2, b: 3 };
Generated logic:
let tmp: Color;
tmp.r = 1;
tmp.g = 2;
tmp.b = 3;
let s: Color;
s.r = tmp.r;
s.g = tmp.g;
s.b = tmp.b;
Manually assigning every field, however, would result in duplicated work and large LLVM IR fragments. The goal is to make the content of the variable storing the struct on the LHS the same as the struct on the RHS. To achieve this easily, LLVM has a built-in memcpy intrinsic, which copies a specific amount of memory from one place to another. With that, an assignment between 2 struct variables can be simplified to only a few lines.
%3 = bitcast %struct.Color* %s to i8*
%4 = bitcast %struct.Color* %Color.tmp to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %3, i8* align 8 %4, i64 24, i1 false)
Note that this is the version of memcpy created by the llvm::IRBuilder::CreateMemCpy() helper. To use this helper, the alignment and the size of the struct must be calculated first, which can be done easily with the help of the llvm::DataLayout utility.
Design Note
Intrinsics are built-in functions in LLVM that let the backends know what the desired effect is at that specific point of the IR. The backends then use this information to find the most efficient way to achieve it.
More intrinsics can be found in the LLVM Language Reference Manual.
The full LLVM IR for the variable declaration can be seen below.
%s = alloca %struct.Color, align 8
%Color.tmp = alloca %struct.Color, align 8
%0 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
store double 1.000000e+00, double* %0, align 8
%1 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 1
store double 2.000000e+00, double* %1, align 8
%2 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 2
store double 3.000000e+00, double* %2, align 8
%3 = bitcast %struct.Color* %s to i8*
%4 = bitcast %struct.Color* %Color.tmp to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %3, i8* align 8 %4, i64 24, i1 false)
Design Note
Upon taking a look at how Clang generates code for an equivalent snippet, the result will most likely be different. If the values of the initializers are known at compile time, Clang creates a constant struct and a memcpy intrinsic instead, so the backends can choose the most optimal way to initialize the struct in machine code.
@__const.clangKnownInitializers.c = private unnamed_addr constant %struct.Color { i32 1, i32 2, i32 3 }, align 4
define void @clangKnownInitializers() {
entry:
  %c = alloca %struct.Color, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %c, ptr align 4 @__const.clangKnownInitializers.c, i64 12, i1 false)
  ret void
}
Another optimization Clang performs is copy elision. If copying the struct has no side effect, there is no need to create a temporary first and emit a memcpy; the initializer values can be stored directly in the space of the local variable.
define void @copyElision(i32 %0) {
  %2 = alloca i32, align 4
  %3 = alloca %struct.Color, align 4
  store i32 %0, i32* %2, align 4
  %4 = getelementptr inbounds %struct.Color, %struct.Color* %3, i32 0, i32 0
  %5 = load i32, i32* %2, align 4
  store i32 %5, i32* %4, align 4
  %6 = getelementptr inbounds %struct.Color, %struct.Color* %3, i32 0, i32 1
  %7 = load i32, i32* %2, align 4
  store i32 %7, i32* %6, align 4
  %8 = getelementptr inbounds %struct.Color, %struct.Color* %3, i32 0, i32 2
  %9 = load i32, i32* %2, align 4
  store i32 %9, i32* %8, align 4
  ret void
}
By this point, structs should be fully functional within the body of a function. The last thing to support is passing them to and returning them from other functions.
Structs in your language are passed to and returned from functions by value. As mentioned previously, however, in LLVM IR only a pointer to the struct can be moved around; it's not possible to load a struct and pass the loaded value. Passing by value only means that no modification to the parameter is visible in the caller, and this can be achieved in LLVM IR easily.
For mutable parameters, the trick is to create a copy of the struct in the caller and pass a pointer to that copy.
%c = alloca %struct.Color, align 8
%struct.arg.tmp = alloca %struct.Color, align 8
...
%5 = bitcast %struct.Color* %struct.arg.tmp to i8*
%6 = bitcast %struct.Color* %c to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %5, i8* align 8 %6, i64 24, i1 false)
call void @mutableStructParam(%struct.Color* byval(%struct.Color) %struct.arg.tmp)
Notice how the byval attribute is added to the function. It is a way to let LLVM know that the particular parameter simulates pass-by-value semantics, and when it does that, this attribute must be present at both the function declaration and the call site. Note that byval also requires the copy to be created in the caller.
define void @mutableStructParam(%struct.Color* byval(%struct.Color) %c) {
...
}
call void @mutableStructParam(%struct.Color* byval(%struct.Color) %struct.arg.tmp)
When the parameter is immutable, there is no need to copy the struct, as there are no modifications that can be visible in the caller. To indicate that the struct will never be modified inside the function through the passed pointer, the readonly attribute can be used. This attribute is not compulsory and can be omitted, but as with every other attribute, it must also be present at both the function declaration and the call site.
%c = alloca %struct.Color, align 8
...
call void @immutableStructParam(%struct.Color* readonly %c)
Returning the struct is also done through a dedicated parameter. If the function returns a struct by value, a temporary variable that stores the returned value is allocated in the caller, and the pointer to this variable is passed as the first argument of the function, marked by the compulsory sret attribute. The return statement then copies the struct into the space allocated in the caller, and the function is treated as if it were a void function.
define void @caller() {
entry:
%struct.ret.tmp = alloca %struct.Color, align 8
call void @returnStruct(%struct.Color* sret(%struct.Color) %struct.ret.tmp)
ret void
}
define void @returnStruct(%struct.Color* sret(%struct.Color) %ret) {
entry:
%Color.tmp = alloca %struct.Color, align 8
...
%3 = bitcast %struct.Color* %ret to i8*
%4 = bitcast %struct.Color* %Color.tmp to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %3, i8* align 8 %4, i64 24, i1 false)
br label %return
return:
ret void
}
Design Note
Sadly, using sret and byval arguments is not an all-round solution. Different platform-specific ABIs might require different ways of passing and returning struct values.
struct Coerced { int x; int y; };
Coerced passCoerced(Coerced c) { return c; }
In the previous example, on the x86-64 platform, Coerced has the size of 64 bits and fits into a single register, so Clang lowers passCoerced as a function that takes and returns an i64 value to conform to the ABI.
define i64 @passCoerced(i64 %c.coerce) {
entry:
  %retval = alloca %struct.Coerced, align 4
  %c = alloca %struct.Coerced, align 4
  store i64 %c.coerce, ptr %c, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %retval, ptr align 4 %c, i64 8, i1 false)
  %0 = load i64, ptr %retval, align 4
  ret i64 %0
}
Larger structs might be passed and returned in 2 separate registers, where the latter is expressed by returning a literal struct.
%struct.LargerCoerced = type { i32, i32, i32, i32 }
define { i64, i64 } @passLargerCoerced(i64 %l.coerce0, i64 %l.coerce1) {
entry:
  %retval = alloca %struct.LargerCoerced, align 4
  ...
  %2 = load { i64, i64 }, ptr %retval, align 4
  ret { i64, i64 } %2
}
It's the compiler's responsibility to ensure that the generated code conforms to the ABI of the target platform and thus maintains compatibility.
Implementing full support for a specific ABI could fill its own dedicated guide, so it's not in the scope of this exercise. It would only become necessary if making your language interoperable with C, C++, or any other language that compiles to native code was also a requirement. The only external function used by your language that was compiled by a different compiler is printf, but it neither receives nor returns a struct, so compatibility issues will not arise.
For more details about the topic, see some discussions on LLVM mailing lists, like the one about struct parameters being converted to other types or the one on passing structures in registers.
Congratulations on reaching the end of this exercise, the compiler now fully supports structs in your language!
Thank you for taking the time to read through this guide on how to write a compiler with LLVM! Hopefully, it was time well spent and an interesting experience to explore the different fields of language design, parsing, static code analysis, and code generation throughout the previous few pages.