Extending the Language With Structs

This section introduces a self-paced exercise to extend the language with structs. The subsections only describe the requirements and the theoretical background needed for each step; figuring out the implementation details is up to the reader. Imagine that this is a real-world scenario of working as a compiler engineer and getting a task to extend the language based on certain criteria.

One possible solution to this exercise is already part of the compiler and can be found on GitHub.

Motivation

The motivation behind structs is to pack values together and make it more convenient to pass them around simultaneously, as well as to make the source code more expressive.

fn rgbToGray(r: number, g: number, b: number): number {
  return (r + g + b) / 3;
}

Maintaining 3 different variables to store each individual color channel value can become overwhelming quickly and also affects readability and maintainability. Expressing the same logic with structs makes it more readable and maintainable.

struct RGBColor {
  r: number,
  g: number,
  b: number,
}

fn rgbToGray(c: RGBColor): number {
  return (c.r + c.g + c.b) / 3;
}

Warm-up

Whether extending the compiler from the lexer to code generation seems overwhelming at first, or it looks like a walk in the park, doing a small task to get used to working on the codebase is always a good idea.

In the current version of the language, parameters are passed by value, can only be of number type and are always immutable. With struct parameters however, there can be use cases when only one or two fields of the struct need to be changed within the body of the function. With immutable parameters, the only way to achieve this is to duplicate the struct into a local variable and then change the fields of this variable.

fn removeGreenChannel(c: RGBColor): RGBColor {
  var greenRemoved = c;
  greenRemoved.g = 0;

  return greenRemoved;
}

Design Note

The question might arise why structs are passed by value, like in C or Rust, instead of passing only a reference to the struct, like in Kotlin or Swift.

Every object has a lifetime, thus every piece of memory that was once allocated must be freed eventually. In the worst-case scenario the memory is freed when the program exits, unless the program is designed to run indefinitely like a web server. In that case, if unused memory is not freed somehow, the program can quickly run out of memory.

With stack-allocated variables managing lifetimes is straightforward. When the function returns and the stack space is destroyed, the object is considered to be freed.

fn returnStruct(): S {
  let s = S { ... };

  return s;
}

Imagine that structs are moved around by the reference pointing to them. That means that in this example, a reference to s is returned from the function; however, if s is on the stack, it is freed after the function returns. Where does the reference point then? It points to the piece of memory that used to store s but is now freed, so the reference is dangling, pointing to an unknown piece of memory. Moving structs around by value avoids this issue automatically.

But what about those languages that have been mentioned to work with references? How can they do that? To understand this, two questions need to be explored first: how the memory is allocated and how it is freed.

The first one is straightforward: in these languages the memory for the struct is allocated on the heap upon instantiation. Allocating it on the stack would result in the dangling reference problem discussed previously.

For the second question different languages propose different solutions. In C, heap-allocated objects are freed manually by calling the free() function; however, this requires the developer to keep track of which block of memory needs to be freed and which has already been freed, so there are lots of possibilities to make a mistake at some point. To reduce the risk of memory corruption, other languages such as Kotlin or Swift use solutions that automatically keep track of which memory blocks need to be freed and which need to be kept alive.

Kotlin runs on the JVM, which uses tracing garbage collection (GC), where a utility known as the garbage collector keeps track of which pieces of memory can be accessed through the available references at a specific point of the program from a so-called root object. When a reference goes out of scope and no other accessible references point to the chunk of memory in question, it is considered unreachable, otherwise, it is reachable. The garbage collector periodically scans all the allocated memory blocks and frees those that are unreachable.

1.) an A object is created

                                    ┌─────────┐
                                    │ RootObj │
   6 | fun main() {                 └─────────┘
-> 7 |   val a = A()                     │
   8 |   wrapper(a)                      v
   9 | }                              ┌─────┐
                                      │ A@7 │
                                      └─────┘

2.) wrapper is called and a B object is created

                                    ┌─────────┐
   1 | fun wrapper(a: A) {          │ RootObj │
-> 2 |   var b = B()                └─────────┘
   3 |   b = B()                    ┌────┴────┐
   4 |   a.b = b                    v         v
   5 | }                         ┌─────┐   ┌─────┐
                                 │ A@7 │   │ B@2 │
                                 └─────┘   └─────┘

3.) the same reference is pointed to another B object

                                    ┌─────────┐
   1 | fun wrapper(a: A) {          │ RootObj │
   2 |   var b = B()                └─────────┘
-> 3 |   b = B()                 ┌───────┴───────┐
   4 |   a.b = b                 v               v
   5 | }                      ┌─────┐ ┌─────┐ ┌─────┐
                              │ A@7 │ │ B@2 │ │ B@3 │
                              └─────┘ └─────┘ └─────┘

4.) a reference in A@7 is pointed to B@3

                                    ┌─────────┐
                                    │ RootObj │
   1 | fun wrapper(a: A) {          └─────────┘
   2 |   var b = B()             ┌───────┴───────┐
   3 |   b = B()                 v               v
-> 4 |   a.b = b              ┌─────┐ ┌─────┐ ┌─────┐
   5 | }                      │ A@7 │ │ B@2 │ │ B@3 │
                              └─────┘ └─────┘ └─────┘
                                 │               ^
                                 └───────────────┘

5.) return and the reference to B@3 goes out of scope

                                    ┌─────────┐
                                    │ RootObj │
                                    └─────────┘
   6 | fun main() {              ┌───────┘
   7 |   val a = A()             v
-> 8 |   wrapper(a)           ┌─────┐ ┌─────┐ ┌─────┐
   9 | }                      │ A@7 │ │ B@2 │ │ B@3 │
                              └─────┘ └─────┘ └─────┘
                                 │               ^
                                 └───────────────┘

6.) execution halts, the GC marks the reachable nodes

                                    ╔────═────╗
                                    │ RootObj │
                                    ╚────═────╝
   6 | fun main() {              ┌───────┘
   7 |   val a = A()             v
-> 8 |   wrapper(a)           ╔──═──╗ ┌─────┐ ╔──═──╗
   9 | }                      │ A@7 │ │ B@2 │ │ B@3 │
                              ╚──═──╝ └─────┘ ╚──═──╝
                                 │               ^
                                 └───────────────┘

7.) any unmarked node is freed

                                    ┌─────────┐   
                                    │ RootObj │   
   6 | fun main() {                 └─────────┘   
   7 |   val a = A()                ┌────┘   
-> 8 |   wrapper(a)                 v  
   9 | }                         ┌─────┐   ┌─────┐
                                 │ A@7 │──>│ B@3 │
                                 └─────┘   └─────┘

8.) the execution continues

    ...
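
Conceptually, the marking phase shown above is a graph traversal that starts from the root object. Below is a minimal C++ sketch of just that traversal; it only illustrates the idea and ignores the sweep phase, the real heap bookkeeping, and everything the JVM actually does.

#include <vector>

// One garbage-collected object: it knows which other objects it references
// and whether the collector has already marked it as reachable.
struct GCObject {
  bool marked = false;
  std::vector<GCObject *> references;
};

// Mark everything reachable from the given object; whatever remains
// unmarked after marking from every root is freed by the sweep phase.
void mark(GCObject *obj) {
  if (obj->marked)
    return;

  obj->marked = true;
  for (GCObject *ref : obj->references)
    mark(ref);
}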
                            

Swift uses a different technique called automatic reference counting (ARC). In this case, the objects themselves have a counter attached to them, which tracks how many reachable references point to the object. When a reference goes out of scope, the counter is decremented, and if the reference count reaches 0, the object is freed.

1.) callee is entered and an A object is instantiated

   4 | func callee() {       ┌─────┐    ┌───────────┐
-> 5 |   let a = A()         │ A@5 │ ─> │ RefCnt: 1 │
   6 |   wrapper(a: a)       └─────┘    └───────────┘
   7 | }

2.) wrapper is entered, a local reference is created

   1 | func wrapper(a: A) {  ┌─────┐    ┌───────────┐
-> 2 |   let b = a           │ A@5 │ ─> │ RefCnt: 2 │
   3 | }                     └─────┘    └───────────┘

3.) return and the local reference goes out of scope

   4 | func callee() {       ┌─────┐    ┌───────────┐
   5 |   let a = A()         │ A@5 │ ─> │ RefCnt: 1 │
-> 6 |   wrapper(a: a)       └─────┘    └───────────┘
   7 | }

4.) callee returns, the reference count reaches zero

                             ┌─────┐    ┌───────────┐
-> 8 | callee()              │ A@5 │ ─> │ RefCnt: 0 │
                             └─────┘    └───────────┘

5.) the object is freed

-> 8 | callee()
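
The counting itself boils down to very little code. Here is a minimal, hand-rolled reference-counting sketch in C++; the names are made up for this illustration, and the Swift runtime is of course far more sophisticated.

// Every object carries a counter of the live references pointing to it.
struct RefCounted {
  long refCount = 1; // the reference created at the instantiation site
};

// A new reference to the object is created: bump the counter.
void retain(RefCounted *obj) { ++obj->refCount; }

// A reference goes out of scope: decrement the counter and free the
// object once nothing points to it anymore.
void release(RefCounted *obj) {
  if (--obj->refCount == 0)
    delete obj;
}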

Both GC and ARC execute additional code at runtime, so they impose some performance overhead and require additional logic to be performed by the compiler during code generation. And of course, they are also a lot more complex than these oversimplified examples make it seem.

Because the struct is passed by value, no modification to it would be visible to the caller anyway, so copying it into an additional local variable doesn't make sense. To resolve this inconvenience, it's time to introduce support for mutable parameters, which are marked with the var keyword. Using var to denote parameter mutability keeps the language consistent with local variable declarations, where var is also used to express mutability.

fn removeGreenChannel(var c: RGBColor): RGBColor {
  c.g = 0;
  return c;
}

Design Note

The var parameter in its current form also used to be part of Kotlin and Swift in their early days, but in the case of both languages the designers eventually decided to remove it.

On the other hand, Rust still has the same concept, except for mut being written instead of var.

fn rust_mutable_param(mut c: RGBColor) { ... }

The key difference between the mentioned languages is how object parameters are passed to functions. In Rust the instance of the struct is passed by value, but in the case of Kotlin and Swift classes, it's the reference to the instance that gets passed by value, like passing a pointer to the instance in C/C++. Swift is not so straightforward though, as it also supports structs, not just classes, and its structs are passed by value like in Rust.

In the case of Swift, var parameters caused confusion because developers expected that by making the parameter mutable, the mutation would be visible outside the function, but Swift had the inout keyword for that, and var only allowed the mutation to be visible within the context of the function. The full proposal for removing var parameters is available on GitHub.

Apparently var caused the same confusion in Kotlin that it did in Swift. In Kotlin it's always the reference to the object that is passed by value, and by making the reference mutable and assigning a new value to it, one might think that it's not the local copy of the reference that now points to a new object, but that the object pointed to by the reference is replaced with a new one, so the new value would be visible outside the function too. For more details see the blog post about the change.

In both Rust and your language the instances of structs are passed by value, not by references pointing to them, so the mutability of the parameter doesn't suggest that the change is visible in the caller too.

To implement this simple but convenient change, proceed with the following steps.

  1. Update the parser so that it accepts the following paramDecl grammar extension.

    <paramDecl>
      ::= 'var'? <identifier> ':' <type>
    

    Don't forget to store whether the parameter is mutable or not in the AST similar to VarDecl, as the semantic analyzer needs this information to ensure that only the allowed operations are performed on the value.

  2. In the semantic analyzer ensure that the parameter declaration is resolved properly and modifying an immutable parameter still results in an error, but changing a mutable parameter is allowed. The parameter of the built-in println should remain immutable.

    fn paramSema(x: number, var y: number): void {
      y = 0;
      x = 1;
        ^ error: 'x' cannot be mutated
    }

    The mutation of the value is checked in the one and only dataflow analysis pass, which expects the LHS of an assignment to be a ResolvedVarDecl; however, this is no longer the case. To make modifying this pass and later the code generation easier, lift the isMutable field up from ResolvedVarDecl to the ResolvedDecl base class.

    struct ResolvedDecl {
      ...
      bool isMutable;
    
      ...
    };

    This property should automatically be set to false for every resolved declaration except for variables and parameters, for which its value should be deduced from the AST. Note that this change also automatically forwards the mutability of the parameter to Codegen, where it can be used for more efficient LLVM IR generation.

    Remember to update the dataflow analysis pass, so that lattices also store the now updated ResolvedDecl instead of the previous ResolvedVarDecl.

    using Lattice = std::map<const ResolvedDecl *, State>;
  3. During LLVM IR generation, in the body of the function, a local variable is introduced for each parameter, and their values are copied into these variables. When the value of the parameter is read, LLVM loads the corresponding local variable and uses its value for further computation.

    define void @varParam(double %n) {
    entry:
      %n1 = alloca double, align 8
      store double %n, double* %n1, align 8
      %0 = load double, double* %n1, align 8
      call void @println(double %0)
      ret void
    }

    Notice how the function in the example is called varParam(). This is not a coincidence, because in truth this local variable only needs to be introduced if the parameter is mutable, as its sole purpose is to keep the changes to the parameter visible inside the given function only.

    The language is designed in such a way that the semantic analyzer can ensure that every code modifying an immutable parameter is rejected before code generation. This allows Codegen to take advantage of this information and generate more optimized LLVM IR by omitting the local variable creation for immutable parameters.

    define void @nonVarParam(double %n) {
    entry:
      call void @println(double %n)
      ret void
    }

    Design Note

    When taking a look at how Clang generates code for C++ functions, it can be observed that the local variable is always present even in cases when the parameter is of const type.

    There are mainly two reasons for this. The first is that constness can be cast away, so the parameter can still be mutated, and the second is the & operator.

    void immutableParameter(const int x) {
      *const_cast<int *>(&x) = 2;
    }

    The & operator gets the address of the parameter, but to do that, it must first allocate memory for it; returning the memory address of a value stored in a register is not possible.

    The same operator is also present in Rust, but the Rust compiler only allocates the variable if it is used. In a function where the parameter is not borrowed, it's not copied into a local variable either.

    Update Codegen so that a local variable is no longer introduced for immutable parameters. Also, ensure that when a ResolvedDeclRefExpr points to such a parameter, its value is no longer loaded. The generated LLVM IR for mutable parameters should remain unchanged.
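
A minimal sketch of what this change in Codegen might look like is shown below. It assumes a declarations map from resolved declarations to their llvm::Value, an llvm::IRBuilder<> named builder, and that the number type is lowered to double; the member and variable names are illustrative and may differ from the reference solution.

// Emitting the parameters at the start of a function body: only mutable
// parameters receive a dedicated stack slot, immutable ones are used
// directly as SSA values.
for (auto &&param : functionDecl.params) {
  llvm::Argument *arg = ...; // the llvm::Argument matching 'param'

  if (!param->isMutable) {
    declarations[param.get()] = arg;
    continue;
  }

  llvm::Value *var = builder.CreateAlloca(arg->getType());
  builder.CreateStore(arg, var);
  declarations[param.get()] = var;
}

// Emitting a ResolvedDeclRefExpr that refers to the declaration 'decl':
// values that live in memory have to be loaded, values kept in registers
// can be used as they are.
llvm::Value *val = declarations[&decl];
if (llvm::isa<llvm::AllocaInst>(val))
  val = builder.CreateLoad(builder.getDoubleTy(), val);
return val;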

The Struct Syntax

The first bigger step towards supporting structs in the language is correctly recognizing their syntax, which means teaching it to the parser. This section describes how the syntax was designed and explains every step to take to end up with the parser correctly recognizing it.

  1. Notice how the example in the Motivation section already spoiled all the new tokens that the lexer must recognize. These are the struct keyword and the . when a field is accessed. Let's update the lexer so that it produces TokenKind::KwStruct and TokenKind::Dot when it sees them.

  2. The introduction of structs changes the top level of the source file from a list of function declarations to a list of function and struct declarations.

    <sourceFile> 
      ::= (<structDecl> | <functionDecl>)* EOF

    To be able to see the changes and visualize the AST dump as soon as possible, make sure the parser returns a list of Decl instead of the current list of FunctionDecl as the AST. This makes the -ast-dump flag work and print out the full AST for a better debugging experience.

    std::pair<std::vector<std::unique_ptr<Decl>>, bool> parseSourceFile();

    The AST should preserve the declaration order in the source code, so if a struct is declared before a function in the source code, it should also precede it in the AST.

    Note that after changing the type of the returned AST in the parser, the driver will not compile because the next step in the compilation pipeline still expects a list of FunctionDecl. To make the compiler still produce an executable from every already supported source code, introduce a temporary adapter in the driver that extracts the function declarations from the AST and only passes those further down the pipeline. This adapter will be removed at the beginning of the next section.

  3. The first new element to recognize is the struct declaration, which, as mentioned, appears on the top level and only on the top level, so nested struct declarations are not allowed. It starts with the struct keyword followed by an identifier and a {} enclosed list of zero or more field declarations.

    struct S { ... }

    A field declaration is a : separated pair of an identifier and a type. In the body of a struct these declarations are separated by commas with an optional trailing comma at the end.

    struct S {
      field: type,
      field: type,
      field: type,
    }

    Because of the optional trailing comma, the struct can also be declared conveniently on a single line.

    struct S { field: type, field: type, field: type }

    This syntax leads to the following extension of the grammar.

    <structDecl>
      ::= 'struct' <identifier> <fieldList>
    
    <fieldList>
      ::= '{' (<fieldDecl> (',' <fieldDecl>)* ','?)? '}'
    
    <fieldDecl>
      ::= <identifier> ':' <type>

    Update the parser so that it recognizes this grammar extension and produces the following AST nodes. Note that the constructors have been omitted from the example, but they need to be present in the source code.

    struct FieldDecl : public Decl {
      Type type;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) << "FieldDecl: "
                  << identifier << ':' << type.name
                  << '\n';
      }
    };
    
    struct StructDecl : public Decl {
      std::vector<std::unique_ptr<FieldDecl>> fields;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) << "StructDecl: "
                  << identifier << '\n';
      
        for (auto &&field : fields)
          field->dump(level + 1);
      }
    };
  4. The syntax of instantiating a struct is almost identical to declaring it, except that the struct keyword in front of the identifier is missing, and that in the {} enclosed list of fields, the : is followed not by a type but by an expression that sets the value of that field. This ensures that every field is initialized and makes it clear which field has been initialized to what value.

    S {
      field: expr, 
      field: expr, 
      field: expr,
    }

    The struct instantiation can also be written on a single line conveniently because of the optional trailing comma at the end of the field initializer list.

    S { field: expr, field: expr, field: expr }

    Instantiating a struct is considered a primary expression, so the extended grammar looks like the one below.

    <primaryExpression>
      ::= ...
      |   <structInstantiation>
      |   <declRefExpr>
      |   ...
    
    <structInstantiation>
      ::= <identifier> <fieldInitList>
    
    <fieldInitList>
      ::= '{' (<fieldInit> (',' <fieldInit>)* ','?)? '}'
    
    <fieldInit>
      ::= <identifier> ':' <expr>

    Notice how structInstantiation is listed before declRefExpr. Each of these expressions starts with an identifier, but a structInstantiation has priority over a declRefExpr, which ensures that if an expression can be parsed as a struct instantiation, it is always parsed as one, instead of parsing a declRefExpr and erroring out on the { token.

    Error { ... };
          ^
    error: expected ';' at the end of expression

    Update the parser so that it correctly recognizes the new primary expression and produces the following AST nodes. The constructors are once again omitted.

  5. struct FieldInitStmt : public Stmt {
      std::string identifier;
      std::unique_ptr<Expr> initializer;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) 
                  << "FieldInitStmt: " << identifier
                  << '\n';
        initializer->dump(level + 1);
      }
    };
    
    struct StructInstantiationExpr : public Expr {
      std::string identifier;
      std::vector<std::unique_ptr<FieldInitStmt>>
        fieldInitializers;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) 
                  << "StructInstantiationExpr: "
                  << identifier << '\n';
      
        for (auto &&field : fieldInitializers)
        field->dump(level + 1);
      }
    };
  6. Notice how the struct instantiation syntax introduces an ambiguity in the condition of an if or while statement.

    if condition { ... }

    Because of the grammar modification, this would be parsed as an if followed by a structInstantiation with a missing body.

      structInstantiation
       ┌───────┴───────┐                          
    if condition { ... } ...
                         ^ error: expected 'if' body
    

    To resolve this ambiguity, a structInstantiation is not allowed to be placed inside a condition and the parser should ensure that the above example is always parsed as a declRefExpr followed by a block.

    IfStmt
      DeclRefExpr: condition
      Block

    If for any reason a struct needs to be instantiated in a condition, it can be done by wrapping it in a grouping expression, which makes it clear where the condition ends and the body of the statement starts.

    if (identifier {}) {}

    Implement logic similar to the Rust parser's, which makes it possible to apply certain restrictions while an AST node is parsed. One possible shape for such a mechanism is sketched at the end of this section.

    When an if or while condition is parsed, apply a restriction that tells the parser to ignore struct instantiations. Upon entering a grouping expression, every restriction that is currently applied should be cleared until the wrapped expression is consumed.

    Statement: if (condition {}) {}
    
    |  parseStmt()                  [NoRestrictions]
    |    parseIfStmt()
    |      parseExpr()              [StructNotAllowed]
    |       ...
    |         parseGroupingExpr()  
    |           parseExpr()         [NoRestrictions]
    |             ...
    |           parseExpr()
    |         parseGroupingExpr()   [StructNotAllowed]
    |       ...
    |      parseExpr()
    |      parseBlock()             [NoRestrictions]
    |       ...
    
  7. After declaring and instantiating structs, the final syntactical extension makes it possible to access their fields.

    Since a struct can have fields that are also structs, any arbitrarily deep field can be read or written. A struct can also be returned from a function and reading or assigning the fields of the returned struct is also allowed, which means all the following expressions should be accepted.

    s.f1.f2.f3
    foo().f1.f2

    On the other hand, a struct cannot have function fields or methods, so calling an accessed field is not possible.

    not.possible()

    The restriction that a call expression, if present, must be the first expression in the chain can be enforced in the parser by modifying postfixExpression as shown below.

    <postfixExpression>
      ::= <primaryExpression> <argumentList>? <memberExpr>*
    
    <memberExpr>
      ::= '.' <identifier>

    Update the parser so that it correctly consumes member expressions and creates the following node for them. Like in the previous examples, the constructors are still omitted.

    struct MemberExpr : public Expr {
      std::unique_ptr<Expr> base;
      std::string member;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) << "MemberExpr: ."
                  << member << '\n';
      
        base->dump(level + 1);
      }
    };
  8. The previous point mentioned that a MemberExpr can also be used to assign a value to a field; however, an assignment statement is still restricted so that it only accepts a DeclRefExpr on its left-hand side, which causes the following code to be rejected.

    s.f = 1;

    Making a field assignable while still maintaining a strict restriction on the LHS of the assignment can be achieved with another simple grammar modification.

    <assignment>
      ::= (<declRefExpr> | <memberExpr>) '=' <expr> ';'

    To model the grammar change in the AST, introduce an AssignableExpr placeholder node and make it the base class of DeclRefExpr and MemberExpr. The Assignment node should also store this node instead of a DeclRefExpr.

    struct AssignableExpr : public Expr { ... };
    
    struct DeclRefExpr : public AssignableExpr { ... };
    
    struct MemberExpr : public AssignableExpr { ... };
    
    struct Assignment : public Stmt {
      std::unique_ptr<AssignableExpr> assignee;
      ...
    };

    Update the parser so that it also accepts a MemberExpr on the LHS of an assignment. Remember that the semantic analyzer still expects the LHS to be a DeclRefExpr, which will be fixed in the next section. Until then, introduce a temporary workaround in Sema::resolveAssignment() to keep the compiler functioning.

  9. Tweak the error recovery mechanism so that synchronization happens on the struct keyword too. The synchronizeOn() utility should also be updated to accept a list of tokens instead of a single token only, as on the top level both struct and fn are synchronization points, and synchronization should happen on whichever keyword is encountered first.

    fail struct ... fn ...
    ┌─────┐ ┌────────┐ ┌─────┐ ┌────┐ ┌─────┐
    │ ... │ │ struct │ │ ... │ │ fn │ │ ... │
    └─────┘ └────────┘ └─────┘ └────┘ └─────┘
     ^        ^                  ^
     |        └ synchronize here └ don't skip to here
     └ fail here

    If a failure happens inside a struct instantiation, the parser should manually synchronize on the next } token instead of waiting for an automatic synchronization on the block level, as the parser might otherwise think the } is the end of the statement, which is not always the case.

    S { f1: fail, f2: 0 }
    ┌─────┐ ┌───┐ ┌────┐ ┌───┐ ┌───┐ ┌───┐
    │ ... │ │ , │ │ f2 │ │ : │ │ 0 │ │ } │
    └─────┘ └───┘ └────┘ └───┘ └───┘ └───┘
     ^                                 ^
     └ fail here                       └ sync here
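
As promised in the step about parsing conditions, below is one possible shape for the restriction mechanism. It is only a sketch: the flag and helper names are made up for this example, and the reference solution may structure it differently.

// Hypothetical restriction flags; only one restriction is needed for now.
enum Restrictions : unsigned { NoRestrictions = 0, StructNotAllowed = 1 << 0 };

// RAII helper that swaps in a new restriction set for the current scope
// and restores the previous one when the scope is left.
class RestrictionGuard {
  unsigned &active;
  unsigned previous;

public:
  RestrictionGuard(unsigned &restrictions, unsigned next)
      : active(restrictions), previous(restrictions) {
    active = next;
  }
  ~RestrictionGuard() { active = previous; }
};

// Intended usage inside the parser, which stores an 'unsigned restrictions'
// member:
//   parseIfStmt():       { RestrictionGuard guard(restrictions, StructNotAllowed);
//                          condition = parseExpr(); } // lifted before the body
//   parseGroupingExpr(): RestrictionGuard guard(restrictions, NoRestrictions);
//   parsePrimaryExpr():  attempt <structInstantiation> only when
//                        !(restrictions & StructNotAllowed)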

Static Analysis on Structs

After structs are recognized by the compiler, the next step is to ensure that they are used correctly. This section explains how to extend the semantic analyzer to detect the invalid use of structs and ensure there are no issues in the program.

  1. The parser now returns a list of Decl as the AST, which is passed to the semantic analyzer through an adapter in the driver.

    Remove this adapter from the driver and update Sema so that it takes a list of Decl and returns a list of ResolvedDecl as the new resolved tree.

    class Sema {
      ...
    
    public:
      explicit Sema(std::vector<std::unique_ptr<Decl>> ast)
          : ast(std::move(ast)) {}
    
      std::vector<std::unique_ptr<ResolvedDecl>> resolveAST();
    };

    Ensure that the -cfg-dump flag still works as expected. Only functions can have a control flow graph, so the driver should only run CFGBuilder on a ResolvedFunctionDecl.

    Also, Codegen still expects a list of ResolvedFunctionDecl as the resolved tree. To keep it working, introduce another temporary adapter between Sema and Codegen, similar to the one just removed.

  2. Extend Type with an additional structType. The kind field should store a new Struct kind and the name field should store the identifier of the struct.

    static Type structType(const std::string &id) { return {Kind::Struct, id}; }

    Two struct types are considered equal if their identifiers are equal. Ensure in the semantic analyzer that types are compared by their name field instead of their kind field where applicable.

  3. Structs are declared on the top level and are location independent, which means that a struct doesn't have to be declared before it can be used. Due to this property, they are also resolved in multiple passes just like functions.

    They also need to be resolved before function declarations because a function can have a struct parameter or return a struct, and in these cases the compiler needs to make sure that the struct is declared in the source code.

    fn returnS(): S { ... }
                  ^ 'S' must be found when resolving 
                    the function declaration
    
    struct S { ... }

    Implement the first pass of struct resolution that results in the following nodes.

    struct ResolvedFieldDecl : public ResolvedDecl {
      unsigned index;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) 
                  << "ResolvedFieldDecl: @(" 
                  << this << ") " << identifier 
                  << '\n';
      }
    };
    
    struct ResolvedStructDecl : public ResolvedDecl {
      std::vector<std::unique_ptr<ResolvedFieldDecl>> 
        fields;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) 
                  << "ResolvedStructDecl: @(" 
                  << this << ") " << identifier 
                  << ':' << '\n';
      
        for (auto &&field : fields)
          field->dump(level + 1);
      }
    };

    Notice how ResolvedFieldDecl also stores its own index within the struct. This will be important during code generation.

    This pass should only check that each field declaration is unique and insert the structs into the symbol table. Don't resolve the types of the field declarations yet.

    struct S { 
        f: number,
        f: number,
        ^ error: field 'f' is already declared
    }

    To make code generation easier, ensure that in the resolved tree every ResolvedStructDecl precedes any ResolvedFunctionDecl. In other words, the resolved tree should have the following structure.

    ResolvedStructDecl
           ...
    ResolvedStructDecl
    ResolvedFunctionDecl
           ...
    ResolvedFunctionDecl
  4. To resolve a Type::Kind::Custom type to a Type::Kind::Struct type, the semantic analyzer needs to check whether the identifier of the custom type references a known ResolvedStructDecl.

    The current lookup logic only supports looking up a ResolvedDecl based on its identifier, which can lead to cases where the wrong declaration is found.

    let S = 1;
    let x: S;
           ^ here it needs to be checked if this is an
             existing struct, but looking up 'S' by 
             its identifier returns the variable above

    Update the symbol lookup logic so that the type of the searched declaration can also be specified; instead of returning the first declaration with a matching identifier, it should return the first declaration whose type matches as well.

    template <typename T>
    std::pair<T *, int> Sema::lookupDecl(const std::string id) { ... }

    Use the updated lookup logic to resolve a Type::Kind::Custom. This change should ensure that the following snippet is compiled properly.

    struct S {}
    
    fn main(): void {
        let S = 1;
        let x: S;
               ^ struct
    
        println(S);
                ^ local variable
    }
  5. With the modified declaration lookup logic, the bodies of the struct declarations can also be resolved. The second pass of StructDecl resolution should happen after the first pass of function resolution.

    Structs are stored on the stack, passed to and returned from functions by value. To allocate a struct, its size must be known at compile time. Because a struct is a list of values grouped together, its size is equivalent to the sum of the sizes of its fields.

    struct Color {      Stack:
      r: number,          r: size(number) ┐
      g: number,          g: size(number) ├ (+) size(Color)
      b: number,          b: size(number) ┘
    }

    A struct cannot have a field of its own type, because in that case, to compute the size of the struct, the compiler would have to know it already, but to know it, it would have to compute it first.

    struct S {
      self: S      size(S) = size(self) = size(S) = ...
    }

    The semantic analyzer also has to catch the case when the struct contains itself through one or more other structs.

    struct S { x: S2 }
    struct S2 { y: S }
    
    size(S) = size(x) = size(S2) = size(y) = size(S) = ...

    A struct cannot have void fields either, as there is no way to create a value of void type.

    struct VoidField {
      f: void,
      ^ error: field cannot be void
    }

    Implement logic that resolves the fields of the structs and detects void fields and self-containing structs; one possible check for the latter is sketched at the end of this section.

    struct S { s2: S2 }
    ^ error: struct 'S' contains itself
    
    struct S2 { s3: S3 }
    ^ error: struct 'S2' contains itself
    
    struct S3 { s: S }
    ^ error: struct 'S3' contains itself

    If a field type cannot be resolved, it should raise an error.

    struct S { 
      e: Error
      ^ error: unable to resolve 'Error' type of field
    }
  6. A StructInstantiationExpr, as its name suggests, is only allowed to instantiate a struct. Instantiating anything else should result in an error.

    let S = 1;
    S { ... };
    ^ error: 'S' is not a struct type
    
    Undefined { ... };
    ^ error: 'Undefined' is not a struct type

    Upon instantiating a struct, every field must be initialized, but only declared fields can be initialized. Also, the same field cannot be initialized more than once.

    struct Color {
      r: number,
      g: number,
      b: number,
    }
    
    fn main(): void {
      Color {
      ^ error: field 'b' is not initialized
        r: 255,
        r: 255,
        ^ error: field 'r' is already initialized
        g: 255,
        a: 0,
        ^ error: 'Color' has no field named 'a'
      };
    }

    The type of the initializer must match the type of the field.

    struct Wrapper { c: Color }
    
    fn main(): void {
      Wrapper {
      ^ error: field 'c' is not initialized
        c: 0
           ^ error: 'number' cannot be used to 
             initialize a field of type 'Color'
      };
    }

    Last but not least, a ResolvedStructDecl cannot be referenced on its own. In such cases, it's safe to assume that the developer forgot to instantiate it.

    let s = Color;
            ^ error: expected an instance of 'Color'

    Update the semantic analyzer so that all of the mentioned use cases are recognized and their correctness is checked. The resolved nodes to produce, with their constructors omitted, can be found below.

    struct ResolvedFieldInitStmt : public ResolvedStmt {
      const ResolvedFieldDecl *field;
      std::unique_ptr<ResolvedExpr> initializer;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) 
                  << "ResolvedFieldInitStmt: @(" 
                  << field << ") " << field->identifier
                  << '\n';
      
        initializer->dump(level + 1);
      }
    };
    
    struct ResolvedStructInstantiationExpr : public ResolvedExpr {
      const ResolvedStructDecl *structDecl;
      std::vector<std::unique_ptr<ResolvedFieldInitStmt>> 
        fieldInitializers;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) 
                  << "ResolvedStructInstantiationExpr: @("
                  << structDecl << ')' << '\n';
      
        for (auto &&fieldInit : fieldInitializers)
          fieldInit->dump(level + 1);
      }
    };

    The order of the field initializers should be the same as the order in which they are written in the source code. Also, don't forget to run the constant expression evaluator on the initializer expressions.

    ResolvedStructInstantiationExpr: @(...)
      ResolvedFieldInitStmt: @(...) r
        ResolvedNumberLiteral: '1'
        | value: 1
      ResolvedFieldInitStmt: @(...) g
        ResolvedNumberLiteral: '2'
        | value: 2
      ResolvedFieldInitStmt: @(...) b
        ResolvedNumberLiteral: '3'
        | value: 3
  7. The semantics of a MemberExpr are straightforward. It is only allowed to access the existing fields of a struct instance.

    fn main(): void {
      let n = 0;
      n.x;
      ^ error: cannot access field of 'number'
    
      let c = Color { r: 1, g: 2, b: 3 };
      c.y;
       ^ error: 'Color' has no field called 'y'
    }

    To make the semantic analyzer fully support the new AST nodes, make sure this expression is also resolved to the node below.

    struct ResolvedMemberExpr : public ResolvedAssignableExpr {
      std::unique_ptr<ResolvedExpr> base;
      const ResolvedFieldDecl *field;
    
      ...
    
      void dump(size_t level = 0) const override {
        std::cerr << indent(level) 
                  << "ResolvedMemberExpr: @(" 
                  << field << ") " 
                  << field->identifier << '\n';
      
        base->dump(level + 1);
      }
    };
  8. In the AST there is a restriction on what kind of expressions can be found on the LHS of an assignment. Propagate this restriction to the resolved tree too by using the corresponding resolved tree nodes and updating Sema, so that these nodes are produced correctly.

    struct ResolvedAssignableExpr : public ResolvedExpr { ... };
    
    struct ResolvedDeclRefExpr : public ResolvedAssignableExpr { ... };
    
    struct ResolvedMemberExpr : public ResolvedAssignableExpr { ... };
    
    struct ResolvedAssignment : public ResolvedStmt {
      std::unique_ptr<ResolvedAssignableExpr> assignee;
      ...
    };

    Similar to how this change modified the API of the AST in the parser, it modifies the API of the resolved tree here, which affects Codegen directly. To keep the resolved tree compatible with the current code generation logic, introduce a similar workaround to what was added to Sema in the previous section, but this time in Codegen::generateAssignment().

    Because ResolvedMemberExpr can appear on the LHS of an assignment, the statement can now have side effects on both sides. To set an order of evaluation, the RHS of the assignment is evaluated before the LHS. Make sure this is also reflected in the resolution process. Resolve the RHS before the LHS.

    Design Note

    The evaluation order of assignments is also something that differs from language to language. Kotlin evaluates the LHS first, while in C++ it's always the RHS first. Finally, there is C, which doesn't specify an order; it can be either of the two.

    Still, explicitly specifying an evaluation order lets the developers know in what order they should expect side effects if there are any.

    The ResolvedMemberExpr on the LHS should also be part of the CFG, but only this node; a ResolvedDeclRefExpr on the LHS still shouldn't be added. The member expression can be applied to any primary expression, so make sure to add the ResolvedStructInstantiationExpr to the CFG too.

    fn foo(): void {
      S{ x: 2 }.x = 1;
      
      let n: number;
      n = 3;
    }
    
    foo:
    [2 (entry)]
      preds: 
      succs: 1 
    
    [1]
      preds: 2 
      succs: 0 
      ResolvedNumberLiteral: 1
      ResolvedNumberLiteral: 2
      ResolvedStructInstantiationExpr: S{ x: 2 }
      ResolvedMemberExpr: S{ x: 2 }.x
      ResolvedAssignment: S{ x: 2 }.x = 1
      ResolvedDeclStmt: n
      ResolvedNumberLiteral: 3
      ResolvedAssignment: n = 3
    
    [0 (exit)]
      preds: 1 
      succs:

    During variable initialization analysis, a struct is only considered initialized if all of its members have been initialized in the same statement. Memberwise initialization is not supported.

    fn structInitAnalysis(): void {
      let initialized = S { x: 1 };
      initialized.x;
    
      let lateInit: S;
      lateInit = S { x: 1 };
      lateInit.x;
    
      let uninit: S;
      uninit.x = 1;
      ^ error: 'uninit' is not initialized
    }

    The mutability of a field is defined by the mutability of the base declaration. Temporary structs are always mutable.

    fn structMutability(): void {
      S { x: 1 }.x = 1;
    
      var mutable = S { x: 1 };
      mutable.x = 2;
    
      let immutable = S { x: 1 };
      immutable.x = 1;
                  ^ error: 'immutable' cannot be mutated
    }
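
For the self-containment check mentioned earlier, one possible approach is a depth-first walk over the already resolved field types. The sketch below assumes the lookupDecl() helper introduced in this section and that the field types of every struct have been resolved by the time it runs; the function name and signature are illustrative.

// Returns true if 'needle' is reachable from 'current' through struct
// fields, i.e. the struct directly or transitively contains itself.
bool Sema::structContainsItself(const ResolvedStructDecl &current,
                                const ResolvedStructDecl &needle,
                                std::set<std::string> &visited) {
  // Don't explore the same struct twice; this also guarantees termination
  // on cycles that don't involve 'needle'.
  if (!visited.insert(current.identifier).second)
    return false;

  for (auto &&field : current.fields) {
    if (field->type.kind != Type::Kind::Struct)
      continue;

    if (field->type.name == needle.identifier)
      return true;

    ResolvedStructDecl *fieldStruct =
        lookupDecl<ResolvedStructDecl>(field->type.name).first;
    if (fieldStruct && structContainsItself(*fieldStruct, needle, visited))
      return true;
  }

  return false;
}

Calling structContainsItself(decl, decl, visited) with an empty visited set for every resolved struct reports exactly the structs that contain themselves, whether directly or through other structs.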

Structs in LLVM

At this point, structs can be parsed and validated by the semantic analyzer. This section introduces how to work with them in LLVM IR and shows how all the previous changes come together in a platform-specific executable.

  1. Before getting started with code generation make sure to update Codegen to accept a list of ResolvedDecl as the resolved tree instead of the current list of ResolvedFunctionDecl and remove the temporary adapter from the driver that was introduced in the previous section.

  2. The first step towards seeing the new syntactical changes in action is to represent each resolved struct as a struct type in LLVM IR.

    LLVM supports two kinds of structs, identified and literal ones. A literal struct comes in handy for languages that have built-in support for working with tuple types, or when implementing platform-specific optimizations. See the design note on that at the end of this section.

    fn returnTuple(): (number, number) { ... }

    The snippet above could be represented by the following piece of LLVM IR, where { double, double } is a literal struct of two double members.

    define { double, double } @returnTuple() { ... }

    Identified structs on the other hand are useful to represent a concrete pack of values, like user-defined structs or classes.

    struct Color { r: number, g: number, b: number }

    The Color struct in LLVM IR would be an identified struct of three double fields.

    %struct.Color = type { double, double, double }

    It's important to note that while literal structs are structurally unique, meaning that every { double, double } in the IR refers to the same type, identified structs are always different, so even if the IR contains %struct.a = type { double, double } and %struct.b = type { double, double }, %struct.a is not the same type as %struct.b.

    An identified struct can also be opaque, which tells LLVM that the struct exists, but it is not known what fields it has, like a forward declaration in C or C++.

    In this step every ResolvedStructDecl should be represented with an opaque identified llvm::StructType. Every time a struct type is referenced, it's this struct type that should be returned; don't generate a new llvm::StructType multiple times for the same struct.

    %struct.Color = type { double, double, double }
    
    define void @allocateColor() {
    entry:
      %c = alloca %struct.Color, align 8
      ret void
    }

    Similar to what happens in the semantic analyzer, first, a type for every struct declaration needs to be generated, because they can be referenced by other field or function declarations. Remember that the resolved tree is guaranteed to list every ResolvedStructDecl before any other declaration, so there is no need to implement any complex logic in this step.

  3. After every struct declaration has been given its own opaque type in the IR, each field can be represented with an existing type.

    Turn the currently opaque structs into complete struct declarations by generating and setting their bodies.

    %struct.S = type opaque
    %struct.S2 = type opaque

    The above structs with their bodies correctly set might look like the ones below.

    %struct.S = type { %struct.S2 }
    %struct.S2 = type { double, double }
  4. With struct types being supported, it's possible to generate code for struct instantiations. In LLVM the way to instantiate structs is to allocate space for them first and then initialize each member separately.

    Color { r: initR(), g: initG(), b: initB() }

    When generating code for a struct instantiation expression, the space that stores the struct must be allocated explicitly; only then can the fields of this explicitly allocated struct be initialized to the given values. Field initialization happens in declaration order, but the side effects must be visible in instantiation order.

    To make the previous specification easier to understand, take a look at this example that shows the steps of how a struct instantiation would be generated in LLVM IR.

    Declaration:
      struct Color { r: number, g: number, b: number }
    
    Instantiation:
      Color { g: initG(), r: initR(), b: initB() }
    
    Generated logic:
      let tmp: Color;
      
      let gInit = initG();
      let rInit = initR();
      let bInit = initB();
      
      tmp.r = rInit;
      tmp.g = gInit;
      tmp.b = bInit;

    Allocating a new variable is straightforward; the resulting IR will look like this.

    %Color.tmp = alloca %struct.Color, align 8

    Accessing a field of a struct instance, however, was not necessary until now. In LLVM IR, the field of a struct is accessed by its index rather than by its name, which is also the reason for storing the index of the field in a ResolvedFieldDecl.

    The field is accessed with the GetElementPointer instruction. This instruction can be confusing at first, so it's recommended to let the llvm::IRBuilder::CreateStructGEP() helper create it. This helper takes the type of the struct, a pointer to the beginning of the struct and the index of the element that needs to be accessed.

    Design Note

    To learn more about the GEP instruction, the LLVM documentation has a dedicated page, which explains it in full detail and why it was designed this way.

    Initializing tmp.r means getting the pointer to the r field and assigning the initial value to it.

    %0 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
    store double 1.000000e+00, double* %0, align 8

    The full LLVM IR for instantiating the Color struct at the top of this point can be found below.

    %Color.tmp = alloca %struct.Color, align 8
    %0 = call double @initR()
    %1 = call double @initG()
    %2 = call double @initB()
    %3 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
    store double %0, double* %3, align 8
    %4 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 1
    store double %1, double* %4, align 8
    %5 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 2
    store double %2, double* %5, align 8

    Note that the example struct deliberately has only number fields. Assigning values to struct fields works differently, which will be discussed later.

  5. In the previous point, the fields of a struct have already been accessed, so it's the perfect time for lowering a ResolvedMemberExpr.

    To read the field of a struct, a ResolvedDeclRefExpr that references the struct needs to be evaluated at some point. While numbers are loaded into a register when referenced, structs are not loaded into registers, as they might not fit, so it must be ensured that a ResolvedDeclRefExpr referencing a struct always results in a pointer to that struct without being followed by a load instruction.

    %Color.tmp = alloca %struct.Color, align 8
    ...
    %3 = ..., %struct.Color* %Color.tmp, ...

    A member access results in a GetElementPointer instruction on the result of the base expression; however, there is a difference in how it is followed up depending on whether the field is being written or read.

    When the value of a numeric field is being read, it has to be loaded from memory.

    %3 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
    %4 = load double, double* %3, align 8

    On the other hand, if the field is being assigned, the store instruction expects a pointer operand, so the field must not be loaded.

    %3 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
    store double 1.000000e+00, double* %3, align 8
  6. At this point creating a temporary struct and immediately reading one of its fields should work correctly, but it would also be convenient to store structs in variables, initialize fields that are themselves structs, or assign a different value to a struct variable.

    All of the above use cases are handled by the same logic. As mentioned in a previous step, the difference between structs and numbers is that structs are not loaded into registers, so the store instruction cannot be used to mutate their values. Any time a struct is assigned, the value of each field must be manually assigned to the same field of the destination struct.

    Statement:
      let s = Color { r: 1, g: 2, b: 3 };
    
    Generated logic:
      let tmp: Color;
      tmp.r = 1;
      tmp.g = 2;
      tmp.b = 3;
      
      let s: Color;
      s.r = tmp.r;
      s.g = tmp.g;
      s.b = tmp.b;

    Manually assigning every field, however, would result in duplicated work and large LLVM IR fragments. The goal is to make the content of the variable storing the struct on the LHS the same as the struct on the RHS. To achieve this easily, LLVM has a built-in memcpy intrinsic, which copies a specific amount of memory from one place to another. With that, an assignment between two struct variables can be simplified to only a few lines.

    %3 = bitcast %struct.Color* %s to i8*
    %4 = bitcast %struct.Color* %Color.tmp to i8*
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %3, i8* align 8 %4, i64 24, i1 false)

    Note that this is the version of memcpy created by the llvm::IRBuilder::CreateMemCpy() helper. To use this helper, the alignment and the size of the struct must be calculated first, which can be done easily with the help of the llvm::DataLayout utility; a small sketch of this calculation can be found at the end of this section.

    Design Note

    Intrinsics are built-in functions in LLVM that let the backends know what the desired effect is at that specific point of the IR. The backends then use this information to find the most efficient way to achieve it.

    More intrinsics can be found in the LLVM Language Reference Manual.

    The full LLVM IR for the variable declaration can be seen below.

    %s = alloca %struct.Color, align 8
    %Color.tmp = alloca %struct.Color, align 8
    
    %0 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 0
    store double 1.000000e+00, double* %0, align 8
    %1 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 1
    store double 2.000000e+00, double* %1, align 8
    %2 = getelementptr inbounds %struct.Color, %struct.Color* %Color.tmp, i32 0, i32 2
    store double 3.000000e+00, double* %2, align 8
    
    %3 = bitcast %struct.Color* %s to i8*
    %4 = bitcast %struct.Color* %Color.tmp to i8*
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %3, i8* align 8 %4, i64 24, i1 false)

    Design Note

    Upon taking a look at how Clang generates code for an equivalent snippet, the result will most likely be different. If the values of the initializers are known at compile time, Clang creates a constant struct and a memcpy intrinsic instead, so the backends can choose the most optimal way to initialize the struct in machine code.

    @__const.clangKnownInitializers.c = private unnamed_addr constant %struct.Color { i32 1, i32 2, i32 3 }, align 4
    
    define void @clangKnownInitializers() {
    entry:
      %c = alloca %struct.Color, align 4
      call void @llvm.memcpy.p0.p0.i64(ptr align 4 %c, ptr align 4 @__const.clangKnownInitializers.c, i64 12, i1 false)
      ret void
    }

    Another optimization Clang performs is copy-elision. If copying the struct has no side effect, there is no need to create a temporary first and emit a memcpy. The initializer values can be stored directly in the space of the local variable.

    define void @copyElision(i32 %0) {
      %2 = alloca i32, align 4
      %3 = alloca %struct.Color, align 4
      store i32 %0, i32* %2, align 4
    
      %4 = getelementptr inbounds %struct.Color, %struct.Color* %3, i32 0, i32 0
      %5 = load i32, i32* %2, align 4
      store i32 %5, i32* %4, align 4
      %6 = getelementptr inbounds %struct.Color, %struct.Color* %3, i32 0, i32 1
      %7 = load i32, i32* %2, align 4
      store i32 %7, i32* %6, align 4
      %8 = getelementptr inbounds %struct.Color, %struct.Color* %3, i32 0, i32 2
      %9 = load i32, i32* %2, align 4
      store i32 %9, i32* %8, align 4
    
      ret void
    }
  7. By this point, structs should be fully functional within the body of a function. The last thing to support is passing them to and returning them from other functions.

    Structs in your language are passed to and returned from functions by value, but as mentioned previously, in LLVM IR only a pointer to the struct can be moved around; it's not possible to load structs and pass the loaded value. Passing by value, however, only means that no modification to the parameter is visible in the caller, and this can be achieved in LLVM IR easily.

    For mutable parameters, the trick is to create a copy of the struct in the caller and pass a pointer to that copy.

    %c = alloca %struct.Color, align 8
    %struct.arg.tmp = alloca %struct.Color, align 8
    ...
    %5 = bitcast %struct.Color* %struct.arg.tmp to i8*
    %6 = bitcast %struct.Color* %c to i8*
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %5, i8* align 8 %6, i64 24, i1 false)
    
    call void @mutableStructParam(%struct.Color* byval(%struct.Color) %struct.arg.tmp)

    Notice how the byval attribute is added to the function. It is a way to let LLVM know that the particular parameter simulates pass-by-value semantics, and when it is used, this attribute must be present at both the function declaration and the call site. Note that byval also requires the copy to be created in the caller.

    define void @mutableStructParam(%struct.Color* byval(%struct.Color) %c) {
      ...
    }
    
    call void @mutableStructParam(%struct.Color* byval(%struct.Color) %struct.arg.tmp)

    When the parameter is immutable, there is no need to copy the struct, as there are no modifications that can be visible in the caller. To indicate that the struct will never be modified inside the function through the passed pointer, the readonly attribute can be used. This attribute is not compulsory and can be omitted, but as with every other attribute, it must also be present at both the function declaration and the call site.

    %c = alloca %struct.Color, align 8
    ...
    call void @immutableStructParam(%struct.Color* readonly %c)

    Returning the struct is also done through a dedicated parameter. If the function returns a struct by value, a temporary variable that stores the returned value is allocated in the caller, and the pointer to this variable is passed as the first argument of the function, marked by the compulsory sret attribute. The return statement then copies the struct to the space allocated in the caller, and the function is treated as if it were a void function.

    define void @caller() {
    entry:
      %struct.ret.tmp = alloca %struct.Color, align 8
      call void @returnStruct(%struct.Color* sret(%struct.Color) %struct.ret.tmp)
      ret void
    }
    
    define void @returnStruct(%struct.Color* sret(%struct.Color) %ret) {
    entry:
      %Color.tmp = alloca %struct.Color, align 8
      ...
      %3 = bitcast %struct.Color* %ret to i8*
      %4 = bitcast %struct.Color* %Color.tmp to i8*
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %3, i8* align 8 %4, i64 24, i1 false)
    
      br label %return
    
    return: 
      ret void
    }

    Design Note

    Sadly, using sret and byval arguments is not an all-round solution. Different platform-specific ABIs might require different ways of passing and returning struct values.

    struct Coerced {
      int x;
      int y;
    };
    
    Coerced passCoerced(Coerced c) {
      return c;
    }

    In the previous example, on the x86-64 platform, Coerced has the size of 64 bits and it fits into a single register, so Clang lowers passCoerced as a function that takes and returns an i64 value to conform to the ABI.

    define i64 @passCoerced(i64 %c.coerce) {
    entry:
      %retval = alloca %struct.Coerced, align 4
    
      %c = alloca %struct.Coerced, align 4
      store i64 %c.coerce, ptr %c, align 4
      
      call void @llvm.memcpy.p0.p0.i64(ptr align 4 %retval, ptr align 4 %c, i64 8, i1 false)
      %0 = load i64, ptr %retval, align 4
      ret i64 %0
    }

    Larger structs might be passed and returned in two separate registers, where the latter is expressed by returning a literal struct.

    %struct.LargerCoerced = type { i32, i32, i32, i32 }
    
    define { i64, i64 } @passLargerCoerced(i64 %l.coerce0, i64 %l.coerce1) {
    entry:
      %retval = alloca %struct.LargerCoerced, align 4
      ...
      %2 = load { i64, i64 }, ptr %retval, align 4
      ret { i64, i64 } %2
    }

    It's the compiler's responsibility to ensure that the generated code conforms to the ABI of the target platform and thus maintains compatibility.

    Implementing full support for a specific ABI could fill its own dedicated guide, so it's not in the scope of this exercise. It would only become necessary if making your language interoperable with C, C++, or any other language that compiles to native code were also a requirement. The only external function used by your language that was compiled by a different compiler is printf, but it neither receives nor returns a struct, so compatibility issues will not arise.

    For more details about the topic, see some discussions on LLVM mailing lists, like the one about struct parameters being converted to other types or the one on passing structures in registers.
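
Finally, here is the struct copying helper hinted at in the step about memcpy. It is only a sketch: the helper name is made up, module is assumed to be a pointer to the llvm::Module being generated and builder the active llvm::IRBuilder<>; the llvm::DataLayout and CreateMemCpy() calls are the relevant part.

// Copies the struct pointed to by 'src' over the struct pointed to by
// 'dst'; 'type' is the llvm::StructType of the struct being copied.
void Codegen::copyStruct(llvm::Value *dst, llvm::Value *src,
                         llvm::StructType *type) {
  const llvm::DataLayout &dl = module->getDataLayout();
  const llvm::StructLayout *layout = dl.getStructLayout(type);

  // The size and alignment computed by the DataLayout drive the memcpy.
  builder.CreateMemCpy(dst, layout->getAlignment(), src, layout->getAlignment(),
                       layout->getSizeInBytes());
}

The same helper can back the initialization of struct variables, assignments between structs, copying byval arguments in the caller, and writing the returned value into the sret slot.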

The End

Congratulations on reaching the end of this exercise; the compiler now fully supports structs in your language!

Thank you for taking the time to read through this guide on how to write a compiler with LLVM! Hopefully, it was time well spent and an interesting experience to explore the different fields of language design, parsing, static code analysis, and code generation throughout the previous few pages.