Compilers (2023)

Welcome to CS327: Compilers. In this course, we will design and implement a compiler for a simple imperative language. We will aim for the following language features.

We will aim to implement a tree-walking interpreter followed by designing a bytecode along with a source to bytecode compiler that implements some optimizations such as function inlining, loop unrolling, constant folding, and dead code elimination.

My code for this course will be hosted at the github repository for CS327: Compilers (2023).

Current status #

Complete implementation for a PL with conditionals, loops, nested functions, and typechecking with both tree-walking and bytecode generation.

Project #

This section lists the goals of the project. Everyone should try to complete the minimal (M) goals. The goals marked intermediate (I) and advanced (A) can be taken up once the minimal goals are achieved.

A showcase of various programming languages designed by students.

Meetings #

21-04-2023 #

Resources for mathematical foundations. Feedback.

19-04-2023 #

Static links for accessing variables in outer functions.

12-04-2023 #

Functions with parameters. Calling convention. Function preamble.

05-04-2023 #

Parameter-less functions. Call frame management. CALL and RETURN instructions.

31-03-2023 #

Frames and local variables.

29-03-2023 #

Code walk-through for bytecode generation. See ‘unstable’ branch in the git repo. Expressions without variables are done.

22-03-2023 #

Exam II

17-03-2023 #

Bytecode generation for conditionals and loops.

15-03-2023 #

Stack VM, Bytecode generation for arithmetic expressions.

03-03-2023 #

Importance of formalizing passes. Parser generators (GNU Bison, Python parsimonious). Programming language shootout game. Code golf.

01-03-2023 #

Arrays vs Lists performance. Implementing arrays. Referential transparency. Copy-on-Write.

First-class functions and closures. Using heap to implement closures.

24-02-2023 #

Implementing variable resolution for proper behaviour.

22-02-2023 #

Pitfalls of Python style extreme late binding of symbols.

15-02-2023 #

Implementing function definitions and function calls.

13-02-2023 #

Exam I sheets loaned. Discussed project evaluation.

10-02-2023 #

We discussed name visibility of functions in various languages. Late binding in Python, forward declarations in C, and explicit syntactic form for mutual recursion in OCaml.

04-02-2023 #

Exam I.

03-02-2023 #

We implemented an imperative environment that handles scoping of mutable variables.

01-02-2023 #

We discussed how to write environments for handling mutable variables correctly.

27-01-2023 #

We discussed operator precedence, associativity, and how to write a grammar that captures these concepts. We then saw how one can write a parser for such grammars.

25-01-2023 #

We discussed parsing. Recursive descent parsing peeks at the first token and determines how to parse the string based on that. We discussed ambiguity in expression parsing and how recursive descent fails for arithmetic expressions in infix form.

20-01-2023 #

We discussed lexing where we produce a stream of tokens from a stream of characters (also known as source code).

18-01-2023 #

We discussed static analysis – analysis of programs without evaluating them. A part of it is semantic analysis which involves type checking. We discussed how to perform typechecking by tree-walking. We discussed language design choices such as expression+statements or expression-only languages and statically-typed vs dynamically-typed languages.

16-01-2023 #

We discussed project submission workflow. All individual contributions SHOULD be associated with their github (or equivalent) id. As a rough guideline for project goals, try to implement solutions for the first five questions from Project Euler in the language you are implementing.

13-01-2023 #

We implemented let expressions in our interpreter by keeping track of environments that map names into values.

11-01-2023 #

We learned lexical scoping and let bindings that introduce lexically scoped bindings.

09-01-2023 #

We wrote code for an arithmetic expression tree evaluator. We discussed internal representation of programs – Abstract Syntax Trees (ASTs). We then defined an AST suitable for a calculator in Python and wrote a tree-walk eval() for it.

04-01-2023 #

Attendance is not mandatory. Grading will be divided into theory exams (50) and team project (50). There will be three assessments for both and the split will be (15 + 15 + 20). Each team may have four to five members including a team leader. You can choose any programming language. The history of the project should be tracked via git and commits should identify contribution. Before assessment, each team has to submit information regarding work done by each member. This will be compared against git commits and performance in TA evaluations. Theory exams will mostly be subjective and will test your understanding of concepts.

You can learn about using git via this github tutorial. Each team member should have their own fork of their project where they make their commits. Individual changes can be pushed to a main repository that will aggregate all contributions into the group’s output.

Python dataclasses, Python match statement, and pytest will be extensively used by me during lectures. For your project, feel free to use your preferred tools.

References #

Showcase #

This section showcases various programming languages designed by students. The programs are solutions to Project Euler problem 14.

dino #

assign start = 3;
assign count = 1;
assign ans = 1;
assign max_count = 1;
assign dp = {1:1};

func collatz(n)
    ?echo(n);
    if(dp.in_dict(n))
        return dp[n];
    end
    if(n%2 == 0)
        dp[n] = collatz(n//2) + 1;
    end
    else
        dp[n] = collatz(3*n + 1) + 1;
    end
    return dp[n];
end


loop(start < 100)
    ? echo(start);
    count = collatz(start);
    if(count > max_count)
        max_count = count;
        ans = start;
    end
    start = start + 2;
end

echo(max_count);
echo(ans);

gossip #

declare n = 1;
declare k = 100000;
declare len = 1;
declare i = 1;
declare max = 0;
declare num = -1;

declare d = dictn(0);

while i <= k do {
    assign n = i;
    assign len = 1;

    while n > 1 do {

        if d[n]!=0 then {
            assign len = len + d[n] -1;
            assign n = 1;
        }
        else {

            if n % 2 == 0 then {
                assign n = n / 2;
                assign len = len + 1;}
            else {
                assign n = 3 * n + 1;
                assign n = n/2;
                assign len = len + 2;
            };

        };
    };

    assign d[i] = len;

    if len > max then {
        assign max = len;
        assign num = i;}

    assign i = i + 1;
};

print(k);
print(num);
print(max);

notpy #

var i = 2;
var a = [1000000];
var max=0;
var maxnum=0;

while (i<1000000){
    var start = i;
    var cnt =1;

    while (start !=1){
        if (start < 1000000){
            if (a[start] != 0){
                cnt = cnt + a[start]-1;
                start = 1;
            }
            else{
                if (start %2 == 0){
                    start = start/2;
                }
                else{
                    start = 3*start + 1;
                }
                cnt = cnt+1;
            }
        }

        else{
            if (start%2==0){
                start=start/2;
            }
            else{
                start=3*start+1;
            }
            cnt=cnt+1;
        }

    }
    if(cnt>max){max=cnt; maxnum=i;}
    a[i] = cnt;
    print i, cnt;
    i=i+1;
}
print "ans", max, maxnum;

thorium #

let ans=0 in
let start=0 in
let num = 1 in
{
    while num<=100000 do
    {
            let l=1 in
            let n= num in
            {
                while n>1 do
                    {
                        if n%2==0 then {
                            n=n//2;
                        };
                        else{
                            n=3*n+1;
                        }; end
                        l=l+1;

                    };

                    if l>ans then {
                        ans= l;
                        start=num;
                    };
                    else{
                       ans=ans;
                    }; end


            };

            num=num+ 1;

    };

    print ans;
    print start;
}
end

Zebra #

var int l = 1000000;
var array(int) a=array(int){};
var int mem = 1000000;

for(var int i=0;i<mem;i=i+1){
    append(0,a);
}

var int maxChainLength = 0;
var int result = 0;
var int n = nil;
var int chainLength = nil;
var int flag = 0;

for (var int i = 1; i < l; i=i+1) {
    n = i;
    chainLength = 1;
    flag = 0;

    while (n > 1)
    {
        if (n<mem){
            if (a[n]!=0){
                chainLength = chainLength + a[n] -1;
                flag = 1;
                n = 0;
            }
        }

        if (flag ==0){

            if (n % 2 == 0) {
                n = n//2;

            }
            else {
                n = 3 * n + 1;
            }
            chainLength = chainLength + 1;
        }
    }

    if (chainLength > maxChainLength) {
        maxChainLength = chainLength;
        result = i;
    }
    if (i<mem) {a[i]=chainLength;}
}


zout("result", result);
zout("len", maxChainLength);

Zoro #

var mil <- 1000;
var lst <- [1];
var i <- 2;
var maxl <- 1;
var maxi <- 1;
var temp <- 0;
var j <- 0;

while i <= mil
do
    if i%2 == 0
    then
        temp <- 1 + lst.at (i//2 - 1);
        lst.push temp;
    else
        j <- 3*i + 1;
        temp <- 1;
        while j >= i
        do
            if j % 2 == 0
            then
                j <- j // 2;
            else
                j <- 3*j + 1;
            endif;
            temp <- temp + 1;
        endwhile;
        temp <- temp + lst.at (j - 1);
        lst.push temp;
    endif;
    if temp > maxl
    then
        maxi <- i;
        maxl <- temp;
    endif;
    i <- i+1;
endwhile;
print mil;
print maxi;
print maxl;