Saturday, July 14, 2012

Assembly Code Generation - part#4

This is the 4th post in the series of assembly code generation.

Part - I
Part - II
Part - III

In this post, I will try to describe how a method definition and dispatch might be translated to assembly language.

Methods also define a local "scope". Any global variables can be overridden (shadowed) in this scope and new ones (called local variables) can be defined here. Typically, whenever a method call is made, we put a specific structure called a "stack frame" (or "activation record") in memory. In the activation record, method parameters, local variables, the return address etc. are kept at known positions. Typically, parameter and local variable values are kept at fixed offsets from a known position in the activation record called the "frame pointer".

One of the main tasks while emitting code for method definition or dispatch is to design the structure of your activation record. Typically, method dispatch code fills part of the activation record and the remaining part is filled by the code generated for the method definition.


We will use register \$fp to hold the frame pointer and \$ra to hold the return address, whenever required.

At this point I will introduce one more invariant:

1. Inside a method, register \$s0 always holds the address of the self object.


So, here is the structure of my activation record.

-------------------------
argument # n
-------------------------
argument # n-1
-------------------------
..
..
argument # 1
------------------------- <-- I keep \$fp pointing at this location
fixed space reserved
..
..
for local variables
-------------------------
return-address
-------------------------
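
To make these offsets concrete: with this frame layout and 4-byte words, argument #1 sits one word above \$fp and the locals occupy the reserved space at and below \$fp. The exact offsets below are my reading of this particular layout, shown only for illustration:

  #arguments are above the frame pointer
  lw    \$t1    4(\$fp)     #load argument #1
  lw    \$t1    8(\$fp)     #load argument #2

  #locals live in the reserved space at
  #and below the frame pointer
  lw    \$t1    0(\$fp)     #load local #1
  sw    \$a0   -4(\$fp)     #store into local #2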

One more thing we need to understand is the structure of something called a "dispatch table". Remember, in part-III of this series, I described the structure of the prototype objects, which keep a pointer, the dispatch table pointer, at offset 8.
The dispatch table is nothing but a list of all the methods visible in the class of the prototype object (the ones defined in the class itself plus the ones defined in its parent, the parent's parent, and so on).

Let us look at one dispatch table...

class A inherits Object {

      foo(x:Int, y:Int): Int {
                 ..
      };

      --overriding copy
      copy():SELF_TYPE {
                  ..
      }
     
}

The generated dispatch table looks like..

A_dispTab:
  .word  Object.abort
  .word  Object.type_name
  .word  A.copy
  .word  A.foo

Notice the ordering: methods are listed from the topmost superclass down to the class itself, in the order of their definition. A.copy appears before A.foo because copy() is defined in the base class Object, before foo(..) is defined in A. The compiler will generate dispatch tables like this for all the classes defined in the program. And labels like A.foo are emitted at the place where we generate the code for the definition of method foo in A.
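
This ordering is what keeps a method's slot stable under overriding. As an illustration (class B below is my own example, not part of the program above), if a class B inherits A and overrides foo, its dispatch table would be:

B_dispTab:
  .word  Object.abort
  .word  Object.type_name
  .word  A.copy
  .word  B.foo    #overrides foo: same slot as A.foo in A_dispTab

Since foo occupies the same offset in both tables, code compiled against static type A dispatches correctly even when the dynamic type is B.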


Let us see how a method dispatch might translate to assembly...

Original code..
x.foo(param1,param2);

Assembly code generated...

#push all the arguments onto the stack..

  #generate code to evaluate param2
  #that should put param2 result in \$a0
  cgen(param2)
  sw    \$a0     0(\$sp)
  addiu \$sp     \$sp     -4

  #generate code to evaluate param1
  #that should put param1 result in \$a0
  cgen(param1)
  sw    \$a0     0(\$sp)
  addiu \$sp     \$sp     -4

  #emit code to evaluate x from x.foo(..) that
  #should put x evaluation result object pointer
  #in \$a0
  cgen(x)
 
  #after pushing the arguments in memory
  #locate the label for method foo and jump
  #there

  #the object returned by x has its dispatch table
  #pointer at offset 8; in the dispatch table, the
  #label A.foo sits at a fixed offset known at
  #compile time (offset 12 in the table above)
  lw    \$t1    8(\$a0)     #load dispatch table pointer
  lw    \$t1    12(\$t1)    #load address of foo's code
  jalr  \$t1               #jump there; return address goes in \$ra

Let us now see how the method definition for A.foo translates to assembly..

#First the label is generated
A.foo:
  #caller has pushed the arguments on stack
  #store location of current stack pointer
  #in the frame pointer
  move    \$fp   \$sp

  #compute at compile time how much space
  #you'll need for local variables or
  #temporaries and reserve it (say k bytes)
  addiu   \$sp   \$sp   -k

  #save the return address in its slot in
  #the activation record
  sw      \$ra   0(\$sp)
  addiu   \$sp   \$sp   -4

  #generate code to evaluate method body
  cgen(body)

  #in the method body, arguments and locals
  #can be looked up at compile-time-known
  #offsets from the frame pointer

  #reload the return address from its slot in the
  #stack frame into \$ra, pop the whole frame
  #(including the arguments) and jump back
  #(a full implementation would also save and
  #restore the caller's \$fp; omitted here)
  lw      \$ra   4(\$sp)
  addiu   \$sp   \$sp   k+4+4*n   #k bytes of locals, \$ra, n arguments
  jr      \$ra



Finally, for anyone reading these posts, I highly recommend taking the online compiler class from Coursera. And, to really learn it, you *HAVE TO* finish all the programming assignments.

Saturday, July 7, 2012

Assembly Code Generation - part#3

This is the 3rd post in the series of assembly code generation.

Part - I
Part - II

In this post, I will try to describe how an object oriented language might implement the "new" operator for creating new instances of some class.

I will describe here how the COOL compiler that I wrote during the compiler course worked.

Some trivia about inheritance in the COOL language: a class Foo can inherit from another class Bar; all the attributes present in Bar are also visible in Foo and cannot be redefined. However, methods defined in Bar, also visible in Foo, can be overridden in Foo.

New object creation works by cloning a "prototype object" for that type. The code generated by the COOL compiler lays out one prototype object for each class defined in the program being compiled (which may contain multiple class definitions), and at runtime new objects of a class are created by cloning the prototype object for that class.

Here is how a COOL object is laid out in the memory..

----------------------- offset = 0
class-tag

----------------------- offset = 4
total-object-size

----------------------- offset = 8
ptr-to-dispatch-table

----------------------- offset = 12
inherited-attr-1

-----------------------
inherited-attr-2

-----------------------
..
-----------------------       
attribute-1

-----------------------
attribute-2

-----------------------
..

-----------------------


The first 4 bytes contain a unique integer tag assigned to this class by the compiler. This identifies the class of the object and is used in various places, for example while checking object equality or to see whether two objects are of the same type.
The next 4 bytes contain the total size of this object (in words).
The next 4 bytes contain a pointer to the dispatch table (a table of the methods defined in this class and its parents; we will talk more about this in a separate post).

Remaining bytes contain all the attributes, including inherited ones. Let's say the class hierarchy is..

B inherits A
C inherits B

Basically, A is at the top of the hierarchy and C is at the bottom. Then C's prototype object will first contain all the attributes of A, then those of B, and then those defined in C itself.

Here are some of the things you should notice about this object layout.

- Object is laid out in contiguous memory.

- The offset of an attribute remains the same in a class and all of its subclasses because of the way we order attributes in the object.
This kind of layout lets a subclass extend the layout of its base class rather than changing it fundamentally. Because of this, it is simple to retrieve the value of an attribute from a fixed offset in the object without knowing the actual concrete dynamic type of the object at runtime.

Let us see the prototype objects generated for a simple hierarchy in COOL.

class A inherits Object {
      x:Int;
};

class B inherits A {
      y:Bool;
};

Code generated will look something like following...

Object_protObj:     #label to refer to this object    
    .word    0   #write word 0, class tag of Object class, to memory
    .word    3   #write size of this object in memory words
    .word    Object_dispTab  #write address of dispatch table of Object type in 1 word of memory
A_protObj:
    .word    1
    .word    4
    .word    A_dispTab
    .word    int_const0      #default value of x, pointer to integer object 0
B_protObj:
    .word    2
    .word    5
    .word    B_dispTab
    .word    int_const0      #default value of x, pointer to integer object 0
    .word   bool_const0      #default value of y, pointer to boolean object false
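
Given these layouts, the attribute x sits at offset 12 in both A and B objects. So the compiler can emit the same load for x regardless of whether the object is an A or a B at runtime. Assuming a register, say \$s0, holds the pointer to the object (an invariant declared later in this post), the generated code might look like:

  #load the pointer to the Int object stored
  #in attribute x of the object
  lw    \$t1    12(\$s0)

  #store a new value into attribute x
  sw    \$a0    12(\$s0)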



Now we are ready to understand the code generated for the expression new A.

cgen(new A)

  #load address of label A_protObj in register \$a0
  la \$a0 A_protObj

  #emit a call to a routine that copies the
  #object pointed to by \$a0 and puts the address
  #of the newly cloned object back in \$a0 (in my
  #compiler this was the runtime's Object.copy)
  jal Object.copy

And you're done: a new object of type A is created and a pointer to it is placed in \$a0  :).

BONUS Reading:
COOL supports SELF_TYPE too.
Basically, when we execute x.someMethod(..), inside the definition of someMethod there is a visible variable (called "self" in COOL and "this" in Java) that refers to the object referenced by x at runtime. new SELF_TYPE is supposed to return an object of the type of the one referenced by "self".

At this point, I should declare one more invariant. Code generated for a method call/dispatch always ensures that register \$s0 has the pointer to the "self" object.

To support new SELF_TYPE, the COOL compiler always generates a label called class_objTab that has pointers to all the prototype objects in order of their class tags. So, for our example mentioned above, that label will look something like the following..

class_objTab:   #the class_objTab label
    .word    Object_protObj    #first pointer to Object_protObj, as its class tag is 0
    .word    A_protObj         #next is A as its class tag is 1
    .word    B_protObj         #next is B as its class tag is 2

Now we are ready to see the code generated for new SELF_TYPE.

cgen(new SELF_TYPE)

         #load the address of label class_objTab in
         #temporary register \$t1
         la     \$t1     class_objTab

         #load the integer class-tag of self object
         #referenced by \$s0, class-tag is stored at
         #offset 0
         #class-tag is stored in temp register \$t2
         lw     \$t2     0(\$s0)

         #class_objTab contains the prototype object pointers
         #indexed by class-tag, one word each, so we scale the
         #class-tag in \$t2 by the word size (shift left by 2,
         #i.e. multiply by 4) and add it to \$t1 to get the
         #address of the table entry for self object's type
         sll    \$t2     \$t2     2
         addu   \$t1     \$t1     \$t2

         #at this point \$t1 points to the table entry holding
         #the prototype object pointer of self object's type;
         #load that pointer and call the routine that copies
         #the object in \$a0, leaving the pointer to the clone
         #in \$a0
         lw     \$a0     0(\$t1)
         jal    Object.copy

Assembly Code Generation - part#2

This is the 2nd post in the series of assembly code generation.

Part - I

In this post, I am writing the possible assembly code generated from a typical if expression.

cgen(if e1 == e2 then e3 else e4)

  #generate code for e1 that will put
  #e1 evaluation in \$a0
  cgen(e1)
  sw \$a0 0(\$sp) #push the result to stack
  addiu \$sp \$sp -4 #decrease \$sp as we pushed 4 bytes

  #generate code for e2 evaluation, this
  #will put the result of e2 evaluation in \$a0
  cgen(e2)

  #load the result of e1 evaluation
  #from stack in temp register \$t1
  lw \$t1 4(\$sp)
  addiu \$sp \$sp 4

  #check if \$a0 and \$t1 are equal, jump to
  #label true_branch if they are equal or else
  #continue
  beq \$a0 \$t1 true_branch

  #we reach this point only if the values of
  #e1 and e2 were different; here we
  #generate code for the else case
  cgen(e4)

  #after the code for the else branch, unconditionally
  #jump to the label endif which is generated at
  #the end
  b endif

  #true_branch label, we come here if e1 and e2 evaluated
  #to same value
  true_branch:
  cgen(e3)

  #the label endif
  endif:

  #code beyond the if expression follows...

Friday, July 6, 2012

Assembly Code Generation - part#1

I recently finished the Compiler course from Coursera. In this series of posts I am taking some notes explaining how high-level programming language expressions might be translated to assembly code. These are not accurate descriptions; they are more to convey the idea.

In general, on any machine, you have a few registers and memory allocated to your process. Initially, execution of a program is under the control of the OS. When the process is started, the OS allocates memory to the process and jumps to the entry point (e.g. "main"). Memory (from here on, memory means the memory allocated to your process) is typically divided into two sections, namely the code and data areas. Your instructions are stored in the code area while the data is stored in the data area. It is the responsibility of the compiler to generate the code and orchestrate the data area.

Usually, assembly code is generated by recursively traversing the AST. Assume every node in the AST implements a function called cgen() that is responsible for generating assembly code for the expression that node represents.

In this post I will explain how a "+" expression might be implemented.

In general, when we generate code, we need to have some invariants so that the code generated for various expressions doesn't modify memory or registers in an unexpected way.

For this post, the following invariants are sufficient.

1. Any expression's generated assembly code puts the pointer to the result of the expression in the accumulator register, \$a0.
2. Register \$sp (stack pointer) always contains the address of the next free word where data can be written.
3. Any expression's generated code preserves the value of \$sp; that is, at the end of the expression's evaluation code, \$sp gets the same value back that it had when that expression's code started executing.


cgen(e1 + e2):

  #generate code to evaluate e1, this should
  #put the result of e1 evaluation in \$a0
  cgen(e1)

  #now that \$a0 has value of e1 evaluation, let us
  #store it on the stack
  sw \$a0 0(\$sp)
  addiu \$sp \$sp -4 #decreasing stack pointer as we "pushed" 4 bytes of data

  #now generate code to evaluate e2, that should
  #put the result of e2 evaluation in \$a0
  cgen(e2)

  #load the value of e1 evaluation in a temporary register \$t1
  lw \$t1 4(\$sp)
  addiu \$sp \$sp 4  #increasing stack pointer as we "popped" 4 bytes of data
  add \$a0 \$a0 \$t1  #add the values in \$a0 and \$t1, put the result in \$a0

  #finally at this point \$a0 holds the sum of results of e1 and e2 as required

Thursday, July 5, 2012

Finished Coursera Compiler course

Today I gave the final exam for compiler course and managed to get a decent score.

Course website says, "This course will discuss the major ideas used today in the implementation of programming language compilers, including lexical analysis, parsing, syntax-directed translation, abstract syntax trees, types and type checking, intermediate languages, dataflow analysis, program optimization, code generation, and runtime systems". And, the course was really able to cover the said topics in decent detail.

But the real learning of the course was not in the videos, quizzes or exams. It was in the optional project, where you were expected to code the 4 major phases of a compiler, namely lexical analysis, parsing, semantic analysis and assembly code generation. So, essentially, I wrote a fully functional compiler for the COOL language, which has many features found in Java and C++.

Programming assignment#1 was to write the lexical analyzer for COOL, which takes a COOL program and produces the tokens. We were expected to use a lexer generator, JLex, where you give your token specifications as regexps along with Java code that creates the required token on a regex match. Though the lectures taught how regexes can actually be implemented, by converting them to NFAs, which can be simulated directly or converted to DFAs for a table-driven implementation, the use of JLex did not let me code that. Because of time pressure I could only do the assignment with JLex. I would have liked it even more if the assignment were modeled so that you don't use JLex but code the regex implementation by hand.

Programming assignment#2 was to write the parser that generates the AST from the tokens returned by the lexer from the previous assignment. Again, we used a parser generator called Java CUP, which generates the parser given the CFG grammar specification with associativity and precedence rules. Though you learn the theory of parsing (from lecture videos and readings), the assignment does not really need you to hand-code one. I would have liked it even more if I had hand-coded the parser, as this is an academic exercise (however, I have earlier written a recursive-descent parser).

The fun part begins now..

Programming assignment#3 is when you have no tools. You have an AST and need to hand-code all the semantic rules of the language. Most of the code you write here does static type checking. You traverse the whole AST (I used the visitor pattern) and, at every node (the different kinds of expressions supported by the language, for example method dispatch, if expression, case expression, assignment etc.), you ensure that the types declared and inferred from the expression are in agreement with the type rules defined in the language. COOL supports SELF_TYPE and that adds to the type checker complexity.

Programming assignment#4 gets even better. After you're done with the compile-time semantic checks, you generate MIPS assembly code for the given COOL program. This is the place where you become one with the machine. Here you have to carefully convert all the high-level expressions to assembly code. You never realize while coding in a high-level language that, when it comes to assembly, the processor mostly just blindly executes the instruction sequence laid out in memory and occasionally jumps to other memory addresses. It is an eye-opener to learn the tricks of translating high-level language constructs to simple load, move, jump and branch instructions with a few registers and memory.

While doing these assignments, I had to spend a lot of time with paper and pen to think hard and form a solution.. for example, how and what should go in a stack frame when a method is called, how you locate its correct implementation and jump back to the return address after completion.
Another eye-opener was that, once the assembly code is generated, there are no "identifiers"; all you have is memory addresses, in the form of labels and offsets into your stack frame :).

Another side thing I noticed is that it is an essential skill to draw yourself away from the laptop, think in isolation and be able to write code on paper in a not-really-compilable version of your choice of implementation language.

Besides keeping me super busy outside the day job, this compiler project has enlightened me. And I am really, really thankful to Coursera. Having written this compiler for a statically typed language (and an interpreter for a Lisp language, Scheme), after so many years of coding... now I feel like I can "see through" the languages. And the feeling is just amazing.