Project, Part 1

Writing Sam-Code

CS 212 - Spring 2004

Due: Monday, Feb 16, 11:00PM

Objectives

Learn about assembly language by writing SaM (Stack Machine) code.
Become familiar with SaM code. In later assignments, you will write programs to generate such code.
Begin the process of understanding how a compiler works.

Sam-Code

Like most programs, sam-code programs are a sequence of instructions. These instructions are described in the SaM Summary document. This assignment requires that you use just a subset of the instructions. You will be writing instructions that might occur as part of a single function, such as instructions for arithmetic expressions or variable assignments.

To calculate 10+20 using SaM, you need to think of the expression in postfix notation as 10 20 +. Postfix notation places an expression’s operator at the end of an expression. Knowing that + operates on two values means that you should store the first two values somewhere, such as your brain, and then add them after encountering the + operator. In sam-code, you would write this arithmetic operation as:

PUSHIMM 10 PUSHIMM 20 ADD STOP

If you step through this example in the simulator, you will see that when the simulator executes ADD, both 10 and 20 pop from the stack and SaM pushes the result of 30 into the cell with address 0. Since the 30 is the only remaining value in the stack, the simulator will report 30 as the answer. Note that SP will contain 1, the address one higher than the top of the stack. Isn’t that neat?

Now, try a program that performs 1 + (2*3):

PUSHIMM 1 PUSHIMM 2 PUSHIMM 3 TIMES ADD STOP

In this case, sam-code advantageously helps to avoid worrying about operator precedence because the operators appear after the operands.

Compilers

For the CS212 project this semester, you will be writing a program called a compiler, which translates one computer language into another. Your compiler will convert a Java-like language into sam-code. For example, suppose you were trying to write a compiler to convert the following Java snippet into sam-code:

int x , y; x = 10; y = 20; return ( x + y ); // returns 30

The next sections demonstrate how to translate such high-level languages into sam-code.

The Symbol Table: Writing sam-code requires a bit of work because of the variables. You must reserve space for each variable in the stack memory because sam-code does not have explicit variables of its own. To help keep track of the source program’s variables and their locations, you should manually draw a symbol table, as shown in the table below. A symbol table is a collection of the program’s variables plus an additional variable, rv (return variable), that stores the program’s result. In this case, rv would represent the result of x+y.

Variable	Address
rv	0
x	1
y	2

As your program calls functions and creates objects, parts of the stack are filled with data. For now, we will assume that we are converting code from a main method, so we will not change the FBR’s (Frame Base Register's) initial value of zero. When we implement functions in Part 3 of the project, we will adjust the FBR. We need not worry about the HP (Heap Pointer) until Part 4 of the project.

Allocating Variables: You need to allocate stack space for the source program’s variables. By leaving the first cell open for the rv, there is a place for SaM to store and return the value. The main variables are stacked on top of the first cell in the order in which they are declared. The address of a variable is its cell address on the stack. To encode the declaration statements of the source language (int x and int y), you need to move SP “up” the stack. You have two choices:

ADDSP c: This makes c addresses available on the stack. Because these cells indicate allocated memory, they are labeled as type M. SaM already initializes each cell to 0 upon starting.
PUSHIMM value: This instruction will allocate one memory location and initialize it with value. Typically, you should use 0 for value.

For our example, we use ADDSP. Since SP starts at 0, you can move the stack pointer up three cells with ADDSP 3, which leaves space for rv, x, and y. SP will now point to the next free cell (address 3). You may allocate the same amount of space use three PUSHIMM 0 statements. The following sam-code allocates space for three variables:

ADDSP 3 // Code to store and retrieve x and y values // Place holder for later code STOP

Accessing Variables: How do you write sam-code for the assignment statements in the earlier Java snippet? Also, how do you retrieve a variable's value? You have two options: absolute addressing and relative addressing. In absolute addressing, you do not need to worry about the mysterious FBR other than to avoid changing it from its initial value of 0. Relative addressing, using the FBR, will be useful when we introduce functions; each function will have its own frame. The FBR (Frame Base Register) is changed each time a function is called. For now, you should understand both types of addressing.

Absolute Addressing: The instructions STOREIND and PUSHIND (store indirect and push indirect) each use an absolute cell address. To store a value in the stack, use STOREIND, which takes a value from V_top and pushes it into the location specified by V_below. In your sam-code, use the following sequence of instructions:

PUSHIMM address: Push the location of the variable in the stack.
PUSHIMM value: Push the value you wish to store.
STOREIND: Store value into address.

For assigning to a variable, you always follow this same process: push the address, push the value, and store the value at the address. For example, to store the value of 10 in x at address 1, you would use the following code:

PUSHIMM 1 PUSHIMM 10 STOREIND

To retrieve a value from the stack, use PUSHIND, which takes an address from V_top and pushes a value from that location. In your sam-code, use the following sequence of instructions:

PUSHIMM address: Push the location of the variable in the stack.
PUSHIND: Push value that is stored in address.

For example, to retrieve the value of x, you would use the following code:

PUSHIMM 1 PUSHIND

The following sam-code performs the operations of the earlier Java code-snippet. As you read though the example, note the methodical way in which each variable is stored and retrieved.

// Absolute addressing ADDSP 3 // allocate rv,x,y; you may also PUSHIMM 0 three times PUSHIMM 1 // location to store x PUSHIMM 10 // value to give x STOREIND // store value of x PUSHIMM 2 // location to store y PUSHIMM 20 // value to give y STOREIND // store value of y PUSHIMM 0 // location to store result of x+y PUSHIMM 1 // push address of x PUSHIND // retrieve value of x PUSHIMM 2 // push address of y PUSHIND // retrieve value of y ADD // add values of x and y STOREIND // store value of x+y in rv ADDSP -2 // remove x and y from memory STOP // halt --> SaM should return 30

To fully understand this approach, step through the code very carefully in the simulator. In particular, keep track of the SP register. You may also wish to draw your own stack to keep track of cell entries. Note that this technique limits the generality of functions, since each function has a set of variables occupying a particular spot in memory. Early versions of FORTRAN (and most other early computer languages) used absolute addressing for variables.

Relative Addressing: By using the FBR to keep track of your current method call, you can keep your sam-code very general. Since we are assuming no other method than our main method, FBR stays at 0. Judicious use of the STOREOFF x command provides the best way to manage variables. First, you need to advance the SP by the number of variables that you have, including the rv. To store a variable’s value, you push the value and then move it to the correct position in the stack. For instance, since x is the first variable, you would enter PUSHIMM 10 and then STOREOFF 1. This instruction moves V_top (which is 10) into the cell with address of FBR+1 (which refers to x). A similar instruction is used for y.

After storing the variable values, you need to extract and add them. To extract a value, enter PUSHOFF k, which pushes the value stored in address FBR+k onto the top of the stack. Once you have both values pushed, you can then add them and move the result to the rv address. Here is the code for the Java snippet, this time using relative addressing:

// Relative addressing: ADDSP 3 // allocate rv, x, y PUSHIMM 10 // push the value to store for x STOREOFF 1 // store the value 10 in x PUSHIMM 20 // push the value to store for y STOREOFF 2 // store the value 20 in y PUSHOFF 1 // retrieve the value of x PUSHOFF 2 // retrieve the value of y ADD // x+y STOREOFF 0 // store the value of x+y in rv ADDSP -2 // remove x and y from the stack STOP // halt; the return value should return 30

Note how the commands for relative addressing are similar to those for absolute addressing. Both versions work fine, but relative addressing makes it easier to implement functions.

Tasks

You may use any of the following sam-code instructions. For this first assignment, these are the only instructions that you should use.

ALU Instructions: ADD, SUB, TIMES, DIV, NOT, OR, AND, GREATER, LESS, EQUAL

Stack Manipulation Instructions: PUSHIMM, DUP, SWAP, PUSHIND, STOREIND, PUSHOFF, STOREOFF, ADDSP

Control Instructions: STOP

In Tasks 1 and 2, the values T and F indicate logical values. Be sure to use logical operators (e.g., NOT, OR, AND) when working with logical values. The integer 0 is used to represent F and the integer 1 is used to represent T (although any nonzero value is treated as T by the logical operators).

Remember that for each task, the goal is not just to display the right answer when the code halts. The goal is to write code that matches the tasks.

Task 1: Write a simple sam-code program that does nothing but return F.

Task 2: Write a sam-code program that evaluates this Boolean expression:
¬(5 < (2*3)) or (¬ T and F).
Note that ¬ has higher precedence than 'and'; in other words, the last part should be evaluated as
((¬ T) and F) not as (¬ (T and F))

Task 3: Write a sam-code program to evaluate (1/(1+2/(1+3/(1+4/(1))))). Note that you must use DIV (integer division), not DIVF (floating-point division)..

Task 4: Write a sam-code program for the following Java snippet using only an absolute-addressing approach:

int w, x, y; x = 1; y = 4; w = x + y; y = w; return ( y <= x );

Task 5: Write a sam-code program for the code snippet in Problem 4 using only a relative-addressing approach.

Submission Guidelines

Write each task’s solution in a separate text file. For example, if we gave a Task 0, its solution file would look as follows:

// +-------------------------------------+ // | Part 1, Task 0 | // | L. Paul Chew | // | chew@cs.cornell.edu | // | 123456 | // +-------------------------------------+ // Write a program that returns 4: PUSHIMM 4 // push 4 STOP // stop execution

For this assignment, you should comment every single line of sam-code so that we know that you know what is happening.

The instructions for using the CMS (Course Management System) are reasonably straightforward. You must upload each of your four task-files separately; it should be easy to see how to do this. Make sure you correctly match your files to the tasks.

Late Policy

Late assignments receive a grade of zero. We strongly suggest checking that you have been entered into CMS. If you delay in forming a group with your partners, you may not be able to find them in time to submit your work, which can cost many points. So, form your groups early!

Grading

We will grade your assignment based on style (how well you followed our instructions, how neatly you arranged your solutions, how neatly you wrote your code, and how clear you made your comments) and correctness (how well do the programs work).