Have you ever heard the Java mantra, “write once, run anywhere”? So, how does that work? When you compile a Java program, you generate a binary file that contains Java byte-code (JBC), a collection of instructions that resemble machine code. However, you cannot simply run the byte-code directly on your computer's hardware, because the byte-code is written for a “virtual computer” called the Java Virtual Machine (JVM). To run your Java program, a computer uses a byte-code interpreter that converts each byte-code instruction into commands for the particular architecture that you use. By learning about the JVM, you can learn how programs are compiled and executed on a computer.
We do not have the time to study the complete JVM, so we will be simulating the JVM with SaM. (You can see the JVM specification at http://java.sun.com/docs/books/vmspec/ if you're curious.) SaM is a software package developed by Professor Keshav Pingali as a simplified substitute for the JVM. SaM stands for Stack Machine. A stack machine is essentially a device that pushes and pops information from a stack (a stack is a data structure that stores items in “last-in, first-out” order). Since the JVM uses stacks to store information, SaM provides a relatively easy and graceful way to learn about the JVM and compilers.
You will write code in sam, SaM’s pseudo-assembly code. Sam-code mimics assembly code, which typically has the form op-code operand. For instance, the sam-code PUSHIMM 3 has the op-code PUSHIMM and operand 3. When SaM encounters this instruction, SaM puts the value 3 on top of its stack of data. As you work your way through this document you will learn more about SaM’s environment and internal structure, and more about sam-code instructions.
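To make the op-code operand form concrete, here is a minimal Python sketch that splits one line of sam-code into its two parts. The function name `parse_instruction` is hypothetical and is not part of SaM; the sketch assumes integer operands and `//` comments only.

```python
def parse_instruction(line):
    """Split one sam-code line into (opcode, operand); operand is None if absent."""
    # Drop a trailing // comment, then trim surrounding whitespace.
    code = line.split("//", 1)[0].strip()
    if not code:
        return None  # blank or comment-only line
    parts = code.split()
    opcode = parts[0]
    # Assumption: any operand present is an integer literal.
    operand = int(parts[1]) if len(parts) > 1 else None
    return opcode, operand


print(parse_instruction("PUSHIMM 3"))  # ('PUSHIMM', 3)
```

An instruction without an operand, such as `ADD`, comes back with `None` in the operand position.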
SaM has a variety of data types. For the first assignment, we focus on the types integer and memory, though we will use other types later. The limits of SaM’s integer type are identical to those of Java's integer type. Memory values are also integers, but they refer to locations in SaM’s memory, as you will see in later sections of this document. SaM has no explicit boolean type; instead, 0 represents False and 1 (or any other nonzero value) represents True.
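The two conventions above can be sketched in Python. The sketch relies on the fact that Java's `int` is a 32-bit two's-complement value; the helper names `in_int_range` and `is_true` are hypothetical and do not come from SaM.

```python
INT_MIN = -2**31      # smallest value of a Java int
INT_MAX = 2**31 - 1   # largest value of a Java int


def in_int_range(value):
    """Return True if value fits in the integer range described above."""
    return INT_MIN <= value <= INT_MAX


def is_true(value):
    """Only 0 counts as False; every nonzero integer counts as True."""
    return value != 0


print(in_int_range(2**31 - 1), in_int_range(2**31))  # True False
print(is_true(0), is_true(1), is_true(-7))           # False True True
```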
SaM has two areas of memory called Program and Stack:
SaM also has other smaller areas of main memory called registers to store counters that keep track of information about a program. SaM’s PC, SP, FBR, HP, and HALT registers all store non-negative integers and have the following purposes:
This section provides an overview of the SaM instruction set. You do not need all of these instructions for the first assignment.
SaM has five instruction classifications: ALU, stack manipulation, register save/restore, control, and I/O.
The instructions are stored in the Program area of memory. After the execution of an ALU, stack, register, or I/O instruction, control transfers to the next instruction in program memory. A control instruction “moves” execution to another instruction in program memory.
SaM’s Stack memory is essentially a column in which you can place “items” in cells, one above the other. Each cell has an address, starting from 0. For the first instruction that generates a result, SaM stores that value in this first cell. The next result gets stored above this first cell in the cell with the address of 1, and so forth. To help keep track of which cell is empty, SaM uses the SP (stack pointer) register to “point” to the next free cell. How does SP “point” to a cell? By storing the address of the next free cell in SP. SaM can retrieve that free address from the SP whenever a program requires something pushed or popped from the stack.
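The relationship between the cells and SP can be sketched in a few lines of Python. This is only an illustration under simple assumptions (a fixed number of cells, no error checking); the class name `Stack` and its fields are hypothetical, not SaM's own code.

```python
class Stack:
    """A column of cells addressed from 0, plus an SP register."""

    def __init__(self, size=16):
        self.cells = [None] * size  # None marks a free cell
        self.sp = 0                 # address of the next free cell

    def push(self, value):
        self.cells[self.sp] = value  # store into the free cell SP points to
        self.sp += 1                 # SP now points one cell higher

    def pop(self):
        self.sp -= 1                 # step back to the top-most used cell
        value = self.cells[self.sp]
        self.cells[self.sp] = None   # the cell is free again
        return value


stack = Stack()
stack.push(10)
stack.push(20)
print(stack.cells[:stack.sp], stack.sp)  # [10, 20] 2
```

After the two pushes, address 0 holds 10, address 1 holds 20, and SP holds 2, the address of the next free cell.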
For an example of this process, refer to Figure 1. You can see the results of two instructions that pushed the values of 10 and then 20 onto the stack and moved SP at each step. The process went as follows:
An instruction set is the complete collection of instructions that a CPU uses. Many instructions operate on operands, which are stored in the stack and are thus sometimes called stack elements. We denote the stack element at location i as V_i. Assume that, before an instruction is executed, Vtop and Vbelow refer to the top-most element and the element below it, respectively. As discussed in the previous section, SaM’s stack pointer (SP) always points to the first free location on the stack. Therefore, Vtop = V_(SP-1) and Vbelow = V_(SP-2). Several commands operate on operand Vtop or on operands Vtop and Vbelow; for such commands, SaM pops Vtop, or both Vtop and Vbelow, before pushing any results onto the stack. Note that all op-codes are strictly uppercase!
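To see how Vtop and Vbelow relate to SP, the following Python sketch walks through a single SUB by hand. The variable names (`cells`, `sp`, `v_top`, `v_below`) are chosen for illustration only.

```python
cells = [7, 3, None, None]  # stack memory; None marks a free cell
sp = 2                      # SP points to the first free cell

v_top = cells[sp - 1]       # Vtop is the element at SP-1, here 3
v_below = cells[sp - 2]     # Vbelow is the element at SP-2, here 7

# SUB: pop both operands first, then push Vbelow - Vtop.
cells[sp - 1] = None
cells[sp - 2] = None
sp -= 2
cells[sp] = v_below - v_top
sp += 1

print(cells, sp)  # [4, None, None, None] 1
```

Note the operand order: the result is 7 - 3, not 3 - 7, because the element pushed first ends up as Vbelow.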
Only a few of these instructions are needed for the first assignment.
ALU Instructions
| Instruction | Effect |
| --- | --- |
| FTOI | push converted, truncated Vtop |
| FTOIR | push converted, rounded Vtop |
| ITOF | push converted Vtop |
| ADD, ADDF | push Vbelow + Vtop |
| SUB, SUBF | push Vbelow - Vtop |
| TIMES, TIMESF | push Vbelow * Vtop |
| DIV, DIVF | push Vbelow / Vtop |
| MOD | push Vbelow mod Vtop |
| LSHIFT b | push (left shift Vtop by b bits) |
| RSHIFT b | push (right shift Vtop by b bits) |
| NOT, BITNOT | push (not Vtop) |
| OR, BITOR | push Vbelow or Vtop |
| NOR, BITNOR | push not (Vbelow or Vtop) |
| AND, BITAND | push Vbelow and Vtop |
| NAND, BITNAND | push not (Vbelow and Vtop) |
| XOR, BITXOR | push Vbelow xor Vtop |
| GREATER | push Vbelow > Vtop |
| LESS | push Vbelow < Vtop |
| EQUAL | push Vbelow = Vtop |
| ISNIL | push (Vtop is zero) |
| ISPOS | push (Vtop is positive) |
| ISNEG | push (Vtop is negative) |
| CMP, CMPF | push -1, 0, or +1 for Vbelow <, =, or > Vtop |
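A few of the two-operand integer instructions can be sketched as a Python lookup table. This is an illustration, not SaM's implementation: the names `ALU_OPS` and `execute` are hypothetical, the stack is a plain Python list, and comparison results are encoded as 1 for True and 0 for False, following the boolean convention described earlier.

```python
ALU_OPS = {
    "ADD": lambda below, top: below + top,
    "SUB": lambda below, top: below - top,
    "TIMES": lambda below, top: below * top,
    "GREATER": lambda below, top: int(below > top),
    "LESS": lambda below, top: int(below < top),
    "EQUAL": lambda below, top: int(below == top),
    # CMP yields -1, 0, or +1 for below <, =, or > top.
    "CMP": lambda below, top: (below > top) - (below < top),
}


def execute(stack, opcode):
    """Pop Vtop and Vbelow, apply the operation, and push the result."""
    top = stack.pop()
    below = stack.pop()
    stack.append(ALU_OPS[opcode](below, top))


stack = [10, 20]
execute(stack, "CMP")
print(stack)  # [-1]
```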
Stack Manipulation Instructions
| Instruction | Effect |
| --- | --- |
| PUSHIMM c | push integer c onto stack |
| PUSHIMMF c | push float c onto stack |
| PUSHIMMCH c | push char c onto stack |
| PUSHIMMSTR s | place string s into heap; push address of s onto stack |
| PUSHIMMPA label | push a Program Address (indicated by label) onto stack |
| DUP | duplicate top stack item |
| SWAP | exchange top two stack items |
| MALLOC | set aside a block in the heap for data of size Vtop; push address of block onto stack; the block includes an extra word (stored at address) indicating the block size (including the extra word) |
| PUSHIND | push V_m onto stack, where m is Vtop |
| STOREIND | store Vtop into location V_m, where m is Vbelow |
| PUSHOFF k | push V_(FBR+k) onto stack |
| STOREOFF k | store Vtop into location V_(FBR+k) |
| ADDSP c | set SP to SP + c |
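The offset instructions address cells relative to FBR rather than relative to the top of the stack. The Python sketch below illustrates PUSHOFF and STOREOFF on a plain list; it assumes that STOREOFF pops Vtop after storing it, in line with the popping rule stated earlier, and the function names `pushoff` and `storeoff` are hypothetical.

```python
def pushoff(stack, fbr, k):
    """PUSHOFF k: push a copy of the cell at address FBR + k."""
    stack.append(stack[fbr + k])


def storeoff(stack, fbr, k):
    """STOREOFF k: pop Vtop and store it into the cell at address FBR + k."""
    value = stack.pop()  # assumption: the stored value is popped
    stack[fbr + k] = value


stack = [0, 100, 200]
fbr = 1                   # FBR holds the address of the cell containing 100

pushoff(stack, fbr, 1)    # copies the cell at address 2 (200) to the top
print(stack)              # [0, 100, 200, 200]

storeoff(stack, fbr, -1)  # pops 200 and stores it at address 0
print(stack)              # [200, 100, 200]
```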
Register Save/Restore Instructions
| Instruction | Effect |
| --- | --- |
| PUSHSP | push value of SP onto stack |
| POPSP | set value of SP to Vtop |
| PUSHFBR | push value of FBR onto stack |
| POPFBR | set value of FBR to Vtop |
| LINK | create new stack frame; push FBR onto stack; set FBR to SP-1 |
| FSET c | restore stack frame; set FBR to SP - (c + 1) |
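As a rough sketch of how LINK saves the old frame base and POPFBR restores it, consider the Python below. It assumes that SP-1 in the LINK description is evaluated after the push, so the new FBR points at the cell holding the saved FBR, and that POPFBR pops Vtop; neither detail is spelled out in the table. The function names `link` and `popfbr` are hypothetical.

```python
def link(stack, fbr):
    """LINK: push the old FBR, then return the new FBR (SP - 1)."""
    stack.append(fbr)
    # SP equals len(stack) here, so SP - 1 is the cell holding the saved FBR.
    return len(stack) - 1


def popfbr(stack):
    """POPFBR: pop Vtop and return it as the new FBR."""
    return stack.pop()


stack = [5, 6]
fbr = 0

fbr = link(stack, fbr)
print(stack, fbr)  # [5, 6, 0] 2

fbr = popfbr(stack)
print(stack, fbr)  # [5, 6] 0
```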
Control Instructions
| Instruction | Effect |
| --- | --- |
| JUMP t | set PC to t |
| JUMPC t | set PC to t if Vtop is nonzero (else PC is incremented); Vtop is popped from stack |
| JUMPIND | set PC to Vtop |
| JSR t | push PC+1 onto stack; set PC to t |
| JSRIND | set temp to Vtop; push PC+1 onto stack; set PC to temp |
| SKIP | set PC to PC + Vtop + 1 |
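To illustrate how control instructions change PC while other instructions simply advance it, here is a small Python sketch of an execution loop. It supports only PUSHIMM, JUMP, JUMPC, and STOP, and it assumes jump targets are already numeric program addresses rather than labels. The function `run` and the tuple-based program format are hypothetical.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs; return the final stack."""
    stack, pc = [], 0
    while True:
        opcode, operand = program[pc]
        if opcode == "STOP":
            return stack
        if opcode == "PUSHIMM":
            stack.append(operand)
            pc += 1                   # non-control: fall through to next
        elif opcode == "JUMP":
            pc = operand              # unconditional jump
        elif opcode == "JUMPC":
            condition = stack.pop()   # Vtop is popped either way
            pc = operand if condition != 0 else pc + 1
        else:
            raise ValueError(f"unsupported op-code: {opcode}")


program = [
    ("PUSHIMM", 1),    # 0: push a true condition
    ("JUMPC", 4),      # 1: pops 1, so execution jumps to address 4
    ("PUSHIMM", 111),  # 2: skipped
    ("JUMP", 5),       # 3: skipped
    ("PUSHIMM", 222),  # 4: jump target
    ("STOP", None),    # 5: halt
]
print(run(program))  # [222]
```

Changing the first instruction to push 0 makes JUMPC fall through, so the program pushes 111 and then jumps over address 4.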
I/O Instructions
| Instruction | Effect |
| --- | --- |
| READ, READF, READCH | read integer, float, or char and push onto stack |
| READSTR | read string; store in heap and push address onto stack |
| WRITE, WRITEF, WRITECH | write Vtop onto output |
| WRITESTR | write string stored at location Vtop onto output |
A SaM program is a text file that contains a collection of sam-code instructions along with optional comments and labels:
For example, the following sam-code program pushes two values (10, then 20) onto the stack, checks if they are equal, returns the result of 0 (because they are not equal) and then halts:
PUSHIMM 10 // Push 10
PUSHIMM 20 // Push 20
EQUAL // Push 0 (because 10 != 20)
STOP // halt
We recommend that you use the commenting style shown in the above example.
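If you want to trace the example program by hand, the following Python sketch mimics what happens to the stack. It is not the SaM simulator: it understands only PUSHIMM, EQUAL, and STOP, and the function name `run` is hypothetical. The result of STOP is taken to be the value at address 0, the bottom of the stack.

```python
SOURCE = """\
PUSHIMM 10 // Push 10
PUSHIMM 20 // Push 20
EQUAL // Push 0 (because 10 != 20)
STOP // halt
"""


def run(source):
    """Execute straight-line sam-code and return the value at address 0."""
    stack = []
    for line in source.splitlines():
        code = line.split("//", 1)[0].split()  # strip comment, tokenize
        if not code:
            continue
        opcode = code[0]
        if opcode == "PUSHIMM":
            stack.append(int(code[1]))
        elif opcode == "EQUAL":
            v_top = stack.pop()
            v_below = stack.pop()
            stack.append(1 if v_below == v_top else 0)
        elif opcode == "STOP":
            return stack[0]  # bottom of the stack
        else:
            raise ValueError(f"unsupported op-code: {opcode}")
    raise RuntimeError("program ended without STOP")


print(run(SOURCE))  # 0
```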
To run a sam-code program, you need to use SaM. This section explains how to start the SaM simulator and run your sam-code programs inside of it. To run it, follow these steps:
The source code is also available at the course website in the form of a .zip file. If you prefer, you can run SaM by compiling the code from the .zip file:
Once the simulator is running, you should see something like the picture below.
To run a sam-code program, follow these steps:
There are a few general messages that SaM reports in its Console:
If you execute the program using the Capture button, the Capture Viewer window appears. This displays the program on the left followed by stack views for the different execution stages on the right. This facility works best for sam-code programs consisting of just a few instructions.
When SaM encounters a STOP command, it displays the value in address 0 (i.e., the bottom of the stack). Thus, you should be careful to leave a value on the stack only if you intend to return that value.
As SaM executes instructions, you will see values pushed onto and popped from the stack in the Stack panel. For example, run the following sam-code:
As shown in the figure above, after running the program, SaM has pushed the values 10 and 20 and then stopped. Along with each pushed value, SaM gives the value's address and type in the format address:type:value. For example, 1:I:20 means that the value 20 has address 1 and type I (for Integer). To keep track of the types, refer to the Display->Stack Colors Reference menu. In the first assignment, you will see just I for integer values and M for memory values.
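The address:type:value format is easy to reproduce. The Python sketch below builds the labels for the two values from the example; the function name `format_cell` and the list-of-pairs representation are hypothetical and only mirror what the Stack panel shows.

```python
def format_cell(address, type_code, value):
    """Build a label in the address:type:value format."""
    return f"{address}:{type_code}:{value}"


# Each stack entry is a (type, value) pair; its list index is its address.
stack = [("I", 10), ("I", 20)]
labels = [format_cell(addr, t, v) for addr, (t, v) in enumerate(stack)]
print(labels)  # ['0:I:10', '1:I:20']
```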