CS412/413
due Wednesday, April 28 in class
In this programming assignment, you will build a static checker and code generator for the language Iota+, which is defined in the handout www.cs/cs412-sp99/18-iota+.htm. Your program will determine whether the input source file is a legal module, in conjunction with any interface files that it depends on, and report any lexical, syntactic, or semantic errors in the module or interfaces. If the module is legal, it will be compiled to assembly code in the same fashion as for Programming Assignment 3. We expect you to build upon the code that you wrote for Programming Assignments 1-3, and to fix any problems in your code generation that interfere with code generation for correct Iota programs.
This programming assignment differs from earlier assignments in that we are providing code for you to base your development on. We recommend but do not require that you implement the programming assignment using the provided code. We are providing a partial implementation of Programming Assignment 4, based on the reference implementation of Programming Assignment 2 that was released on the web site. This partial implementation includes a complete lexical analysis, a mostly-complete syntactic phase and the beginnings of a semantic analysis phase. The syntactic phase is written using Java CUP, an LR parser generator, which will make extending the grammar much simpler than it would be in a recursive-descent parser. To use this code, you will need to integrate your implementation with the provided code and do the following pieces of work:
Few if any modifications to assembly code generation should be required, unless there are bugs in your assembly code generation that need to be fixed. We still do not ask that register allocation be implemented, for example.
The name of the compiler class and its syntax for invocation remain the same as in Programming Assignment 3.
Global variable initialization is now supported, which leads to the question: where does the initialization code go? Initialization of the global variables in each module will be performed by a special function that is invoked when the program starts. We have provided some extra support to make this feasible:
iota0 is a program that takes object file names, extracts the module name from each, and generates assembler code for a function named _iota__init, which simply calls each module's _foo$init function. The Iota Standard Library main function will now call iota__init before calling your iota__main. To build iota0, issue the following command:
    z:\tsai\cs412> cl iota0.c
For a module foo.mod, your compiler will generate a function named _foo$init that performs all global variable initialization. It should look like the following:
    PUBLIC _foo$init
    _foo$init PROC NEAR
        push ebp
        mov ebp, esp
        ; global initialization code here
        pop ebp
        ret
    _foo$init ENDP
We are not worried about any naming clashes because the '$' character, while invalid as an Iota identifier character, is valid as a MASM identifier character.
Run iota0 before linking your modules together, to ensure that iota__init is generated properly each time.
The object file iota.obj contains the code for the standard modules io and conv that are defined in the language specification. This object file has been extended to include support for the new io objects described in the io interface.
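The job described for iota0 (object file names in, an _iota__init procedure out) can be sketched roughly as follows. This is not the provided iota0.c: the helper names module_name_from_path and emit_iota_init are hypothetical, the EXTRN lines are an assumption about how the per-module functions would be declared, and segment directives and the trailing END are omitted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the module name out of an object file path,
   e.g. "z:\\tsai\\cs412\\foo.obj" -> "foo".
   Returns 0 on success, -1 if the name is empty or does not fit in `out`. */
int module_name_from_path(const char *path, char *out, size_t out_size)
{
    const char *base = path;
    const char *p;
    const char *dot;
    size_t len;

    assert(path != NULL && out != NULL);

    /* Skip past the last directory separator (either style) or drive colon. */
    for (p = path; *p != '\0'; p++) {
        if (*p == '/' || *p == '\\' || *p == ':')
            base = p + 1;
    }

    /* Drop the extension, if there is one. */
    dot = strrchr(base, '.');
    len = dot ? (size_t)(dot - base) : strlen(base);

    if (len == 0 || len >= out_size)
        return -1;
    memcpy(out, base, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: write an _iota__init procedure that calls each
   module's _<name>$init in the order the object files were given.
   Returns 0 on success, -1 if any module name could not be extracted
   (in which case nothing is written). */
int emit_iota_init(FILE *out, const char *const *obj_paths, size_t count)
{
    char name[256];
    size_t i;

    /* Validate every path first so a failure leaves no partial output. */
    for (i = 0; i < count; i++) {
        if (module_name_from_path(obj_paths[i], name, sizeof name) != 0)
            return -1;
    }

    fprintf(out, "PUBLIC _iota__init\n");
    for (i = 0; i < count; i++) {
        module_name_from_path(obj_paths[i], name, sizeof name);
        fprintf(out, "EXTRN _%s$init:NEAR\n", name);  /* assumed declaration form */
    }
    fprintf(out, "_iota__init PROC NEAR\n");
    for (i = 0; i < count; i++) {
        module_name_from_path(obj_paths[i], name, sizeof name);
        fprintf(out, "    call _%s$init\n", name);
    }
    fprintf(out, "    ret\n");
    fprintf(out, "_iota__init ENDP\n");
    return 0;
}
```

Calling emit_iota_init(stdout, paths, n) with the object files from the link line would print one call per module; the actual tool may order or declare things differently.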
Iota+ adds class types to the set of types supported. To allow standardization, we have selected a mandatory representation for values of class type, which is the format described in class for class hierarchies with single inheritance. An example of this representation is depicted below. An object begins with a single word pointing to the dispatch vector for the object's class. The fields of the object follow in the order they are declared in the class. All fields, even booleans, take up a single 4-byte word. The dispatch vector contains pointers to the code for each of the methods of the object. The methods include not only the methods explicitly declared in the class of the object, but also methods from each of the interface types, if any, that the class declares. The methods are numbered starting from zero, with each method having an index one greater than the previous one in the same class or interface type. The first method in a class or interface has an index one greater than the last method in the immediate supertype, or zero if there is no immediate supertype.
For example, consider the following class and interface declarations, assuming that sqrt and atan2 are defined in some appropriate fashion:
interface Point {
    x() : int
    y() : int
}
class Cartesian extends Point {
    x_, y_ : int
    x(): int = x_
    y(): int = y_
    r(): int = sqrt(x*x + y*y)
    theta(): int = atan2(y, x)
}
These definitions result in an object of type Cartesian having the memory layout shown in the picture above.
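The layout rules can be turned into the offset arithmetic a code generator needs. The sketch below is only an illustration of those rules. All function names are hypothetical. It assumes a class with no inherited fields, and it assumes that methods redeclared from a supertype reuse the supertype's indices, so only newly added methods receive new ones.

```c
#include <assert.h>

enum { WORD_SIZE = 4 };  /* every field and pointer occupies one 4-byte word */

/* Byte offset of the field declared at 0-based position `field_index`.
   Word 0 holds the dispatch vector pointer, so fields start at word 1. */
int field_offset(int field_index)
{
    assert(field_index >= 0);
    return WORD_SIZE * (1 + field_index);
}

/* Total size in bytes of an object with `field_count` fields:
   one word for the dispatch vector pointer plus one word per field. */
int object_size(int field_count)
{
    assert(field_count >= 0);
    return WORD_SIZE * (1 + field_count);
}

/* Dispatch vector index of the method at 0-based `position` among the
   methods a class or interface adds, where `supertype_method_count` is the
   total number of methods in its immediate supertype (0 if there is none). */
int method_index(int supertype_method_count, int position)
{
    assert(supertype_method_count >= 0 && position >= 0);
    return supertype_method_count + position;
}

/* Byte offset of a method's slot from the start of the dispatch vector. */
int dv_slot_offset(int index)
{
    assert(index >= 0);
    return WORD_SIZE * index;
}
```

Under these assumptions, a Cartesian object is 12 bytes with x_ at offset 4 and y_ at offset 8; x and y take indices 0 and 1 from Point, and r and theta take indices 2 and 3.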
The special value null may be used as a value of any object type. It is represented by a word containing the address 0. To create storage for a real object, it will be necessary to call the routine iota_malloc, passing the size of the object in bytes. This routine returns the address of the allocated storage. The fields of the object then can be zeroed to give them all their default values, and the pointer to the dispatch vector must be installed. Since all objects of a particular class share the same dispatch vector, there is no need to allocate space for the dispatch vector dynamically. The dispatch vector is global storage and should be placed in the data segment along with other global variables.
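The allocation sequence (allocate, zero the fields, install the dispatch vector pointer) is written out in C below as a model of what the generated assembly would do. iota_malloc_stub is a stand-in so the sketch runs on its own; the real iota_malloc comes from the Iota standard library and its exact behavior on failure is not specified here. The names new_object, struct cartesian, and cartesian_dv are hypothetical, and only the first two dispatch vector slots are filled in to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*method_fn)(void *self);

/* Header shared by every object: word 0 points at the dispatch vector. */
struct object {
    const method_fn *dv;
};

/* C model of a Cartesian object: dispatch vector pointer, then the fields.
   On the target each of these is one 4-byte word; sizes on the host that
   compiles this sketch may differ. */
struct cartesian {
    const method_fn *dv;
    int x_;
    int y_;
};

int cartesian_x(void *self) { return ((struct cartesian *)self)->x_; }
int cartesian_y(void *self) { return ((struct cartesian *)self)->y_; }

/* One statically allocated dispatch vector shared by all Cartesian objects. */
const method_fn cartesian_dv[] = { cartesian_x, cartesian_y };

/* Stand-in for the runtime's iota_malloc (assumption: same calling shape). */
void *iota_malloc_stub(size_t size_in_bytes)
{
    return malloc(size_in_bytes);
}

/* Allocate an object of the given size, zero it so every field has its
   default value, and install the class's dispatch vector pointer. */
void *new_object(size_t size_in_bytes, const method_fn *dv)
{
    struct object *obj;

    assert(size_in_bytes >= sizeof(struct object) && dv != NULL);
    obj = iota_malloc_stub(size_in_bytes);
    if (obj == NULL)
        return NULL;
    memset(obj, 0, size_in_bytes);
    obj->dv = dv;
    return obj;
}
```

A method call then goes through the vector: for an object p, p->dv[0](p) invokes x with p passed as the receiver.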
Methods are implemented in a manner similar to functions. They take an additional first argument that is the object this, on which the method was invoked. Another difference is that method names are mangled differently. In the generated assembly code, the label beginning the code of a method m in a class C is C__m. Methods are not exported as public identifiers, so the name of the containing module does not need to be part of the mangled name.
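The C__m naming rule is simple enough to capture in a small helper. The function below is a hypothetical sketch (the name mangle_method is not from the provided code) that builds the label for a method from its class and method names.

```c
#include <assert.h>
#include <string.h>

/* Build the assembly label for method `method_name` of class `class_name`
   as "<class>__<method>", e.g. "Cartesian__theta".
   Returns 0 on success, -1 if the label does not fit in `out`. */
int mangle_method(const char *class_name, const char *method_name,
                  char *out, size_t out_size)
{
    size_t class_len;
    size_t method_len;

    assert(class_name != NULL && method_name != NULL && out != NULL);

    class_len = strlen(class_name);
    method_len = strlen(method_name);

    /* Room is needed for both names, the two underscores, and the NUL. */
    if (class_len + 2 + method_len + 1 > out_size)
        return -1;

    memcpy(out, class_name, class_len);
    out[class_len] = '_';
    out[class_len + 1] = '_';
    memcpy(out + class_len + 2, method_name, method_len);
    out[class_len + 2 + method_len] = '\0';
    return 0;
}
```

Because the module name is not part of the label, two modules that each define a class with the same name would produce the same labels; that is harmless as long as the labels are not made public.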
The special cast expression performs a dynamic cast of an expression to a new object type. The Iota+ language definition requires the cast be checked for validity, similarly to Java. We are not requiring that you implement this dynamic check for this programming assignment. It can be implemented in a number of ways. For example, when casting to a class type, the dispatch vector pointer can be compared against the dispatch vector of the class. Implementing a cast to an interface type is more complex.
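For the class-type case, the comparison just described amounts to a single pointer test. The sketch below models it in C. The name cast_to_class_ok is hypothetical, treating null as castable to any object type is an assumption modeled on Java, and an exact pointer comparison accepts only objects whose class is exactly the target class, so it would reject instances of a subclass that has its own dispatch vector.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(void *self);

/* Header shared by every object: word 0 points at the dispatch vector. */
struct object {
    const method_fn *dv;
};

/* Returns 1 if `obj` may be treated as an instance of the class whose
   dispatch vector is `class_dv`, and 0 otherwise.
   Assumption: null (address 0) passes the check for every class. */
int cast_to_class_ok(const void *obj, const method_fn *class_dv)
{
    const struct object *header = obj;

    assert(class_dv != NULL);
    if (header == NULL)
        return 1;
    /* Exact-class test: same dispatch vector means same class. */
    return header->dv == class_dv;
}
```

What the generated code should do when the check fails is left to the language definition; this sketch only reports the outcome.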
The supplied code that partially implements Programming Assignment 4 is available in Windows and Unix formats. Please read the file README-pa4.txt for information about this code. We will make known any updates to this code.
Submission of this programming assignment will be entirely electronic. When you submit the programming assignment, send mail to cs412 telling us that you have done so. The submission instructions are similar to Programming Assignment 3. The submitted directory must contain a README file telling us the following:
The following additional documents should be included in the submission
The following additional material should be included in the submission and the README file should say where to find it: