CS412/413 Spring 1999 Introduction to Compilers and Translators
Here is some sample Pentium code for the Iota code samples given in the Iota language definition. These code samples were actually generated from the Iota functions rewritten as C equivalents, with some contrived constructions to illustrate some of the Pentium code. The instructions given assume a Windows operating system and Microsoft development tools.
You will need to set up your command-line environment so that certain programs (the
Microsoft C compiler and linker) are in your PATH; you can copy the following text into a
file called env.bat. MASM is installed in C:\MASM611 on the CSUGLAB machines.
@echo off call "c:\program files\microsoft visual studio\vc98\bin\vcvars32" set PATH=c:\MASM611\bin;%PATH%
The next batch file (c2asm.bat) allows you
to create your own assembly code from C code, the switches (in particular, /FAs)
are case-sensitive:
@echo off cl /FAs /nologo /c %1 %2 %3 %4 %5 %6 %7 %8 %9
Unix users can simply give gcc the -S option to generate
assembly code into a file with a .s extension:
gcc -S [file.c]
We have provided batch files to assemble your code (asm.bat)
and link your programs (ilink.bat). The Unix
equivalents are as (or gas for GNU assembler) and ld.
The actual command-line arguments will vary by operating system. The Solaris commands are:
as [file.s] -o [file.o] ld -o [exe] [file.o] iota.o -lc
Now, activate your PATH:
Z:\tsai> env
The C code in this document was written in samples.c, and the Pentium
assembler code samples.asm was generated by issuing:
Z:\tsai> c2asm samples.c
The resulting assembler code was assembled to samples.obj:
Z:\tsai> asm samples.asm
Now link your program:
Z:\tsai> ilink samples.obj
This will produce an executable file of the same name as the first object file provided
as a command-line argument to LINK (in this case, samples.exe. A final batch
program (make.bat) may be useful; assuming that your program resides
completely in a single source ASM file:
@echo off call asm %1.asm call link %1.obj
For these examples, we could have started with samples.c and ended up with
a samples.exe executable as follows:
c2asm samples.c make samples
There are some notes about the assembler code generated by the Microsoft C compiler, and what kind of assembler code is strictly necessary:
DWORD PTR (32-bit
values) or BYTE PTR (8-bit values, i.e., character bytes).jl, jge, jmp, etc.) can be simply issued with a
destination label; there is no need for the SHORT keyword.OFFSET <identifier>;
specifying OFFSET FLAT:<identifier> is unnecessary.Now let's start coding!
This is the header you should use in all of your generated assembler code:
; beginning of file
TITLE samples.c ; name of original source file
.386 ; target processor
.model FLAT ; flat memory model, as opposed to old MS-DOS models
; now list all externally-defined function names, they are all EXTRN NEAR
EXTRN _io__print:NEAR
EXTRN _io__printi:NEAR
EXTRN _conv__stoa:NEAR
; etc. for all standard Iota modules as well as any other prototyped functions
; now list all exported functions available to other modules to be linked in later
; (for example, "static" functions in C would not be listed here since they have only
; file scope).
PUBLIC _samples__fibonacci
PUBLIC _samples__square
PUBLIC _samples__quicksort
PUBLIC _samples__gcd
PUBLIC _iota__main
; Data segment: place all statically-allocated data (such as string constants) here.
; These directives are on MASM Programmer's Guide page 86.
; <masm_identifier> BYTE "string" for strings (BYTE arrays)
; <masm_identifier> DWORD <unsigned 32-bit value>
; <masm_identifier> SDWORD <signed 32-bit value>
; Hex digits must be prefixed with a zero and suffixed with an 'H', and should be
; separated from other characters in the string by a comma. Example below
_DATA SEGMENT
string1 DB "Hello, World!", 0aH, 00H
format1 DB "0x%d%d", 00H
_DATA ENDS
; finally, our _TEXT segment follows ...
_TEXT SEGMENT
Note that the $ character is a legal identifier character in Microsoft assembler (MASM Programmer's Guide, p. 9). The Microsoft C compiler happens to like to list the assembler identifiers before the subroutine names, but they can actually go anywhere.
_x$ = 8 ; alias "_x" as first argument
_x1$ = -4 ; alias "_x1" as first local
_x2$ = -8 ; alias "_x2" as second local
_samples__fibonacci PROC NEAR ; flat 32-bit memory, every PROC is NEAR
; 3 : {
push ebp ; store base pointer on stack
mov ebp, esp ; stack pointer is now new base pointer
sub esp, 8 ; grow stack by two 4-byte values (2 locals)
; 4 : int x1, x2;
; 5 :
; 6 : if (x < 2) {
cmp DWORD PTR _x$[ebp], 2 ; compare "_x" with 2
jge SHORT $L35 ; "_x" > 2; jump to label $L35
; 7 : return 1;
mov eax, 1 ; set return value (EAX) to 1
jmp SHORT $L32 ; jump to end of current subroutine
$L35:
; 9 :
; 10 : x1 = fibonacci (x - 1);
mov eax, DWORD PTR _x$[ebp] ; load "_x" into EAX
sub eax, 1 ; subtract 1 from EAX
push eax ; push arg EAX
call _samples__fibonacci ; call subroutine
add esp, 4 ; pop args
mov DWORD PTR _x1$[ebp], eax ; store return value (EAX) into local "_x1"
; 11 : x2 = fibonacci (x - 2);
mov ecx, DWORD PTR _x$[ebp] ; load "x" into ECX
sub ecx, 2 ; subtract 2 from ECX
push ecx ; push arg ECX
call _samples__fibonacci ; call subroutine
add esp, 4 ; pop args
mov DWORD PTR _x2$[ebp], eax ; store return value (EAX) into local "_x2"
; 12 : return x1 + x2;
mov eax, DWORD PTR _x1$[ebp] ; load local "_x1" into EAX
add eax, DWORD PTR _x2$[ebp] ; add local "_x2" into EAX
$L32:
; 13 : }
mov esp, ebp ; restore old stack pointer
pop ebp ; retrieve old base pointer from stack
ret 0 ; return to caller
_samples__fibonacci ENDP
Note that the assembler "_x" variable has been re-defined to be 8, overwriting the previous value assigned in "_samples__fibonacci".
_x$ = 8
_samples__square PROC NEAR
; 17 : {
push ebp
mov ebp, esp
; 18 : return x * x;
mov eax, DWORD PTR _x$[ebp]
imul eax, DWORD PTR _x$[ebp]
; 19 : }
pop ebp
ret 0
_samples__square ENDP
Note that the local "temp" has been allocated on the stack at the beginning,
even though it isn't declared until the middle of the subroutine (the "51"
appended to the end of the name is the Microsoft C compiler's way of maintaining scope
information). Note also the funkiness associated with division, documentation is in the
MASM Programmer's Guide, page 90 and page 97. Finally, note the creative use of "push
ecx" in the subroutine prologue; it accomplishes the same thing (pushing ESP
down the stack) as "sub esp, 4", but with a smaller and faster
instruction (so it doesn't matter that it is ecx that is pushed, just as long as something
is pushed to move ESP farther down the stack).
_x$ = 8
_y$ = 12
_temp$51 = -4
_samples__gcd PROC NEAR
; 23 : {
push ebp
mov ebp, esp
push ecx ; just a way of pushing SP down the stack,
; this is smaller and faster than
; "sub esp, 4"
$L48:
; 24 : while (x != 0) {
cmp DWORD PTR _x$[ebp], 0
je SHORT $L49
; 25 : if (x < y) {
mov eax, DWORD PTR _x$[ebp]
cmp eax, DWORD PTR _y$[ebp]
jge SHORT $L50
; 26 : int temp = x;
mov ecx, DWORD PTR _x$[ebp]
mov DWORD PTR _temp$51[ebp], ecx
; 27 : x = y;
mov edx, DWORD PTR _y$[ebp]
mov DWORD PTR _x$[ebp], edx
; 28 : y = temp;
mov eax, DWORD PTR _temp$51[ebp]
mov DWORD PTR _y$[ebp], eax
$L50:
; 30 : x = x % y;
mov eax, DWORD PTR _x$[ebp]
cdq ; MASM Programmer's Guide p.90
idiv DWORD PTR _y$[ebp] ; MASM Programmer's Guide, p.97
mov DWORD PTR _x$[ebp], edx ; quotient put in EAX,
; remainder is put in EDX
; 31 : }
jmp SHORT $L48
$L49:
; 32 : return y;
mov eax, DWORD PTR _y$[ebp]
; 33 : }
mov esp, ebp
pop ebp
ret 0
_samples__gcd ENDP
First, partition. Here we get to see some of the Pentium CISC-y addressing
modes, designed for array accesses. Here we also run into some callee-save registers (for
C calling convention on Pentium). The callee-save registers are listed in the MASM
Programmer's Guide page 310, and are EBP, ESI, EDI, EDS, and ESS. Also, the callee must
preserve the direction flag (described in MASM Programmer's Guide at the bottom of page
110). The ESI register is used towards the end of this subroutine, so we save its value on
the stack before overwriting it.
Some related instructions:
PUSHFD/POPFD (MASM Programmer's Guide page 73): these two instructions will
push onto and pop from the stack the flags register (of which the above-mentioned
direction flag is a part). PUSHAD/POPAD (MASM Programmer's Guide page 74): these two instructions will
push onto and pop from the stack all general-purpose registers: EAX, ECX, EDX, EBX, ESP,
EBP, ESI, EDI. Finally, notice the inefficiency of the while(1) expression. If you like,
you can compare this with the code resulting from for(;;). The latter almost
always comes out better, but with typical optimization, both loops look the same.
_a$ = 8
_low$ = 12
_high$ = 16
_x$ = -4
_i$ = -8
_j$ = -12
_temp$73 = -16
_samples__partition PROC NEAR
; 37 : {
push ebp
mov ebp, esp
sub esp, 16 ; 00000010H
push esi ; ESI is a callee-save register
; 38 : int x = a[low];
mov eax, DWORD PTR _low$[ebp]
mov ecx, DWORD PTR _a$[ebp]
mov edx, DWORD PTR [ecx+eax*4]
mov DWORD PTR _x$[ebp], edx
; 39 : int i = low - 1;
mov eax, DWORD PTR _low$[ebp]
sub eax, 1
mov DWORD PTR _i$[ebp], eax
; 40 : int j = high + 1;
mov ecx, DWORD PTR _high$[ebp]
add ecx, 1
mov DWORD PTR _j$[ebp], ecx
$L64:
; 41 : while (1) {
mov edx, 1
test edx, edx
je SHORT $L65
$L67:
; 42 : while (a[--j] > x);
mov eax, DWORD PTR _j$[ebp]
sub eax, 1
mov DWORD PTR _j$[ebp], eax
mov ecx, DWORD PTR _j$[ebp]
mov edx, DWORD PTR _a$[ebp]
mov eax, DWORD PTR [edx+ecx*4]
cmp eax, DWORD PTR _x$[ebp]
jle SHORT $L68
jmp SHORT $L67
$L68:
; 43 : while (a[++i] < x);
mov ecx, DWORD PTR _i$[ebp]
add ecx, 1
mov DWORD PTR _i$[ebp], ecx
mov edx, DWORD PTR _i$[ebp]
mov eax, DWORD PTR _a$[ebp]
mov ecx, DWORD PTR [eax+edx*4]
cmp ecx, DWORD PTR _x$[ebp]
jge SHORT $L71
jmp SHORT $L68
$L71:
; 44 : if (i < j) {
mov edx, DWORD PTR _i$[ebp]
cmp edx, DWORD PTR _j$[ebp]
jge SHORT $L72
; 45 : int temp = a[i];
mov eax, DWORD PTR _i$[ebp]
mov ecx, DWORD PTR _a$[ebp]
mov edx, DWORD PTR [ecx+eax*4]
mov DWORD PTR _temp$73[ebp], edx
; 46 : a[i] = a[j];
mov eax, DWORD PTR _i$[ebp]
mov ecx, DWORD PTR _a$[ebp]
mov edx, DWORD PTR _j$[ebp]
mov esi, DWORD PTR _a$[ebp]
mov edx, DWORD PTR [esi+edx*4]
mov DWORD PTR [ecx+eax*4], edx
; 47 : a[j] = temp;
mov eax, DWORD PTR _j$[ebp]
mov ecx, DWORD PTR _a$[ebp]
mov edx, DWORD PTR _temp$73[ebp]
mov DWORD PTR [ecx+eax*4], edx
; 48 : } else {
jmp SHORT $L74
$L72:
; 49 : return j;
mov eax, DWORD PTR _j$[ebp]
jmp SHORT $L59
$L74:
; 51 : }
jmp SHORT $L64
$L65:
; 52 : return 0;
xor eax, eax
$L59:
; 53 : }
pop esi ; restore callee-saved ESI register
mov esp, ebp
pop ebp
ret 0
_samples__partition ENDP
Then, quicksort. Note here the C calling convention: multiple arguments to
a subroutine are pushed on the stack in reverse order.
_a$ = 8
_low$ = 12
_high$ = 16
_mid$ = -4
_samples__quicksort PROC NEAR
; 57 : {
push ebp
mov ebp, esp
push ecx ; sub esp, 4
; 58 : int mid;
; 59 :
; 60 : if (!(low < high)) {
mov eax, DWORD PTR _low$[ebp]
cmp eax, DWORD PTR _high$[ebp]
jl SHORT $L84
; 61 : return;
jmp SHORT $L82
$L84:
; 63 :
; 64 : mid = partition (a, low, high);
mov ecx, DWORD PTR _high$[ebp]
push ecx
mov edx, DWORD PTR _low$[ebp]
push edx
mov eax, DWORD PTR _a$[ebp]
push eax
call _samples__partition
add esp, 12 ; 0000000cH
mov DWORD PTR _mid$[ebp], eax
; 65 : quicksort (a, low, mid);
mov ecx, DWORD PTR _mid$[ebp]
push ecx
mov edx, DWORD PTR _low$[ebp]
push edx
mov eax, DWORD PTR _a$[ebp]
push eax
call _samples__quicksort
add esp, 12 ; 0000000cH
; 66 : quicksort (a, mid + 1, high);
mov ecx, DWORD PTR _high$[ebp]
push ecx
mov edx, DWORD PTR _mid$[ebp]
add edx, 1
push edx
mov eax, DWORD PTR _a$[ebp]
push eax
call _samples__quicksort
add esp, 12 ; 0000000cH
$L82:
; 67 : }
mov esp, ebp
pop ebp
ret 0
_samples__quicksort ENDP
An oldie but goodie. Note that the string1 constant from the _DATA SEGMENT
shown above finally makes its appearance here:
_iota__main PROC NEAR
; 71 : {
push ebp
mov ebp, esp
; 72 : io__print ("Hello World!\N");
push OFFSET FLAT:string1
call _io__print
add esp, 4
; 73 : }
pop ebp
ret 0
_iota__main ENDP
Now, end the text segment and the assembler file:
_TEXT ENDS END
There are a few calls available to assembler code that are not exposed to the Iota programmer:
iota__newstring. Takes one argument, the number of non-NULL
characters to allocate for the new string. The return value is a pointer in memory to the
allocated storage. iota__newarray. Takes two arguments, the first argument is the
desired number of elements in the array, the second is a value to which all elements will
be initialized. iota__debug. Takes one argument, the number 1 or 0. This sets an
internal Iota debug flag for the io and conv modules to display
what they are doing, to aid you in determining whether or not you are passing the right
arguments to the Iota libraries. iota__abort. No arguments. Causes your program to crash. Visual C++
will start another instance of itself at the location of the crash. The top few layers
will be NT and Iota library code. A few layers down from the top will be the actual Iota
assembler code that called iota__abort. All that remains is to illustrate how to implement new for a string or
array. The following piece of code constructs an Iota string from a statically-allocated
string, and then prints it out. Next it creates an array, fills it with some value, and
then prints out its ASCII equivalent. It not intended to show code that would be generated
by a PA3 compiler; it is intended to illustrate the Iota API.
Also note that the entry point to the program is named _iota__main.
TITLE hello.iota
.386
.model FLAT
EXTRN _io__print:NEAR ; all externally-defined funcs
EXTRN _iota__newstring:NEAR
EXTRN _iota__newarray:NEAR
EXTRN _iota__debug:NEAR
EXTRN _conv__atos:NEAR
PUBLIC _iota__main
_DATA SEGMENT
hw BYTE "Hello, world!", 00H
hwlen DWORD 13
newline BYTE 0aH, 00H
_DATA ENDS
_TEXT SEGMENT
_s0$ = -4;
_s$ = -8;
_t$ = -12;
_iota__main PROC NEAR
push ebp ; save Base Pointer
mov ebp, esp ; Stack Pointer => Base Pointer
sub esp, 12
push 1 ; debug
call _iota__debug
add esp, 4
push DWORD PTR hwlen ; allocate a new string s0
call _iota__newstring
add esp, 4
mov DWORD PTR _s0$[ebp], eax ; s0 is a new string
mov eax, DWORD PTR _s0$[ebp] ; s = s0
mov DWORD PTR _s$[ebp], eax
mov DWORD PTR _t$[ebp], OFFSET hw ; t = "Hello, world!"
LOOP_TEST:
mov ecx, DWORD PTR _t$[ebp] ; ecx = t
movsx edx, BYTE PTR [ecx] ; edx = *t
test edx, edx ; edx == 0?
jz LOOP_DONE
mov eax, DWORD PTR _s$[ebp]
mov ecx, DWORD PTR _t$[ebp]
mov dl, BYTE PTR [ecx] ; dl is an 8-bit register, for chars
mov BYTE PTR [eax], dl ; *s = *t
add eax, 1
mov DWORD PTR _s$[ebp], eax ; s++
add ecx, 1
mov DWORD PTR _t$[ebp], ecx ; t++
jmp LOOP_TEST
LOOP_DONE:
mov eax, DWORD PTR _s$[ebp]
mov BYTE PTR [eax], 0 ; *s = '\0'
mov eax, DWORD PTR _s0$[ebp]
push eax
call _io__print
add esp, 4
push OFFSET newline ; print a newline
call _io__print
add esp, 4
push 65
push 10
call _iota__newarray ; { v: array[int] = new int[10] = 65; }
add esp, 8
mov DWORD PTR _v$[ebp], eax
push DWORD PTR _v$[ebp]
call _conv__atos
add esp, 4
push eax
call _io__print
add esp, 4
push OFFSET newline
call _io__print
add esp, 4
push 0 ; no debugging
call _iota__debug
add esp, 4
xor eax, eax ; return code (same as "mov eax, 0")
mov esp, ebp ; old stack frame
pop ebp ; restore Base Pointer
ret 0 ; return to caller
_iota__main ENDP
_TEXT ENDS
END
The output (note that iota__malloc is unavailable to both the
compiler-writer and the Iota programmer; the output is provided to let you, the
compiler-writer, know how much memory is really being allocated):
Z:\tsai\cs412\iota>hello Debugging turned on. iota__newstring (13) iota__malloc (18) Hello, world! iota__newarray (10, 65) iota__malloc (44) iota__malloc (15) conv__atos () returned "AAAAAAAAAA" AAAAAAAAAA Debugging turned off.
Compiler-writer routines available to assembler code:
char *
iota__newstring (int numchars); // number of non-NULL characters
int *
iota__newarray (int numelements, int val); // number of elements, and
// their initial value
void
iota__debug (int flag);
void
iota__abort (void);
The routines available to both the compiler-writer and the Iota programmer, from the io
and conv modules:
void io__print (char *s); void io__printi (int i); void io__putc (int i); char * io__readln (void); int io__getc (void); int io__eof (void); char * conv__itos (int i); int conv__stoi (char *string, int error); char * conv__itoc (int i); char * conv__atos (int *a); int * conv__stoa (char *s);