CS412/413 Spring 1999 Introduction to Compilers and Translators

Iota Pentium Code Samples


Here is some sample Pentium code for the Iota code samples given in the Iota language definition. These code samples were actually generated from the Iota functions rewritten as C equivalents, with some contrived constructions to illustrate some of the Pentium code. The instructions given assume a Windows operating system and Microsoft development tools.


Work Environment

You will need to set up your command-line environment so that certain programs (the Microsoft C compiler and linker) are in your PATH; you can copy the following text into a file called env.bat. MASM is installed in C:\MASM611 on the CSUGLAB machines.

@echo off
call "c:\program files\microsoft visual studio\vc98\bin\vcvars32"
set PATH=c:\MASM611\bin;%PATH%

The next batch file (c2asm.bat) allows you to create your own assembly code from C code, the switches (in particular, /FAs) are case-sensitive:

@echo off
cl /FAs /nologo /c %1 %2 %3 %4 %5 %6 %7 %8 %9

Unix users can simply give gcc the -S option to generate assembly code into a file with a .s extension:

gcc -S [file.c]

We have provided batch files to assemble your code (asm.bat) and link your programs (ilink.bat). The Unix equivalents are as (or gas for GNU assembler) and ld. The actual command-line arguments will vary by operating system. The Solaris commands are:

as [file.s] -o [file.o]
ld -o [exe] [file.o] iota.o -lc

Now, activate your PATH:

Z:\tsai> env

The C code in this document was written in samples.c, and the Pentium assembler code samples.asm was generated by issuing:

Z:\tsai> c2asm samples.c

The resulting assembler code was assembled to samples.obj:

Z:\tsai> asm samples.asm

Now link your program:

Z:\tsai> ilink samples.obj

This will produce an executable file of the same name as the first object file provided as a command-line argument to LINK (in this case, samples.exe. A final batch program (make.bat) may be useful; assuming that your program resides completely in a single source ASM file:

@echo off
call asm %1.asm
call link %1.obj

For these examples, we could have started with samples.c and ended up with a samples.exe executable as follows:

c2asm samples.c
make samples

Notes about Assembler code

There are some notes about the assembler code generated by the Microsoft C compiler, and what kind of assembler code is strictly necessary:

Now let's start coding!


Assembly Code File Header

This is the header you should use in all of your generated assembler code:

; beginning of file

TITLE samples.c               ; name of original source file
.386                          ; target processor
.model FLAT                   ; flat memory model, as opposed to old MS-DOS models

; now list all externally-defined function names, they are all EXTRN NEAR

EXTRN _io__print:NEAR
EXTRN _io__printi:NEAR
EXTRN _conv__stoa:NEAR
; etc. for all standard Iota modules as well as any other prototyped functions

; now list all exported functions available to other modules to be linked in later
; (for example, "static" functions in C would not be listed here since they have only
; file scope).

PUBLIC _samples__fibonacci
PUBLIC _samples__square
PUBLIC _samples__quicksort
PUBLIC _samples__gcd
PUBLIC _iota__main

; Data segment: place all statically-allocated data (such as string constants) here.
; These directives are on MASM Programmer's Guide page 86.
;    <masm_identifier> BYTE "string"    for strings (BYTE arrays)
;    <masm_identifier> DWORD <unsigned 32-bit value>
;    <masm_identifier> SDWORD <signed 32-bit value>
; Hex digits must be prefixed with a zero and suffixed with an 'H', and should be 
; separated from other characters in the string by a comma. Example below

_DATA SEGMENT
    string1 DB "Hello, World!", 0aH, 00H
    format1 DB "0x%d%d", 00H
_DATA ENDS

; finally, our _TEXT segment follows ...

_TEXT SEGMENT


Fibonacci

Note that the $ character is a legal identifier character in Microsoft assembler (MASM Programmer's Guide, p. 9). The Microsoft C compiler happens to like to list the assembler identifiers before the subroutine names, but they can actually go anywhere.

_x$ = 8                                        ; alias "_x" as first argument
_x1$ = -4                                      ; alias "_x1" as first local
_x2$ = -8                                      ; alias "_x2" as second local
_samples__fibonacci PROC NEAR                   ; flat 32-bit memory, every PROC is NEAR

; 3    : {

        push    ebp                            ; store base pointer on stack
        mov     ebp, esp                       ; stack pointer is now new base pointer
        sub     esp, 8                         ; grow stack by two 4-byte values (2 locals)

; 4    :     int x1, x2;
; 5    :
; 6    :     if (x < 2) {

        cmp     DWORD PTR _x$[ebp], 2          ; compare "_x" with 2
        jge     SHORT $L35                     ; "_x" > 2; jump to label $L35

; 7    :        return 1;

        mov     eax, 1                         ; set return value (EAX) to 1
        jmp     SHORT $L32                     ; jump to end of current subroutine
$L35:

; 9    : 
; 10   :     x1 = fibonacci (x - 1);

        mov     eax, DWORD PTR _x$[ebp]        ; load "_x" into EAX
        sub     eax, 1                         ; subtract 1 from EAX
        push    eax                            ; push arg EAX
        call    _samples__fibonacci                     ; call subroutine
        add     esp, 4                         ; pop args
        mov     DWORD PTR _x1$[ebp], eax       ; store return value (EAX) into local "_x1"

; 11   :     x2 = fibonacci (x - 2);

        mov     ecx, DWORD PTR _x$[ebp]        ; load "x" into ECX
        sub     ecx, 2                         ; subtract 2 from ECX
        push    ecx                            ; push arg ECX
        call    _samples__fibonacci                     ; call subroutine
        add     esp, 4                         ; pop args
        mov     DWORD PTR _x2$[ebp], eax       ; store return value (EAX) into local "_x2"

; 12   :     return x1 + x2;

        mov     eax, DWORD PTR _x1$[ebp]       ; load local "_x1" into EAX
        add     eax, DWORD PTR _x2$[ebp]       ; add local "_x2" into EAX
$L32:

; 13   : }

        mov     esp, ebp                       ; restore old stack pointer
        pop     ebp                            ; retrieve old base pointer from stack
        ret     0                              ; return to caller
_samples__fibonacci ENDP

Square

Note that the assembler "_x" variable has been re-defined to be 8, overwriting the previous value assigned in "_samples__fibonacci".

_x$ = 8
_samples__square PROC NEAR

; 17   : {

        push    ebp
        mov     ebp, esp

; 18   :     return x * x;

        mov     eax, DWORD PTR _x$[ebp]
        imul    eax, DWORD PTR _x$[ebp]

; 19   : }

        pop     ebp
        ret     0
_samples__square ENDP

GCD

Note that the local "temp" has been allocated on the stack at the beginning, even though it isn't declared until the middle of the subroutine (the "51" appended to the end of the name is the Microsoft C compiler's way of maintaining scope information). Note also the funkiness associated with division, documentation is in the MASM Programmer's Guide, page 90 and page 97. Finally, note the creative use of "push ecx" in the subroutine prologue; it accomplishes the same thing (pushing ESP down the stack) as "sub esp, 4", but with a smaller and faster instruction (so it doesn't matter that it is ecx that is pushed, just as long as something is pushed to move ESP farther down the stack).

_x$ = 8
_y$ = 12
_temp$51 = -4
_samples__gcd    PROC NEAR

; 23   : {

        push    ebp
        mov     ebp, esp
        push    ecx                            ; just a way of pushing SP down the stack,
                                               ; this is smaller and faster than
                                               ; "sub esp, 4"
$L48:

; 24   :     while (x != 0) {

        cmp     DWORD PTR _x$[ebp], 0
        je      SHORT $L49

; 25   :        if (x < y) {

        mov     eax, DWORD PTR _x$[ebp]
        cmp     eax, DWORD PTR _y$[ebp]
        jge     SHORT $L50

; 26   :            int temp = x;

        mov     ecx, DWORD PTR _x$[ebp]
        mov     DWORD PTR _temp$51[ebp], ecx

; 27   :            x = y;

        mov     edx, DWORD PTR _y$[ebp]
        mov     DWORD PTR _x$[ebp], edx

; 28   :            y = temp;

        mov     eax, DWORD PTR _temp$51[ebp]
        mov     DWORD PTR _y$[ebp], eax
$L50:

; 30   :        x = x % y;

        mov     eax, DWORD PTR _x$[ebp]
        cdq                                        ; MASM Programmer's Guide p.90
        idiv    DWORD PTR _y$[ebp]                 ; MASM Programmer's Guide, p.97
        mov     DWORD PTR _x$[ebp], edx            ; quotient put in EAX,
                                                   ; remainder is put in EDX

; 31   :     }

        jmp     SHORT $L48
$L49:

; 32   :     return y;

        mov     eax, DWORD PTR _y$[ebp]

; 33   : }

        mov     esp, ebp
        pop     ebp
        ret     0
_samples__gcd    ENDP

Quicksort

First, partition. Here we get to see some of the Pentium CISC-y addressing modes, designed for array accesses. Here we also run into some callee-save registers (for C calling convention on Pentium). The callee-save registers are listed in the MASM Programmer's Guide page 310, and are EBP, ESI, EDI, EDS, and ESS. Also, the callee must preserve the direction flag (described in MASM Programmer's Guide at the bottom of page 110). The ESI register is used towards the end of this subroutine, so we save its value on the stack before overwriting it.

Some related instructions:

Finally, notice the inefficiency of the while(1) expression. If you like, you can compare this with the code resulting from for(;;). The latter almost always comes out better, but with typical optimization, both loops look the same.

_a$ = 8
_low$ = 12
_high$ = 16
_x$ = -4
_i$ = -8
_j$ = -12
_temp$73 = -16
_samples__partition PROC NEAR

; 37   : {

	push	ebp
	mov	ebp, esp
	sub	esp, 16					; 00000010H
	push	esi                  ; ESI is a callee-save register

; 38   :     int x = a[low];

	mov	eax, DWORD PTR _low$[ebp]
	mov	ecx, DWORD PTR _a$[ebp]
	mov	edx, DWORD PTR [ecx+eax*4]
	mov	DWORD PTR _x$[ebp], edx

; 39   :     int i = low - 1;

	mov	eax, DWORD PTR _low$[ebp]
	sub	eax, 1
	mov	DWORD PTR _i$[ebp], eax

; 40   :     int j = high + 1;

	mov	ecx, DWORD PTR _high$[ebp]
	add	ecx, 1
	mov	DWORD PTR _j$[ebp], ecx
$L64:

; 41   :     while (1) {

	mov	edx, 1
	test	edx, edx
	je	SHORT $L65
$L67:

; 42   : 	while (a[--j] > x);

	mov	eax, DWORD PTR _j$[ebp]
	sub	eax, 1
	mov	DWORD PTR _j$[ebp], eax
	mov	ecx, DWORD PTR _j$[ebp]
	mov	edx, DWORD PTR _a$[ebp]
	mov	eax, DWORD PTR [edx+ecx*4]
	cmp	eax, DWORD PTR _x$[ebp]
	jle	SHORT $L68
	jmp	SHORT $L67
$L68:

; 43   : 	while (a[++i] < x);

	mov	ecx, DWORD PTR _i$[ebp]
	add	ecx, 1
	mov	DWORD PTR _i$[ebp], ecx
	mov	edx, DWORD PTR _i$[ebp]
	mov	eax, DWORD PTR _a$[ebp]
	mov	ecx, DWORD PTR [eax+edx*4]
	cmp	ecx, DWORD PTR _x$[ebp]
	jge	SHORT $L71
	jmp	SHORT $L68
$L71:

; 44   : 	if (i < j) {

	mov	edx, DWORD PTR _i$[ebp]
	cmp	edx, DWORD PTR _j$[ebp]
	jge	SHORT $L72

; 45   : 	    int temp = a[i];

	mov	eax, DWORD PTR _i$[ebp]
	mov	ecx, DWORD PTR _a$[ebp]
	mov	edx, DWORD PTR [ecx+eax*4]
	mov	DWORD PTR _temp$73[ebp], edx

; 46   : 	    a[i] = a[j];

	mov	eax, DWORD PTR _i$[ebp]
	mov	ecx, DWORD PTR _a$[ebp]
	mov	edx, DWORD PTR _j$[ebp]
	mov	esi, DWORD PTR _a$[ebp]
	mov	edx, DWORD PTR [esi+edx*4]
	mov	DWORD PTR [ecx+eax*4], edx

; 47   : 	    a[j] = temp;

	mov	eax, DWORD PTR _j$[ebp]
	mov	ecx, DWORD PTR _a$[ebp]
	mov	edx, DWORD PTR _temp$73[ebp]
	mov	DWORD PTR [ecx+eax*4], edx

; 48   : 	} else {

	jmp	SHORT $L74
$L72:

; 49   : 	    return j;

	mov	eax, DWORD PTR _j$[ebp]
	jmp	SHORT $L59
$L74:

; 51   :     }

	jmp	SHORT $L64
$L65:

; 52   :     return 0;

	xor	eax, eax
$L59:

; 53   : }

	pop	esi                             ; restore callee-saved ESI register
	mov	esp, ebp
	pop	ebp
	ret	0
_samples__partition ENDP

Then, quicksort. Note here the C calling convention: multiple arguments to a subroutine are pushed on the stack in reverse order.

_a$ = 8
_low$ = 12
_high$ = 16
_mid$ = -4
_samples__quicksort PROC NEAR

; 57   : {

	push	ebp
	mov	ebp, esp
	push	ecx                                ; sub esp, 4

; 58   :     int mid;
; 59   : 
; 60   :     if (!(low < high)) {

	mov	eax, DWORD PTR _low$[ebp]
	cmp	eax, DWORD PTR _high$[ebp]
	jl	SHORT $L84

; 61   : 	return;

	jmp	SHORT $L82
$L84:

; 63   : 
; 64   :     mid = partition (a, low, high);

	mov	ecx, DWORD PTR _high$[ebp]
	push	ecx
	mov	edx, DWORD PTR _low$[ebp]
	push	edx
	mov	eax, DWORD PTR _a$[ebp]
	push	eax
	call	_samples__partition
	add	esp, 12					; 0000000cH
	mov	DWORD PTR _mid$[ebp], eax

; 65   :     quicksort (a, low, mid);

	mov	ecx, DWORD PTR _mid$[ebp]
	push	ecx
	mov	edx, DWORD PTR _low$[ebp]
	push	edx
	mov	eax, DWORD PTR _a$[ebp]
	push	eax
	call	_samples__quicksort
	add	esp, 12					; 0000000cH

; 66   :     quicksort (a, mid + 1, high);

	mov	ecx, DWORD PTR _high$[ebp]
	push	ecx
	mov	edx, DWORD PTR _mid$[ebp]
	add	edx, 1
	push	edx
	mov	eax, DWORD PTR _a$[ebp]
	push	eax
	call	_samples__quicksort
	add	esp, 12					; 0000000cH
$L82:

; 67   : }

	mov	esp, ebp
	pop	ebp
	ret	0
_samples__quicksort ENDP

Hello, world!

An oldie but goodie. Note that the string1 constant from the _DATA SEGMENT shown above finally makes its appearance here:

_iota__main	PROC NEAR

; 71   : {

	push	ebp
	mov	ebp, esp

; 72   :     io__print ("Hello World!\N");

	push	OFFSET FLAT:string1
	call	_io__print
	add	esp, 4

; 73   : }

	pop	ebp
	ret	0
_iota__main	ENDP

Assembly Code File Footer

Now, end the text segment and the assembler file:

_TEXT ENDS
END

Iota Implementation

There are a few calls available to assembler code that are not exposed to the Iota programmer:

All that remains is to illustrate how to implement new for a string or array. The following piece of code constructs an Iota string from a statically-allocated string, and then prints it out. Next it creates an array, fills it with some value, and then prints out its ASCII equivalent. It not intended to show code that would be generated by a PA3 compiler; it is intended to illustrate the Iota API.

Also note that the entry point to the program is named _iota__main.

TITLE hello.iota
.386
.model FLAT

EXTRN _io__print:NEAR           ; all externally-defined funcs
EXTRN _iota__newstring:NEAR
EXTRN _iota__newarray:NEAR
EXTRN _iota__debug:NEAR
EXTRN _conv__atos:NEAR

PUBLIC _iota__main

_DATA SEGMENT

    hw BYTE "Hello, world!", 00H
    hwlen DWORD 13
    newline BYTE 0aH, 00H

_DATA ENDS

_TEXT SEGMENT

_s0$ = -4;
_s$ = -8;
_t$ = -12;
_iota__main PROC NEAR

    push ebp                    ; save Base Pointer
    mov ebp, esp                ; Stack Pointer => Base Pointer

    sub esp, 12

    push 1                      ; debug
    call _iota__debug
    add esp, 4

    push DWORD PTR hwlen            ; allocate a new string s0
    call _iota__newstring
    add esp, 4
    mov DWORD PTR _s0$[ebp], eax    ; s0 is a new string

    mov eax, DWORD PTR _s0$[ebp]    ; s = s0
    mov DWORD PTR _s$[ebp], eax

    mov DWORD PTR _t$[ebp], OFFSET hw   ; t = "Hello, world!"

LOOP_TEST:

    mov ecx, DWORD PTR _t$[ebp]     ; ecx = t
    movsx edx, BYTE PTR [ecx]       ; edx = *t
    test edx, edx                   ; edx == 0?
    jz LOOP_DONE

    mov eax, DWORD PTR _s$[ebp]
    mov ecx, DWORD PTR _t$[ebp]
    mov dl, BYTE PTR [ecx]          ; dl is an 8-bit register, for chars
    mov BYTE PTR [eax], dl          ; *s = *t

    add eax, 1
    mov DWORD PTR _s$[ebp], eax     ; s++
    add ecx, 1
    mov DWORD PTR _t$[ebp], ecx     ; t++

    jmp LOOP_TEST

LOOP_DONE:

    mov eax, DWORD PTR _s$[ebp]
    mov BYTE PTR [eax], 0           ; *s = '\0'

    mov eax, DWORD PTR _s0$[ebp]
    push eax
    call _io__print
    add esp, 4

    push OFFSET newline         ; print a newline
    call _io__print
    add esp, 4

    push 65
    push 10
    call _iota__newarray	; { v: array[int] = new int[10] = 65; }
    add esp, 8
    mov DWORD PTR _v$[ebp], eax

    push DWORD PTR _v$[ebp]
    call _conv__atos
    add esp, 4
    
    push eax
    call _io__print
    add esp, 4

    push OFFSET newline
    call _io__print
    add esp, 4

    push 0                      ; no debugging
    call _iota__debug
    add esp, 4

    xor eax, eax                ; return code (same as "mov eax, 0")

    mov esp, ebp                ; old stack frame
    pop ebp                     ; restore Base Pointer
    ret 0                       ; return to caller
_iota__main ENDP

_TEXT ENDS

END

The output (note that iota__malloc is unavailable to both the compiler-writer and the Iota programmer; the output is provided to let you, the compiler-writer, know how much memory is really being allocated):

Z:\tsai\cs412\iota>hello
Debugging turned on.
iota__newstring (13)
iota__malloc (18)
Hello, world!
iota__newarray (10, 65)
iota__malloc (44)
iota__malloc (15)
conv__atos () returned "AAAAAAAAAA"
AAAAAAAAAA
Debugging turned off.

Reference

Compiler-writer routines available to assembler code:

char *
iota__newstring (int numchars);	// number of non-NULL characters

int *
iota__newarray (int numelements, int val);	// number of elements, and 
                                                // their initial value
void
iota__debug (int flag);

void
iota__abort (void);

The routines available to both the compiler-writer and the Iota programmer, from the io and conv modules:

void
io__print (char *s);

void
io__printi (int i);

void
io__putc (int i);

char *
io__readln (void);

int
io__getc (void);

int
io__eof (void);

char *
conv__itos (int i);

int
conv__stoi (char *string, int error);

char *
conv__itoc (int i);

char *
conv__atos (int *a);

int *
conv__stoa (char *s);