Problem Set 2: Folding and Datatypes

Due February 17, 2011

Important notes:

No partners: You are to work alone on this assignment.
Compile errors: All programs that you submit must compile. Programs that do not compile will probably receive an automatic zero. If you are having trouble getting your assignment to compile, please visit consulting hours. If you run out of time, it is better to comment out the parts that do not compile and hand in a file that compiles, rather than handing in a more complete file that does not compile.
Missing functions: We will be using an automatic grading script, so it is crucial that you name your functions and order their arguments according to the problem set instructions, and that you place the functions in the correct files. Otherwise you may not receive credit for a function properly written.
Code style: Please pay attention to style. Refer to the CS 3110 style guide and lecture notes. Ugly code that is functionally correct may still lose points. Take the extra time to think through the problems and find the most elegant solutions before coding them up.
Wrong versions: Please double-check that you are submitting the correct versions of your files to CMS. The excuse, "I had it all done, but I accidentally submitted the wrong files" will not be accepted.

Part 1: List Folding (30 points)

Folding is one of the most useful tools in functional programming. You will certainly use it many times over the course of the semester. The following exercises are designed to get you comfortable with folding to do useful computations on lists. Note: All of these functions can be written without folding, but the point of the exercise is to get used to folding, so every function must include either List.fold_left or List.fold_right. Note that List.fold_left is tail recursive and List.fold_right is not. For full credit, do not use the rec keyword.

(5 points) Write a function map : ('a -> 'b) -> 'a list -> 'b list that is an implementation of the List.map function using only List.fold_left or List.fold_right.

Example: map succ [1;2;3] = [2;3;4].
(5 points) Write a function simple_stats : float list -> (float * float * float * float * int) . Given a list of floats, simple_stats finds the min, max, mean, range, and number of elements in the list passed in (range is the difference between the min and the max). For full credit, you are restricted to use only one list folding function, and all of the work must be done inside of this function.

Example: simple_stats [1.0;1.0;2.0;3.0;5.0;8.0;11.0] = (1., 11., 4.4285714285714288, 10., 7)
(10 points) Write a function hex_to_decimal : char list -> int . Given a hexadecimal number represented as a list of characters, it will convert the hexadecimal number into a integer representing the hex number in decimal.

Example: convert_hex_to_decimal ['2';'A'] = 42

You may not assume that valid characters will be passed in and should therefore consider checking for cases where the input does not have characters 0-9, A-F. (a-f should be rejected). In this case, you should raise InvalidCharacter .

Example: convert_hex_to_decimal ['K'] = Exception: InvalidCharacter "Invalid char".

(10 points) Write a function mystery_base_addition : int list -> int list -> int -> int list . Given two lists of nonnegative integers in the range 0 to b-1, each representing the base-b digits of a number from higher to lower significance, you must add the two numbers together.

Example: mystery_base_addition [0;0;1] [0;0;1] 2 = [1;0]
mystery_base_addition [2;7] [1;5] 10 = [4;2].

You should not assume that valid digits for base b will be passed in and should therefore consider checking for cases where the input contains integers that are not in range. In this case, you should raise InvalidNumber .

Example: mystery_base_addition [2;7] [1;5] 3 = Exception: InvalidNumber "Higher than base".

Hint: For this problem, you may find it to useful to use List.fold_left2 or List.fold_right2.

Part 2: Registers, Memory and Assembly Language (70 points)

In order for a computer to execute any program, all instructions must be converted into machine code. Assembly language is simply a readable representation of machine code. In order to store values and access values, assembly languages refer to the computer's registers—a very small amount of storage available on the CPU that can be accessed much faster than any other type of memory. In most cases, the CPU does not have enough registers to store all of the values that it may need at one point, so main memory (usually just referred to as memory) is available as well. Data can be moved back and forth from memory to the registers. Our objective in this problem set is to simulate the registers and memory using arrays in order to execute assembly language commands that will be loaded from a text file.

You have been provided with three arrays. The instructionArray will be used to store the instructions that are loaded from a text file. This instructions will then be executed. The memoryArray will be used to represent the computer's main memory, and the registerArray will be used to represent the computer's register set. Both the memory and registers are being represented as string arrays, because in this problem set they will always store hexadecimal numbers.

Every processor has its own machine code instruction set. We will be using a very small and simplified subset of the assembly language instructions provided by the MIPS architecture. In general, you can expect to see the following things in a file containing MIPS assembly instructions:

Instructions and registers: we will only be using a small subset of nine instructions (explained below). In order to access a register using an assembly instruction, the register's name must be specified. For simplicity, all registers will be named in the form $rx where x is an integer from 0 to 7.
Numbers: you can expect any numbers you find within instructions to be in decimal. This is the opposite of the memory and register arrays, where their values must always be hexadecimal strings.
Labels: these are individual lines always in the form "some_word:". They do not count as an instruction. They are used as a way of labeling the line immediately following. These labels are used for jump targets.
Comments: anything following a '#' is a comment and is not part of an instruction. Comments can be thrown away.

The nine instructions we will be using are as follows:

`lw $dest offset($base)`	Load Word: loads the value found in memory location [$base + offset] into the register $dest. Here $base refers to the contents of register $base and offset is a decimal number.
`sw $source offset($base)`	Store Word: stores the value found in register $source in memory location [$base + offset].
`addi $dest $reg imm`	Add Immediate: computes $reg + imm and stores it in $dest. Imm is a decimal number.
`add $dest $reg1 $reg2`	Add: computes $reg1 + $reg2 and puts the result in $dest. The registers do not have to be different.
`sub $dest $reg1 $reg2`	Sub: computes $reg1 - $reg2 and stores it in $dest.
`j label`	Jump: jumps to the instruction specified by label. Note that the label declaration must have a colon (:), but references to it in jump instructions do not.
`jr $reg`	Jump Register: jumps to the n-th instruction, where n is the value stored $reg. e.g. if $r1 = "A", then `jr $r1` jumps to the tenth instruction.
`beq $reg1 $reg2 label`	Branch on Equal: compares $reg1 and $reg2 for equality. If they are equal, then jump to the instruction specified by label.
`bne $reg1 $reg2 label`	Branch on Not Equal: if $reg1 does not equal $reg2, then jump to the instruction specified by label.

Working with arrays:

Using arrays in OCaml is very simple. To access an element, such as the second index in the registerArray, simply do the following:

registerArray.(2)

This always evaluates to a type equal to the type of the array (in this case, string).

Array elements are mutable (assignable). To mutate an element in an array, such as the third index in the memory array:

memoryArray.(3) <- "Some_Hexadecimal_Number"

This will always evaluate to a type unit.

Important Notes:

You are not allowed to create any additional arrays or any other form of mutable values other than the ones that have been given to you.
Remember you wrote a hexadecimal to decimal conversion function in Part 1. You should use that function in this part.
Even though this problem involves mutating arrays (which is imperative), you will quickly find that most of this problem is functional. This is not a step away from functional programming back into imperative programming.

Your task:

(30 points) Complete the function create_instruction_list. You have already been supplied with a function split_lines that splits a line into a string list. For example, the line
lw $r1 12($r3) #Short comment
would get split into the following string list:
["lw"; "$r1"; "12"; "$r3"; "#Short"; "comment"]

It is your job to take the string list list given to you by split_lines and convert it into an instruction list. Remember that all comments must be ignored, and you must raise a Failure when there is an incorrectly formed or incorrectly named instruction.
(5 points) Implement the function populate_instruction_array.
(20 points) Implement the nine assembly functions specified in the table above. Remember to raise an IllegalMemoryAccess exception whenever an invalid register, memory location or instruction number is trying to be accessed. Also, raise a Failure if the "addi" or "sub" instructions result in a negative number (this is because we are not implementing hexadecimal representation for negative numbers).
(15 points) Implement the function run_program. We have provided you with a simple text file "mult.txt" that carries out multiplication of two numbers. Entering the following two lines in the OCaml toplevel:
run_program "mult.txt";;
print_registers();;

Should give the following output:
Reg.0 Hex: 0 Dec: 0 Reg.1 Hex: 2A Dec: 42 Reg.2 Hex: 6 Dec: 6 Reg.3 Hex: 0 Dec: 0 Reg.4 Hex: 2A Dec: 42 Reg.5 Hex: 0 Dec: 0 Reg.6 Hex: 0 Dec: 0 Reg.7 Hex: 0 Dec: 0
If you look into the mult.txt file you will notice that not all of the nine assembly instructions are being used. This means that to fully test your implementations, you will have to write your own simple assembly instructions.

Karma

Implement an add subroutine that works with the assembly language you have implemented. Demonstrate its operation by adding 27 to 15. Specifications:

The only add calls you may make in your subroutine are addi calls. You may not use add in your code.
Your program must store the result in register $r3.

Submit your solution as add.txt. We will test your subroutine by using run_program "add.txt".