Problem Set 2: Folding and Datatypes

Due 2/12/09

Instructions

This problem set has four parts. You will write your solution to part one in the file folding.ml; you will write your solution to the other three parts in the file lecture.ml. To get you started, we have provided these two stub files. You should download and edit these files. Once you have finished, submit your solution using CMS, linked from the course website.


Three important notes about grading:

  1. Compile errors: All programs that you submit must compile. Programs that do not compile will probably receive an automatic zero. If you are having trouble getting your assignment to compile, please visit consulting hours. If you run out of time, it is better to comment out the parts that do not compile and hand in a file that compiles, rather than handing in a more complete file that does not compile.
  2. Missing functions: We will be using an automatic grading script, so it is crucial that you name your functions and order their arguments according to the problem set instructions, and that you place the functions in the correct files. Otherwise you may not receive credit for a function properly written.
  3. Code style: Finally, please pay attention to style. Refer to the CS 3110 style guide and lecture notes. Ugly code that is functionally correct may still lose points. Take the extra time to think out the problems and find the most elegant solutions before coding them up.

Part 1: Folding (24 points)

Folding is one of the most useful tools in functional programming and you will certainly use it many times over the course of the semester. The following exercises are designed to get you comfortable with folding to do useful computations on lists. Note: All of these functions can be written without folding, but the point of the exercise is to get used to folding, so every function must include either List.fold_left or List.fold_right.

Here we introduce type student = int * float * (char option). Abstraction Function: the int in the tuple represents the student's id number, the float represents the raw numerical score and the char option represents the letter grade. This datatype will be used throughout the problem set.

  1. (5 pts) Using folding, write a function getMeanList : student list -> float that calculates the mean score of all students in the class.
  2. (5 pts) Using folding, write a function splitList : student list -> (student list) * (student list) that splits a list of students into two separate lists, with all of the students in odd positions in the original list being placed into the first list, and all of the students in even positions being placed into the second list. (Note: you may not use the @ operator)
  3. (7 pts) Using folding, write a function allSuffixes: 'a list -> 'a list list that returns a list of all non-empty suffixes of the given list, ordered from longest suffix to shortest suffix. For example, allSuffixes[1;2;3;4] = [[1;2;3;4]; [2;3;4]; [3;4]; [4]].
  4. (7 pts) Using folding, write a function allPrefixes: 'a list -> 'a list list that returns a list of all non-empty prefixes of the given list, ordered from shortest prefix to longest prefix. For example, allPrefixes [1;2;3;4] = [[1]; [1;2]; [1;2;3]; [1;2;3;4]].

Part 2: Binary Search Tree (52 points)

In this section you will be implementing an efficient way to store the grades of students in a lecture for easy access and processing, in this case as a binary search tree. Your binary search tree should order students by id number. Recall that when an element is added to a binary search tree, if it is larger than the root, then it is recursively added to the right subtree, and if it is smaller than the root, it is recursively added to the left subtree. Hence, an inorder traversal of the tree would operate on students in ascending order of id number. Since no two students have the same id number, attempting to insert a student whose id number is already in the tree should raise an error. We have provided:

type bST = Empty | Tree of node

and node = {data: student; left: bST; right:bST} .

  1. (5 pts) Write a function add : bST -> student -> bST that adds a student into a bST and returns the new bST . Remember, if the student is already in the tree, your code should raise an error.
  2. (5 pts) Write a function get : bST -> int -> student option that takes in a tree and an id number and returns the student associated with that number if there is one, and None if there is not.
  3. (20 pts) Write a function fold : ('a -> student -> 'a) -> 'a -> bST -> 'a that applies the given function to each node in the tree and the accumulator, then returns the final value of the accumulator. This fold should be an inorder traversal of the binary search tree, i.e. it should be applied to students in ascending order of id number. You would call fold as follows: fold function accumulator tree. function is a function that takes in an accumulator of type 'a and a student and returns the accumulator of type 'a. accumulator is the accumulator of type 'a. tree is the bST you are folding over. Your code should be less than five lines.
  4. (5 pts) Write a function toList: bST -> student list that folds over the binary search tree and returns a list of students in order by id number (you must use your fold function from #3, and you may not use the @ operator)). Your code should be less than five lines.
  5. (5 pts) Write a function getNumStudentsBST: bST -> int that folds over the binary search tree and returns the number of students (you must use your fold function from #3). Your code should be less than five lines.
  6. (5 pts) Write a function getMeanBST : bST -> float that folds over the binary search tree and returns the mean numerical score (you must use your fold function from #3). Your code should be less than five lines.
  7. (5 pts) Write a function getVariance : bST -> float that folds over the binary search tree and returns the variance of the numerical scores (you must use your fold function from #3). The variance is defined as the average squared distance from the mean. So, to calculate the variance, you must sum the squares of the difference between each point and the mean, then divide by the total number of points in the distribution. Your code should be less than five lines.
  8. (2 pts) Write a function getStdev : bST -> float that returns the standard deviation of the numerical scores (the square root of the variance, so it would be wise to use your getVariance function from #4). Your code should be less than five lines.

Part 3: Assigning Grades in a Lecture (24 points)

Now we need to assign a letter grade to each student in the lecture. We have provided:

type lecture = {students: bST; mean: float; sdev:float}

  1. (4 pts) Write a function formLecture : student list -> lecture that takes in a list of students and returns a lecture, meaning you need to compute the mean and the standard deviation and include them in the returned record (You should call the functions you wrote for Part 2).
  2. (10 pts) Write a function mapToZScores: lecture ->(student * float) list that folds over the students in the lecture and outputs a list with a tuple for each student and their z score. A z score is a statiscal concept that measures how many standard deviations a point in a distribution is from the mean of the distribution. So, if you have a score and want to find the z score, just subtract the mean from the score and divide by the standard deviations (z scores can, and will often, be negative). Again, you should be calling function(s) from Part 2.
  3. (10 pts) Write a function assignGrades : (student * float) list -> student list that takes in a list of (student,float) tuples, where the float represents a z score, and outputs a list of students with the char option in the student tuple being Some(letter grade) instead of None. Assign the grades as follows: z score less than -1.2 is an F, z score greater than or equal to -1.2 and less than -.4 is a D, z score greater than or equal to -.4 and less than .4 is a C, z score greater than or equal to .4 but less than 1.2 is a B, and a z score greater than 1.2 is an A. You must use a List fold method in your implementation.

Part 4: MergeSort (Karma problem)

In this karma problem, you will write a mergeSort that sorts the list of students produced at the end of Part 3 by their letter grades.

  1. (Karma) Write a function compareByGrade : (student * student) -> int that takes in two students and compares the grade, represented by the char option (the third element in the tuple), of the first student to the second student. If the first student has a better grade it should return 1. If the two students have the same grade, it should return 0. If the second student has a better grade, it should return -1. The grades are A,B,C,D, and F. Hint: You may use Char.compare. Note: if one or both of the students has a value of None in the char option, your code should raise an error.
  2. (Karma) Write a function mergeSort : student list -> student list that takes in a list of students, performs a merge sort on the students and returns a list of students ordered by their letter grade (ordered with As at the head of the list and Fs at the tail).