CS211          Homework 3

Recursive Descent Parsing

Due: Thursday, Oct. 5, at the beginning of class

 

Watch this space for updates!

 

Purpose:

The goal here is to practice the use of recursion and to learn something about the way compilers work.  A Java compiler has to parse Java code in much the same way you have to parse a simple expression for this assignment.

 

Description:

In this homework, you are to write a recursive Java program that parses and evaluates an infix expression, e.g., (1+x)*2-cos(y/3). As you can see, the expression includes a function call (cos()). Function calls are restricted to sin(), cos(), max() and min() for this assignment.  Variables are restricted to x and y.  The expression is evaluated for several values of x and y by the main program.

 

An expression is defined as follows:

            An expression is one or more terms separated by + or –.

            A term is one or more factors separated by * or /.

A factor is a number or a variable or an expression in parentheses or a function call. Function calls use parentheses to enclose their arguments.

 

The above expression example (1+x)*2-cos(y/3), consists of two terms (1+x)*2 and cos(y/3). (1+x)*2 has two factors (1+x) and 2 and cos(y/3) is itself a factor. Inside the parentheses of (1+x), 1+x is an expression which has its own terms and factors. Same with the argument to the cos() function; y/3 itself is an expression.

 

Your job is to create methods that parse and evaluate expressions, terms, and factors.  Each such method takes no arguments and returns a double.  No arguments are needed because the expression being parsed is accessed via a tokenizer (see below).  The double returned by each method corresponds to the result of evaluating the corresponding expression, term, or factor.  See the online lecture notes for Sept 26 for an example using boolean expressions; the code in this example is similar to the code needed for this assignment.

 

Note that if an expression is organized in the above way (using factor, term and expression), the precedence of operators (+, -, *, /) has been automatically taken care of. To see this, note that before we can evaluate an expression containing ‘+’ and ‘-‘, all the terms in that expression must be evaluated; these contain all the ‘*’ and ‘/’. All arithmetic operators (+, -, *, /) in this assignment are left-associative [i.e., 1-2-3 should evaluate as (1-2)-3 and not as 1-(2-3)].

 

Functions sin() and cos() take a double as argument. Functions max() and min() take two doubles as arguments. All of them return a double. You may use the Java methods

double Math.sin( double )

double Math.cos( double )

double Math.max( double, double )

double Math.min( double, double )

to evaluate the corresponding function call.

 

Provided Code:

We have provided both a main program that runs your expression parser/evaluator and a lexical analyzer (ExpressionTokenizer) that divides a String into tokens.

 

Below are the possible tokens for this homework:

            Operators: ‘+’, ‘-’, ‘*’, ‘/’

Delimiters: ‘(’,  ‘,’,  ‘)’

Constants: numbers of type double

            Variables: x and y

Functions: sin(), cos(), max() and min(); the arguments to max() and min() are separated by comma ‘,’

 

To handle numbers, you need the ability to convert a token (a String) into a double; this can be done using Double.valueOf() and Double.doubleValue() (see http://java.sun.com/products/jdk/1.2/docs/api/java/lang/Double.html
for the methods of the Double class and how to use them; see the solution to HW2 (regular) for an example of how these methods are used).  To simplify the assignment, unary-minus (i.e., a minus sign used to indicate a negative number as in “–3”) is disallowed.  Since the Java tokenizers cannot handle exponents on doubles (e.g., the e-part on 6.02e+23), this type of number is also excluded from our expressions.

 

In the main program, expressions are read in via System.in (i.e., via the keyboard). We assume that each expression fits on one line, so input is read in one line at a time. The output (i.e., evaluation of the expression at various x and y locations) is arranged in the format of a square matrix as in the last homework. A command-line argument is used to specify the size of the matrix. To terminate the main program, enter an empty line.

 

A tokenizer, class ExpressionTokenizer, is also provided.  The expression
(1+x)*2-cos(y/3) will be broken into tokens: “(“, “1”, “+”, “x”, “)”, “*”, “2”, “-“, “cos”, “(“, “y”, “/”, “3” and “)”.  In addition, a special end-of-line token is returned (the String “end-of-line”) when the end of the line is reached.  Note that all tokens are of type String.  Spaces are not considered tokens (e.g., “   1 + 2  ” is tokenized as “1”, “+” and “2”).  ExpressionTokenizer is based on java.util.StringTokenizer; you may want to check on StringTokenizer’s documentation to see how it all works.  To use ExpressionTokenizer, create an instance of ExpressionTokenizer using your expression-to-be-parsed as the argument to the constructor.  It has two methods: nextToken() [fetch the next token] and pushBack() [hold the current token so it can be used again].  To help you with debugging, ExpressionTokenizer contains a public boolean field called “debug”.  If this is set to true then each token is printed as it is returned by nextToken( ).  ExpressionTokenizer also has its own main program; look at this for an example showing how ExpressionTokenizer is used.

 

You should write three recursive parsing methods: one to evaluate an expression, one to evaluate a term and one to evaluate a factor. All these methods should access your input String via ExpressionTokenizer (i.e., the input string should not be used directly).

 

Exceptions:

Handle exceptions wisely.  The main program that we provide catches the exceptions that should occur, but you still need to determine when/where to throw exceptions.  The exceptions caught by the main program are NoSuchElementException, NumberFormatException, IllegalArgumentException, and IOException.  A NoSuchElementException is thrown by the tokenizer if an attempt is made to get a token past the “end-of-line” token; this shouldn’t occur unless you have a mistake in your code.  A NumberFormatException is thrown by Double.valueOf() if an attempt is made to convert a String into a Double when the String does not correspond to a legal double.  The IOException is required by Java when we use readLine() in the main program; this shouldn’t ever occur when reading from the keyboard, but Java requires it.  IllegalArgumentExceptions are used to indicate parse errors.  You need to throw an IllegalArgumentException when a parsing error occurs and you need to include a helpful error message when you throw it.  The main program will catch the IllegalArgumentException, print the error message, and go on to the next input line.  To throw an IllegalArgumentException and include a message with it use:

 

throw new IllegalArgumentException(“Error message”);

 

Efficiency:

For this assignment, don’t worry about making your parser efficient.  The main program evaluates each expression several times; it is fine to reparse the expression each time.  To make a more efficient program we could build a parse tree for the expression, but this is beyond the scope of this assignment.

 

Details:

In order to make your parser work properly with the main program we provide, there are some design restrictions.  You must complete a class called Parser. The input expression is communicated to your Parser class via the class’s constructor (see the main program).  The Parser class must include the method:

public double evaluate (double x, double y);

 

The method evaluate() should return the result of parsing and evaluating the input expression using the given values of x and y.

 

What to hand in:

Hand in a hard copy of all the code you write.  Even if CUCOON, the online grading system, is working, we need the hardcopy as backup.

 

Include enough output to convince us that your program works.  If submitting electronically, place the output in a file called “output.txt”.  The output is used for partial credit if for some reason we cannot get your program to run or if we cannot read your disk.

 

Submit an electronic copy of all your source code.  Your Parser class must reside in a file called “Parser.java”. If CUCOON is working close to the due date, you should submit a single .ZIP file containing your code and your output. Otherwise, you are asked to submit a 3.5” floppy disk.

 

How your program will be graded:

We will run your program with various expressions. Your grade will primarily depend on how your program handles our test input, which will include input that should produce error messages.  If your program does not run properly then some partial credit may be awarded based on your hard copy. Here are some expressions that you can use to test your program.

 

x

.3 + .5*y

(1+x)*2

cos(3.14/3)

min( 1.5, y )

2.5*x*y – 1/2

1 +                   ( exception handling )

x*a                   ( exception handling )

max( y )            ( exception handling )

 

Happy coding!

 

 

Provided Code:

The code is provided in separate files:

Parser.java                               Contains the main program.  Fill in the rest of this class.

ExpressionTokenizer.java            Contains the tokenizer for you to use.

 

To download these files, right-click on them and choose "Save Target As...".  Once you have them as local files, remove the .txt extension.  The .txt extension is present so that you can more easily view the files using a web browser.