CS 4120: Introduction to Compilers
Fall 2011

Programming Assignment 1: Lexical Analysis

due: Wednesday, September 7

In this assignment you will implement a lexer (also called a scanner or a tokenizer) for the Xi programming language. As discussed in lecture 2, a lexer converts a stream of characters into a stream of tokens (also called symbols or lexemes).

Your lexer must implement the interface edu.cornell.cs.cs4120.xi.lexer.Lexer. We encourage you to use a lexer generator such as JFlex in your implementation, but this is not required. If you do use a lexer generator, consider using the adapter pattern to expose the generated scanner through the Lexer interface.
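The adapter pattern mentioned above might look roughly like the following sketch. Everything in it is a stand-in: the nested Lexer interface and its next() method are hypothetical simplifications (the real interface is the one in cs4120-pa1.jar, and its methods may differ), and GeneratedScanner is a hand-written placeholder for whatever class your lexer generator produces. JFlex names its scanning method yylex() by default, which is the only detail borrowed from the real tool.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class AdapterSketch {
    /** Hypothetical stand-in for the course's Lexer interface (the real one is in cs4120-pa1.jar). */
    interface Lexer {
        /** Returns the text of the next token, or null at end of input. */
        String next();
    }

    /** Placeholder for a generated scanner; here it just splits on whitespace. */
    static final class GeneratedScanner {
        private final Reader in;

        GeneratedScanner(Reader in) {
            this.in = in;
        }

        /** Generator-style scanning method: may throw a checked IOException. */
        String yylex() throws IOException {
            int c = in.read();
            while (c != -1 && Character.isWhitespace(c)) {
                c = in.read();                      // skip leading whitespace
            }
            if (c == -1) {
                return null;                        // end of input
            }
            StringBuilder word = new StringBuilder();
            while (c != -1 && !Character.isWhitespace(c)) {
                word.append((char) c);
                c = in.read();
            }
            return word.toString();
        }
    }

    /** The adapter: wraps the generated scanner and presents it as a Lexer. */
    static final class ScannerAdapter implements Lexer {
        private final GeneratedScanner scanner;

        ScannerAdapter(Reader in) {
            this.scanner = new GeneratedScanner(in);
        }

        @Override
        public String next() {
            try {
                return scanner.yylex();
            } catch (IOException e) {
                // Translate the generator's checked exception for callers of Lexer.
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

The point of the design is that the rest of your code depends only on the Lexer interface, so the generated class can be regenerated or replaced without touching its callers.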

We also ask that you provide an implementation of the interface edu.cornell.cs.cs4120.testing.LexerFactory, which simply constructs an instance of your lexer that reads from a provided java.io.Reader. Your implementation class must be public and have a no-argument constructor. You may choose to additionally implement the interface edu.cornell.cs.cs4120.testing.CompilationUnitLexerFactory, which is identical except that it also allows the name of the compilation unit to be passed in.
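One plausible reason for the public class and no-argument constructor requirements is that a test harness can then instantiate your factory reflectively from its class name. The sketch below illustrates that idea only; it is not the course's actual harness. The nested Lexer and LexerFactory interfaces, the newLexer method name, WordLexerFactory, and the load helper are all hypothetical stand-ins, and the real interfaces in cs4120-pa1.jar may declare different methods.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class FactorySketch {
    /** Hypothetical stand-in for the course's Lexer interface. */
    interface Lexer {
        /** Returns the text of the next token, or null at end of input. */
        String next();
    }

    /** Hypothetical stand-in for the course's LexerFactory interface. */
    interface LexerFactory {
        Lexer newLexer(Reader in);
    }

    /** Example factory: public, with a public no-argument constructor. */
    public static final class WordLexerFactory implements LexerFactory {
        public WordLexerFactory() {
        }

        @Override
        public Lexer newLexer(Reader in) {
            // Toy lexer: each whitespace-delimited word is one token.
            return () -> {
                try {
                    int c = in.read();
                    while (c != -1 && Character.isWhitespace(c)) {
                        c = in.read();
                    }
                    if (c == -1) {
                        return null;
                    }
                    StringBuilder word = new StringBuilder();
                    while (c != -1 && !Character.isWhitespace(c)) {
                        word.append((char) c);
                        c = in.read();
                    }
                    return word.toString();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            };
        }
    }

    /** Assumed harness behavior: build a factory from its class name via reflection. */
    static LexerFactory load(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(LexerFactory.class)
                    .getDeclaredConstructor()       // requires a no-argument constructor
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

In this sketch the factory is a nested class in the default package, so the name passed to load would be "FactorySketch$WordLexerFactory". In your submission the class would live in one of your own packages, and you would use its ordinary fully-qualified name.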

Finally, we ask that you submit an overview document describing your work. In the metadata section of the document, please include the fully-qualified class name of your LexerFactory implementation.

Provided code

Lexer, LexerFactory and all other code that you will need for this assignment are packaged in cs4120-pa1.jar. Download this file and add it to your classpath. The source code is available in cs4120-pa1.zip.

For this assignment, you will find that the provided interfaces largely determine the design of your code. We hope that this will simplify the process of completing the assignment and building upon it for the next assignment. You should anticipate that the code provided for subsequent assignments will provide less guidance.

Source control

You are expected to use Subversion as a source control tool. We will be setting up project accounts that you can use for this purpose.

Package names

All Java code that you submit must be contained within a Java package (your code may span multiple packages), and the name of each package must contain the NetId of at least one of your group members. Beyond this constraint, you may name your packages however you like.

Submission instructions

Submit the following:

Planning ahead

This assignment is much smaller than future assignments will be: it is intended primarily as a warmup that gives your group the chance to practice working together. Later assignments will test your ability to work effectively as a group, so this is a great time to learn how to do so. It is also a good time to set up the infrastructure that you will use for the rest of the semester.