- Results from the 2023 Compiler Bakeoff are now available.
- Frequently asked questions:
  - I didn't enroll in the course in December. Can I still join the course? Almost certainly.
  - When does CS 4121/5121 meet? At the same time as CS 4120/5120.
  - Is there any difference between 4120 and 5120? The latter is for MEng students and requires a little more work on the project and homework assignments.
  - Is this course going to be similar to last time? Yes.
  - When will this course be offered next? Not certain, but probably in Spring 2025.
This course offers an introduction to the specification and implementation of modern compilers. Topics covered include lexical scanning, parsing, type checking, code generation and translation, an introduction to optimization, and compile-time and run-time support for modern programming languages. As part of the course, students build an optimizing compiler for a simple language.
The best way to reach the course staff is normally by posting on the discussion forum. We will be using Ed Discussions for this purpose. However, for private correspondence you can email individual staff members. Please do not ask multiple course staff members the same question privately via email; that just wastes our time. Ed is normally the right way to ask questions about course content or assignments.
Prerequisites: CS 3110 and either CS 3410 or CS 3420. The practicum (CS 4121 or 5121) is a required co-requisite. You may not take CS 4120 without taking CS 4121 too, and similarly for CS 5120 and CS 5121. The reason for this is that the group project is part of the grade for both 4120 and 4121. Familiarity with programming in Java is also expected.
Most project groups choose to use Java to implement their compiler; however, other languages may be used with the permission of the instructor. We strongly encourage using a language with a strong static type system and automatic garbage collection.
Lectures are MWF 2:40–3:30 in Phillips Hall 101.
You are required to read the course notes posted on the web site. These will often contain more detail than what was presented in lecture.
There is no required textbook this semester, but the following textbooks will be helpful sources of information. All of these books will be available on reserve in Uris Library.
- Modern Compiler Implementation in Java, 2nd ed. Andrew Appel and Jens Palsberg, Cambridge University Press, 2002. ISBN 0-521-82060-X.
- Compilers: Principles, Techniques, and Tools (The “Dragon Book”), 2nd ed. Alfred Aho, Monica Lam, Ravi Sethi, and Jeffrey D. Ullman. Addison-Wesley, 2006. ISBN 0-321-48681-1.
- Engineering a Compiler. Keith Cooper and Linda Torczon. Elsevier/Morgan Kaufmann, 2004. ISBN 1-55860-698-X.
- Advanced Compiler Design and Implementation. Steve Muchnick. Morgan Kaufmann Publishers, 1997. ISBN 1-55860-320-4.
- The Java Language Specification. James Gosling, Bill Joy, and Guy Steele. Addison-Wesley, 1996. ISBN 0-201-63451-1.
Another useful text, on linking and loading, is:
Here are a couple of useful books on coding and software engineering:
- Design Patterns: Elements of Reusable Object-Oriented Software. Gamma, Helm, Johnson, Vlissides, Booch. Addison-Wesley, 1995. ISBN 0-201-63361-2
- Refactoring: Improving the Design of Existing Code. Martin Fowler. Addison-Wesley, 1999. ISBN 0-201-48567-2.
Assignments and Grading
There are four written homework assignments. You may discuss assignments with other students, but the work must be done on an individual basis. The names of any students you discussed the problems with must be recorded with the corresponding problems.
The compiler project is divided up into six programming assignments that are due at various points throughout the term. Compiler projects will be performed by groups of three or four students. The same groups will be maintained throughout the semester, if possible. Peer evaluations will be used to adjust scores on the programming assignments.
Scores will be averaged using quadratic weighting, which means that unrepresentatively low scores (such as missing assignments) will have less weight.
Except in unusual circumstances, you will receive the same grade in CS 4120 and CS 4121 (resp. 5120 and 5121), and all members of a group will receive the same grade on programming assignments. Exceptions to these rules are dealt with on a case-by-case basis.
Participation includes attending and participating in class (either in-person or remotely), asking good questions on the forum, answering questions on the forum, and filling out the course evaluation form.
The breakdown of points per assignment is as follows:
|Homework: 15%||Homework 1||4|
|Programming Assignments: 42%||Programming assignment 1||5|
| ||Programming assignment 2||6|
| ||Programming assignment 3||6|
| ||Programming assignment 4||7|
| ||Programming assignment 5||8|
| ||Programming assignment 6||10|
|Exams: 40%||Prelim 1||15|
The weighted quadratic mean of these scores will be used to determine the final score in the course. The main effect of a quadratic mean is to make unusually low scores count for less and unusually high scores count for more. It usually affects the grades of at most a handful of students.
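To make the effect concrete, here is a minimal sketch that assumes the standard definition of a weighted quadratic mean, sqrt(sum of w_i * s_i^2 divided by sum of w_i), with the programming assignment points used as weights and scores given as fractions between 0 and 1. The class and method names are hypothetical, and the exact formula used by the course staff may differ.

```java
class QuadraticMean {
    /** Weighted quadratic mean: sqrt(sum(w[i] * s[i]^2) / sum(w[i])). */
    static double weightedQuadraticMean(double[] weights, double[] scores) {
        checkInputs(weights, scores);
        double num = 0.0, den = 0.0;
        for (int i = 0; i < weights.length; i++) {
            num += weights[i] * scores[i] * scores[i];
            den += weights[i];
        }
        return Math.sqrt(num / den);
    }

    /** Ordinary weighted arithmetic mean, for comparison only. */
    static double weightedArithmeticMean(double[] weights, double[] scores) {
        checkInputs(weights, scores);
        double num = 0.0, den = 0.0;
        for (int i = 0; i < weights.length; i++) {
            num += weights[i] * scores[i];
            den += weights[i];
        }
        return num / den;
    }

    private static void checkInputs(double[] weights, double[] scores) {
        if (weights.length == 0 || weights.length != scores.length) {
            throw new IllegalArgumentException("need equally many weights and scores");
        }
    }

    public static void main(String[] args) {
        // Weights are the points for programming assignments 1-6.
        double[] weights = {5, 6, 6, 7, 8, 10};
        // Hypothetical scores: 90% on everything except one missing assignment.
        double[] scores = {0.9, 0.9, 0.0, 0.9, 0.9, 0.9};
        System.out.printf("arithmetic mean: %.4f%n", weightedArithmeticMean(weights, scores));
        System.out.printf("quadratic mean:  %.4f%n", weightedQuadraticMean(weights, scores));
    }
}
```

In this example the quadratic mean comes out above the arithmetic mean, so the single missing assignment costs less than it would under plain averaging.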
Assignment late penalties:
We are relatively flexible about late submissions. You have four slip days that may be used during the semester to submit work without penalty. Once slip days are exhausted, a penalty of 10% per day is applied to late work. Any penalties are applied on an individual basis. A weekend counts as a single day for the purpose of computing slip days. Unless otherwise specified, at most two slip days may be used on any one assignment, and the total used over the semester may not exceed four.
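The arithmetic behind this policy can be sketched as follows. This is one possible reading, not the official rule: it assumes lateness is counted in whole calendar days, that Saturday and Sunday together count as one day, and that late days not covered by slip days each cost 10%. All names here are hypothetical.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

class SlipDays {
    static final int MAX_SLIP_PER_ASSIGNMENT = 2;   // from the policy text
    static final double PENALTY_PER_DAY = 0.10;     // 10% per uncovered late day

    /** Counts late days after the due date; Sundays are skipped so a weekend counts once (assumption). */
    static int lateDays(LocalDate due, LocalDate submitted) {
        int days = 0;
        for (LocalDate d = due.plusDays(1); !d.isAfter(submitted); d = d.plusDays(1)) {
            if (d.getDayOfWeek() != DayOfWeek.SUNDAY) {
                days++;
            }
        }
        return days;
    }

    /** Slip days spent on one assignment, limited by the per-assignment cap and the remaining budget. */
    static int slipDaysUsed(int lateDays, int slipDaysRemaining) {
        return Math.min(lateDays, Math.min(MAX_SLIP_PER_ASSIGNMENT, slipDaysRemaining));
    }

    /** Fraction of the score lost to lateness after slip days are applied, capped at 100%. */
    static double penaltyFraction(int lateDays, int slipDaysRemaining) {
        int uncovered = lateDays - slipDaysUsed(lateDays, slipDaysRemaining);
        return Math.min(1.0, uncovered * PENALTY_PER_DAY);
    }

    public static void main(String[] args) {
        LocalDate due = LocalDate.of(2024, 3, 1);        // a Friday
        LocalDate submitted = LocalDate.of(2024, 3, 4);  // the following Monday
        int late = lateDays(due, submitted);
        System.out.println("late days: " + late);
        System.out.println("slip days used: " + slipDaysUsed(late, 4));
        System.out.println("penalty fraction: " + penaltyFraction(late, 4));
    }
}
```

Under these assumptions, a Friday deadline met on the following Monday counts as two late days, which two available slip days would fully cover.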
Slip day usage may be avoided or reduced by obtaining an extension on the assignment. Extension requests should be made via email, copying all group members on the request. They should include a description of why an extension is being requested. Extension requests based on foreseeable events or minor illness will normally be denied; you are expected to handle such events through prior planning or use of slip days.
The prelims will cover material from the textbook, lectures, and homework assignments. Both prelims will be take-home exams. There will be no final exam, but your final report and demo will be due at the appropriate project due date for the course during finals week.
We expect you to participate in class and elsewhere. 2% of the score will be for class participation (in-class, Ed, course evals) and for possible in-class pop quizzes.
The Cornell Code of Academic Integrity will be strictly enforced in this class. A Cornell student's submission of work for academic credit indicates that the work is the student's own. All outside assistance must be acknowledged, and students' academic position must be truthfully reported at all times. We take cheating and fraud very seriously.
- Java CUP 11b extended with counterexample generation (from Polyglot).
- JFlex is a good lexer generator that we encourage you to use.
- The Compiler Explorer is a useful tool for seeing what assembly code is generated by a wide variety of compilers.
- You will be generating a lot of graphs and trees in this course. Graphviz is a useful tool for visualizing them.
- The Polyglot parser generator (PPG) supports counterexamples and grammar inheritance.
- EasyIO is a library for scanning input, with support for lookahead and backtracking.
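As an illustration of how a tree might be handed to Graphviz, the following minimal sketch prints a small expression tree in the DOT language. The `Node` class and `toDot` method are hypothetical and not part of any course library; a real compiler would walk its own AST or control-flow graph classes instead.

```java
import java.util.Arrays;
import java.util.List;

class DotExample {
    /** A hypothetical tree node with a label and ordered children. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Renders the tree rooted at {@code root} as a DOT digraph. */
    static String toDot(Node root) {
        StringBuilder sb = new StringBuilder("digraph AST {\n");
        emit(root, new int[] {0}, sb);
        return sb.append("}\n").toString();
    }

    /** Emits one node and its subtree; returns the numeric id assigned to the node. */
    private static int emit(Node n, int[] nextId, StringBuilder sb) {
        int id = nextId[0]++;
        // Escape backslashes and double quotes so the label is a valid DOT string.
        String label = n.label.replace("\\", "\\\\").replace("\"", "\\\"");
        sb.append("  n").append(id).append(" [label=\"").append(label).append("\"];\n");
        for (Node child : n.children) {
            int childId = emit(child, nextId, sb);
            sb.append("  n").append(id).append(" -> n").append(childId).append(";\n");
        }
        return id;
    }

    public static void main(String[] args) {
        // The expression 1 + 2 * x as a tree.
        Node tree = new Node("+",
                new Node("1"),
                new Node("*", new Node("2"), new Node("x")));
        System.out.print(toDot(tree));
    }
}
```

Saving the output to a file such as `ast.dot` and running `dot -Tpng ast.dot -o ast.png` should produce an image of the tree.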
The default programming language for this course is Java, though some project groups may choose to use other programming languages. Java is a fairly good programming language for building a compiler. Its strong static typing, automatic garbage collection, and modular data abstraction are all very helpful when building complex software such as compilers. The tool support for Java is also excellent.
The Java API is very useful for learning how to use the many existing Java class libraries.
The Java Language Specification is helpful if you want to really understand how Java works.
For students with limited Java experience, we recommend the online notes from CS 1130, Transition to Object-oriented Programming, as a refresher. This is a self-paced course consisting of several modules that you can go through at your leisure.
Review the introductory chapters in the textbook and the Java reference books listed on the course info page.
See Oracle's official Java Tutorial.
There are many valuable resources that can help you take your programming skills to the next level. Here are a few links:
Software Development Methodologies
Just for Fun
- Teach Yourself Programming in Ten Years
- Quines (Self-Reproducing Programs)
- Programming Quotes
- How To Become A Hacker
- How To Write Unmaintainable Code
- Software Bugs & Glitches
- Doom for System Administration
- The International Obfuscated C Code Contest
- The Easter Egg Archive
- Esoteric Programming Languages
The Cornell University Library runs several computer labs across campus for all members of the Cornell community. The JDK and Eclipse are installed on these machines. Check here for locations and times of operation.
Academic Excellence Workshops
The Academic Excellence Workshops (AEW) offer an opportunity for students to gain additional experience with course concepts in a cooperative learning environment. Research has shown that cooperative and collaborative methods promote higher grades, greater persistence, and deeper comprehension. The material presented in the workshop is at or above the level of the regular course. We do not require joining the AEW program, but do encourage students to join if they are seeking an exciting and fun way to learn. The AEW carries one S/U credit based on participation and attendance. The time commitment is two hours per week in the lab. No homework will be given. This is a wonderful opportunity for students to seek extra help on course topics in a small group setting.
Your fellow undergraduate students, who are familiar with the course material, teach the sessions with material that they prepare. The course staff provide guidance and support but do not actually teach the AEW course content or any session. A representative from the AEW program will be speaking about the program and registration procedures in lecture.
Your AEW liaison for this semester is Jason Zhao.
See the AEW webpage for further information.
Other Support Services
|Student Web Services||This website collects services that are more general.|
|Engineering Advising||Academic advising for engineering students.|
|Arts College Student Services||A listing of general support services for a variety of concerns students may have.|
|Learning Styles||Not everyone learns the same way. If you are curious about how you learn, check out this collection.|
|Cornell Health||Cornell Health provides services for all health-related concerns to students on Cornell's Ithaca campus.|
|CAPS||If you are experiencing emotional distress, we urge you to contact CAPS, the Counseling and Psychological Services.|
- Homework 1: Lexical Analysis
- Homework 2: Syntactic Analysis
- Homework 3: Semantic Analysis
- Homework 4: Program Analysis
- Programming Assignment 1: Implementing Lexical Analysis
- Programming Assignment 2: Implementing Syntactic Analysis [pretty-printer documentation]
- Programming Assignment 3: Implementing Semantic Analysis
- Programming Assignment 4: Intermediate Code Generation
- Programming Assignment 5: Assembly Code Generation
- Programming Assignment 6: Optimization and Extension
Other resources for programming assignments:
- How to lose in CS 4120
- How to pick an implementation language for your project
- Overview document requirements for group project assignments
- Eta language specification
- Simple Eta code examples
- Eta syntax highlighting support for vim.
- etac test harness
- Docker VM Setup Instructions
- Eta type system specification (source)
- Eta ABI specification
- Eta runtime
- Rho language specification
- Rho example programs: snake, mandelbrot
- RhoO language specification (optional)