CS312 Lecture 1:  Course Overview, Background on ML

Who are we?

Prof. Zabih and a staff of about 14. See web page for details: http://www.cs.cornell.edu/courses/cs312 (single most important piece of info in today's lecture). You will meet the rest of the staff in section and in consulting.

What is CS312 About?

CS312 is the third programming course in the Computer Science curriculum, following CS100 and CS211.  The primary goal of the course is to give students a firm foundation in the fundamental principles of programming and computer science.   Consequently, CS312 covers a broad set of topics including

  1. alternative programming paradigms (beyond imperative and object-oriented programming)
  2. key data structures and algorithms
  3. reasoning about program behavior and complexity
  4. type systems and data abstraction
  5. the design and implementation of programming languages

A major goal in CS312 is to teach you how to program well.  Just about anyone can learn how to throw code together and get simple programs running, but it takes a deep understanding of the principles of computer science to write truly elegant and efficient programs with lasting value. We will try to give you that understanding and teach you some of the craft of programming as well. And practice makes perfect.

Some notes on programming and programming languages

Lots of people vastly overstate the importance of knowing one computer language versus another. In particular, students tend to want to know "What language is the course in?" This is actually a fairly dull question.  It's like worrying about what book you use when you first learn to read (Dick and Jane? Dr. Seuss?). In fact, it's actually fairly silly to even list the computer languages you know on your CV; most professors can't answer this question!

There is an important reason for this: computer languages have a lot in common.  If you know almost any language really well, you can pick up any other language in a few days.  If only foreign languages were so easy (based upon my expertise in Portuguese, I will learn Chinese in under a week...)

The key words here, though, are "really well." This means having a good mental model of what the computer does with a program. How can you tell if you don't have a good model? Suppose your program isn't quite working (it gives the number you want, but is often off by 1, i.e., there is a fencepost error somewhere). If you don't have a good mental model, you tend to change some <'s to <='s.  Why is this bad?  There are LOTS of wrong programs out there, and very few right ones. What are the chances that you will stumble onto the right one?  The lottery has MUCH better odds.

If you have a good mental model, you can just look at the program and think. It's harder, but much more likely to succeed.
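As a small illustration of the difference, here is a sketch in SML. The function names and the problem itself (integer square root by counting up) are made up for this example and are not taken from any course assignment. The buggy version gives the right answer for perfect squares and is off by 1 otherwise; swapping < for <= only moves the error around.

```sml
(* Hypothetical example: integer square root, i.e. the largest k with
   k*k <= n, found by counting up from 0. Assumes n >= 0. *)

(* Buggy: correct for perfect squares, one too big otherwise.
   Changing < to <= does not fix it; it makes perfect squares wrong too. *)
fun isqrtBuggy (n : int) : int =
  let
    fun loop k = if k * k < n then loop (k + 1) else k
  in
    loop 0
  end

(* Thinking instead of typing: keep the invariant k*k <= n true at
   every step, and advance only when k+1 would still satisfy it. *)
fun isqrt (n : int) : int =
  let
    fun loop k = if (k + 1) * (k + 1) <= n then loop (k + 1) else k
  in
    loop 0
  end

val a = isqrtBuggy 9    (* 3, happens to be correct *)
val b = isqrtBuggy 10   (* 4, off by one *)
val c = isqrt 9         (* 3 *)
val d = isqrt 10        (* 3 *)
```

The fix comes from asking what must be true of k each time around the loop, not from trying comparison operators until the output looks right.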

VERY important piece of advice: if you find yourself typing instead of thinking, stop. This is probably your first course with just CS majors, and therefore the assignments will be a lot more work.  You can trash your life (i.e., pull multiple all-nighters) if you try typing instead of thinking.

CS312 slogan #1 (first of many): Thinking is better than typing.

Many of the problem sets, and all of the exams (very important!) have short, elegant answers. It's in your interest to find them.

Given this, you should learn one language well. Ideally, that language should have a simple and elegant model!

Our choice: SML

We use the Standard ML (SML) programming language throughout the course.  SML is a modern functional programming language with an advanced type and module system.   The course is not about programming in SML.  Rather, SML provides a convenient framework in which we can achieve the objectives of the course.  Like the object-oriented model of Java, the functional paradigm of SML is an important programming model with which all students should be familiar, as it underlies the core of almost any high-level programming language. In addition, the SML type and module systems provide frameworks for ensuring code is modular, correct, reusable, and elegant.  The lessons you learn in programming with SML will be applicable to other programming languages such as Java. By studying alternative ways to write programs, you will be better equipped to use, implement, or even design future programming environments.

Another important reason we use SML is that it has a relatively clean and simple model that makes it easier to reason about the correctness of programs.  Indeed, SML was one of the first major programming languages to have a formal semantic definition.   In our studies, we will see that we can reason formally about the functional correctness of code, and also about the space, time, and other resources used in a computation.

Lectures and Recitations

Lectures are Tuesday and Thursday, 10:10 to 11am in Kimball B11. Recitations are Monday and Wednesday at four times (see web page). You are expected to attend both lectures and recitations. You may attend any recitation you want to, but it's probably in your interest to stick with one. Feel free to load-balance.

Course Materials

There is no official textbook for the course. The following books are useful and on reserve at the Engineering Library:

Two convenient online sources that we will be using from time to time are:

Communication

Course web site

The course web site is at http://www.cs.cornell.edu/courses/cs312. You should keep a close eye on this web page. We will post announcements about the course there. The programming assignments will all be posted there too.

Newsgroup

The best way to reach the course staff is by posting questions or comments to the course newsgroup, cornell.class.cs312. Many members of the course staff read the newsgroup and can answer your questions. Read the guidelines on the web page for some tips on newsgroup etiquette.

Email

For questions that would be inappropriate to post to the newsgroup, you can also reach the course staff by sending mail to cs312@cs.cornell.edu. The newsgroup is preferred, however.

Consulting hours

The TAs have regular office hours during the day; consultants have evening consulting hours.  Office hours are on the web. Consulting hours are 7-10pm Sunday through Wednesday in Upson 304A, unless otherwise announced. The night before every project is due (not the night that it is due), we will hold extended consulting hours from 7pm to 12 midnight. Consulting hours will not be held the day after a problem set is due.

Coursework

Problem Sets

The work in this class will consist of five problem sets. The first of these problem sets will be available on the course web site Monday. It is due in one week: 11:59PM the next Tuesday, 9/10. Some problem sets will have written exercises as well as programs to write. The written exercises will in general be due at 4pm on the due date.

Software

You can download a copy of Standard ML of New Jersey (SML/NJ) from the course web site. This includes the Emacs editing environment that you will use to interact with SML and do your programming and debugging.

We will have four sessions demoing this environment next week. Keep your eye on the course web site for updates about the demos.

Prelims & Final

There will be two prelims, October 17 and November 19, held in the evenings. The location is on the web site.

The final is December 13.

Make-up exams are oral; let's try not to have them.

Grading

Last year: 30% A, 40% B, 30% C or less (mostly C). Past performance is no guarantee of future outcomes.

Everything counts, but exams count more. Especially the final, since I have it in front of me when I assign grades.

Background on ML

Our first order of business in this course is to learn how to use ML. Why learn another language?

We use a zillion different programming languages to communicate with machines and each other:

Though there are only a handful of general-purpose languages that you will learn and use, you'll be learning and using special-purpose languages for the rest of your life.  Even general-purpose languages come and go.  Today, it's Java and C++.  Yesterday, it was Pascal and C, before that Fortran and Lisp.  Who knows what it will be like tomorrow? You have to learn how to learn new languages.

In addition, many projects will require that you build "little" languages for gluing things together.

We gain a lot of leverage by having good notation and good language support for a given domain.  

So it's important to understand programming models and programming paradigms because, in this fast-changing field, you need to be able to adapt rapidly.

It's crucial that you understand the principles behind programming that transcend the specifics of today.

There's no better way to get at these principles than to approach programming from a completely different perspective.

This is one reason why we're using ML -- it's very different from what most of you will have seen.


A great general-purpose programming language:

Fact:  there are thousands of general-purpose languages.

Corollary:  there are no great programming languages.  

But there are some pretty good ones.  Java and ML are pretty good general-purpose languages (at least when compared to their predecessors).


SML is a functional programming language.

SML is a statically typed, type-safe programming language.

SML (and SML/NJ in particular) supports a number of advanced features.
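To make "functional" and "statically typed" a little more concrete, here is a minimal sketch. The names compose, double, and addOne are invented for illustration; List.map is part of the standard SML Basis Library.

```sml
(* Functions are ordinary values: compose takes two functions and
   returns a new function. *)
fun compose (f, g) = fn x => f (g x)

fun double (n : int) : int = 2 * n
fun addOne (n : int) : int = n + 1

val doubleThenAddOne = compose (addOne, double)
val seven = doubleThenAddOne 3            (* addOne (double 3) = 7 *)

(* List.map applies a function to every element of a list,
   with no loops and no assignment. *)
val doubled = List.map double [1, 2, 3]   (* [2, 4, 6] *)

(* Static typing: the compiler infers
     compose : ('a -> 'b) * ('c -> 'a) -> 'c -> 'b
   without any annotations, and rejects ill-typed uses before the
   program ever runs. Uncommenting the next line gives a compile-time
   type error, because double expects an int, not a string:

   val bad = double "three"
*)
```

Note that compose needed no type annotations at all; the types on double and addOne are written out only for readability.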


Some history (see Paulson's book for more info):

Robin Milner and others at the Edinburgh (Scotland) Laboratory for Computer Science were working on theorem provers in the late '70s and early '80s. 

Traditionally, theorem provers were implemented in languages such as Lisp.

Milner kept running into the problem that the theorem provers would sometimes put incorrect "proofs" (i.e., non-proofs) together and claim that they were valid.

So he tried to develop a language that only allowed you to construct valid proofs.

"ML" which stands for "Meta Language" was the result of his (and others') work.  The type system of ML was carefully constructed so that you could only construct valid proofs in the language.  A theorem prover was then written as a program that constructed a proof.
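One way to picture this idea, often called the LCF approach, is to make "theorem" an abstract type whose only constructors are the inference rules. The following is a toy sketch under that assumption, using a made-up logic with implication only; the names KERNEL, Kernel, assume, discharge, and mp are illustrative and are not the original system's API.

```sml
(* A toy logic with atoms and implication only. *)
datatype prop = Atom of string | Imp of prop * prop

signature KERNEL =
sig
  type thm                            (* abstract: clients cannot forge one *)
  val assume    : prop -> thm         (* A |- A *)
  val discharge : prop * thm -> thm   (* from G |- B get G - {A} |- A ==> B *)
  val mp        : thm * thm -> thm    (* modus ponens *)
  val hyps      : thm -> prop list
  val concl     : thm -> prop
end

(* Opaque ascription (:>) hides the representation, so the three rules
   above are the only way any program can build a value of type thm. *)
structure Kernel :> KERNEL =
struct
  datatype thm = Thm of prop list * prop

  fun assume a = Thm ([a], a)

  fun discharge (a, Thm (hs, b)) =
    Thm (List.filter (fn h => h <> a) hs, Imp (a, b))

  fun mp (Thm (hs1, Imp (a, b)), Thm (hs2, a')) =
        if a = a' then Thm (hs1 @ hs2, b)
        else raise Fail "mp: antecedent mismatch"
    | mp _ = raise Fail "mp: not an implication"

  fun hyps (Thm (hs, _)) = hs
  fun concl (Thm (_, c)) = c
end

(* A tiny "theorem prover": a program that builds the proof of p ==> p. *)
val p = Atom "p"
val pImpP = Kernel.discharge (p, Kernel.assume p)
```

However buggy the code that calls Kernel is, the type checker guarantees that any thm it produces was built from the rules, which is the sense in which only valid proofs can be constructed.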

Milner also formulated the type-inference system of ML, and proved its soundness.

(It should be noted that Milner also worked on concurrency, developing CCS and the pi-calculus, and later went on to receive the Turing Award, the computer science equivalent of a Nobel Prize, in large part for his work on ML.)

Eventually, this Classic ML evolved into a full-fledged programming language.

In the early '80s, there was a schism in the ML community with the French on one side and the British and Americans on the other.  The French went on to develop Caml and later Objective Caml (OCaml) while the Brits and Americans developed Standard ML.  The two languages are actually quite similar.

What is ML used for today?

In truth, not a lot when compared to something like C, C++, or Java.  ML's real strength lies in language manipulation (e.g., compilers, analyzers, verifiers, and provers).  This is not surprising since ML evolved from the domain of theorem proving.