Announcements
05/09/2003
Please schedule your final project presentation using this form.
05/05/2003
The final project is due May 12 at 4pm.
The final exam is on May 14 (12:00-2:30pm, Phillips 219).
Project presentations will be scheduled for May 15 and 16.
04/25/2003
The solution to assignment 4 is out. Due..
01/21/2003
Please note that most questions (regarding assignments/projects/general questions about the course)
should be posted to the newsgroup, not sent by email.
If the link above does not work for you, here is how to connect to the newsgroup using Microsoft Outlook Express:
- Go to Tools|Accounts.
- A dialog box will appear. Click on the Add button and select "news".
- Fill in your nickname, email address and use "newsstand.cit.cornell.edu" for the news server.
A folder named "newsstand.cit.cornell.edu" will be created.
- Right-click the folder and select Properties. In the "Server" tab, check "This server requires me to log on".
Use your netid for the account name, and your Bear Access network password for the password field.
- Click on Tools|Newsgroups to download the list of newsgroups on the server. Add "cornell.class.cs478" to the list of subscribed newsgroups.
Additional information on how to access Cornell’s news server using Bear Access and other news applications can be obtained here
Time and Place
- Tuesday, Thursday: 11:40-12:55, Thurston 203
Personnel
Course Syllabus (.ps, .pdf)
Course Information (.doc) (last modified 1/22)
Academic integrity policy
Checklist (what we have covered so far):
- Introduction, What is Machine Learning?
- Non-metric methods:
- Concept Learning (candidate-elimination, inductive bias)
- Decision trees (ID3, C4.5, pruning methods)
- Bayesian Learning:
- Bayesian decision theory
- Sequential inference
- ML and Bayesian parameter estimation
- Hypothesis evaluation using Bayes' theorem
- Bayes optimal classifier
- Gibbs algorithm
- Graphical models
- Bayesian belief networks
- Hidden Markov Models - the evaluation and decoding problems
- Hidden Markov Models - the learning problem
- The EM algorithm
- Nonparametric Techniques:
- Density Estimation
- The nearest neighbor algorithm
- Linear discriminant functions:
- LD functions and decision surfaces
- The perceptron criterion function
- The sum-of-squared-error criterion function
- Gradient descent procedures
- Relaxation (error-correcting) procedures
- Least-mean-squared (LMS) procedures (also known as minimum-squared-error, MSE, procedures)
- Artificial Neural Networks
- Feedforward operation
- Backpropagation algorithm
- Learning curves
- Feature mapping
- Improving performance (practical tips)
- Stochastic methods
- Genetic algorithms
- Genetic programming
- Unsupervised learning
- Mixture densities
- The maximum likelihood estimates
- The iterative EM clustering algorithm
- The k-means clustering algorithm
- Hierarchical/pairwise clustering
- Principal component analysis
- Multi-dimensional scaling
- Hypothesis evaluation
- Sample error vs. true error
- Confidence intervals
- Comparing hypotheses
- Comparing learning algorithms (for a specific target function)
- The minimum description length principle
- Algorithm-independent Machine Learning (general principles of ML)
- The no free lunch theorem
- Bias vs. Variance
- Sampling and validation techniques (jackknife, bootstrapping)
- Bagging and Boosting
Introduction [mostly Mitchell Ch1] (.ps, .pdf)
Concept Learning [Mitchell Ch2] (.ps, .pdf)
Decision Trees - part 1 [Mitchell Ch3, Duda/Hart/Stork Ch8] (.ps, .pdf)
Decision Trees - part 2 [Mitchell Ch3, Duda/Hart/Stork Ch8] (.ps, .pdf)
Bayesian decision theory - part 1 [Duda/Hart/Stork Ch2] (.ps, .pdf)
Bayesian decision theory - part 2 [Duda/Hart/Stork Ch2] (.ps, .pdf)
Added 03/01/2003
Bayesian decision theory (sequential inference) - part 3 (.ps, .pdf)
Bayesian learning theory - part 4 [partly from Duda/Hart/Stork Ch3] (.ps, .pdf)
Bayesian learning theory - part 5 [mostly Mitchell Ch6] (.ps, .pdf)
Bayesian learning theory - part 6 [mostly Mitchell Ch6] (.ps, .pdf)
Bayesian networks [mostly Duda/Hart/Stork Ch2, Mitchell Ch6] (.ps, .pdf)
Hidden Markov Models - part 1 [partly Duda/Hart/Stork Ch3] (.ps, .pdf)
Hidden Markov Models - part 2 [partly Duda/Hart/Stork Ch3] (.ps, .pdf)
The EM algorithm (.ps, .pdf)
Nonparametric Techniques [Duda/Hart/Stork Ch4, Mitchell Ch8] (.ps, .pdf)
Linear Discriminant Functions [Duda/Hart/Stork Ch5] (.ps, .pdf)
Artificial Neural Networks I [Duda/Hart/Stork Ch6, Mitchell Ch4] (.ps, .pdf)
Artificial Neural Networks II [Duda/Hart/Stork Ch6, Mitchell Ch4] (.ps, .pdf)
Stochastic methods (genetic algorithms) [Mitchell Ch9, Duda/Hart/Stork Ch7] (.ps, .pdf)
Unsupervised learning I - clustering algorithms [Duda/Hart/Stork Ch10] (.ps, .pdf)
Unsupervised learning II - dimensionality reduction algorithms [Duda/Hart/Stork Ch10] (.ps, .pdf)
Hypothesis evaluation [mostly Mitchell Ch5] (.ps, .pdf)
Algorithm-independent Machine Learning I [Duda/Hart/Stork Ch9] (.ps, .pdf)
Algorithm-independent Machine Learning II [Duda/Hart/Stork Ch9] (.ps, .pdf)
Machine Learning - Overview (.ps, .pdf)
Assignments
Note: you may work on an assignment with one other student, but
each of you must submit the assignment separately, written in your own
words. Acknowledge the student with whom you worked on the
assignment.
- Assignment #1 (ps, pdf)
Solution of assignment 1: part A (problems 1,4) word and part B (problems 2,3,5,6,7) ps, pdf
- Assignment #2 (ps, pdf)
Solution of assignment 2: ps, pdf
- Assignment #3, due March 14 at 4pm
- Assignment #4 (ps, pdf)
Due April 1st.
Sample input file. The pattern is of length 10. The output format should be:
sequence 1 position 30 THEPATTERN
sequence 2 position 12 THEPATTERN
...
sequence n position 23 THEPATTERN
likelihood ratio: 57.2
New tests for the Gibbs sampling algorithm: report your results on these two
files: test1 (L=10) and test2 (L=18).
04/25/2003
Solution of assignment 4 - part1 ps, pdf, and part2 code
- Assignment #5, due April 15 at 11am
- Assignment #6 (ps, pdf), due April 23
sample data, output format
The MDL principle (ps, pdf)
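The output format requested for Assignment #4 above (one line per sequence, then the likelihood ratio) could be produced along these lines. This is only a sketch, not part of the assignment handout: the function name `format_report` and the `(position, pattern)` tuple layout are hypothetical, the single decimal place is inferred from the "57.2" example, and whether positions are counted from 0 or 1 is not stated in the assignment, so the code prints whatever positions it is given.

```python
def format_report(hits, likelihood_ratio):
    """Build the report text for a list of motif hits.

    hits: list of (position, pattern) tuples, one per input sequence,
          in the same order as the sequences in the input file.
    likelihood_ratio: the score of the reported alignment.
    Both argument layouts are assumptions made for this sketch.
    """
    # Sequences are numbered from 1, as in the example output.
    lines = [
        f"sequence {index} position {position} {pattern}"
        for index, (position, pattern) in enumerate(hits, start=1)
    ]
    # One decimal place is an assumption based on the "57.2" example.
    lines.append(f"likelihood ratio: {likelihood_ratio:.1f}")
    return "\n".join(lines)


if __name__ == "__main__":
    example_hits = [(30, "THEPATTERN"), (12, "THEPATTERN")]
    print(format_report(example_hits, 57.2))
```

Keeping the formatting in one function makes it easy to check the report against the required layout before running on test1 and test2.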
Final Project
You are encouraged to work on the project in pairs. Please register
as soon as you know who your partner is.
- Project ideas: Some project ideas are listed here. Check also
projects from previous years (2001,
2002). Original ideas for projects
are most welcome. Graduate students are welcome to suggest a
project which is related to their research topic.
All projects are practical ("experimental") and involve designing and
implementing a learning system.
- Project proposal: one or two paragraphs
specifying the problem you are focusing on, the learning system(s)
that you are going to apply,
any modifications/improvements that you are considering implementing,
and the means by which you are going to evaluate your learner (using a
benchmark or a validation technique, etc).
The goal of the proposal is to
make sure that you have chosen a feasible project, and that you address the
important issues. Project proposals are due March 28.
- Final project: due May 12 at 4pm (see the 05/05/2003 announcement).
Information on what should be in the final report is available here