Cornell University

Spring 2002

- Tuesday, Thursday: 11:40-12:55, Thurston 203

| | | Office Hours | Office |
|---|---|---|---|
| Instructor | Golan Yona golan@cs.cornell.edu | Tuesday 2:00-3:00 pm, Thursday 2:00-3:00 pm | Upson 5156 |
| Teaching Assistants | Chee Yong Lee cheeyong@cs.cornell.edu | Tuesday 9:00-10:00 am | Upson 328 |
| | Aleksandr Gilshteyn ag75@cornell.edu | Friday 11:00 am-12:00 pm | Upson 328 |

Course Information (.doc) *(last modified 1/22)*

Checklist (what have we covered so far):

- Introduction: What is Machine Learning?
- Non-metric methods:
  - Concept learning (candidate-elimination, inductive bias)
  - Decision trees (ID3, C4.5, pruning methods)
- Bayesian learning:
  - Bayesian decision theory
  - Sequential inference
  - ML and Bayesian parameter estimation
  - Hypothesis evaluation using Bayes' theorem
  - Bayes optimal classifier
  - Gibbs algorithm
- Graphical models:
  - Bayesian belief networks
  - The EM algorithm
  - Hidden Markov Models: the evaluation and decoding problems
  - Hidden Markov Models: the learning problem
- Nonparametric techniques:
  - Density estimation
  - The nearest neighbor algorithm
- Linear discriminant functions:
  - LD functions and decision surfaces
  - The perceptron criterion function
  - The sum-of-squared-error criterion function
  - Gradient descent procedures
  - Relaxation (error-correcting) procedures
  - Least-mean-squared (LMS) procedures (also known as minimum-squared-error, MSE, procedures)
  - Stochastic (single-sample) procedures
  - Batch procedures
- Artificial neural networks:
  - Feedforward operation
  - Backpropagation algorithm
  - Learning curves
  - Feature mapping
  - Improving performance (practical tips)
- Stochastic methods:
  - Genetic algorithms
  - Genetic programming
- Unsupervised learning:
  - Mixture densities
  - The maximum likelihood estimates
  - The iterative EM clustering algorithm
  - The k-means clustering algorithm
  - Hierarchical/pairwise clustering
  - Principal component analysis
  - Multi-dimensional scaling
- Hypothesis evaluation:
  - Sample error vs. true error
  - Confidence intervals
  - Comparing hypotheses
  - Comparing learning algorithms (for a specific target function)
  - The minimum description length principle
- Algorithm-independent machine learning (general principles of ML):
  - The no free lunch theorem
  - Bias vs. variance
  - Sampling and validation techniques (jackknife, bootstrapping)
  - Bagging and boosting

**Project ideas:** Some project ideas are listed here. Check also last year's projects (.pdf, .ps). Original ideas for projects are most welcome. Graduate students are welcome to suggest a project related to their research topic. All projects are practical ("experimental") and involve designing and implementing a learning system.

**Project proposal:** One or two paragraphs specifying the problem you are focusing on, the learning system(s) that you are going to apply, any modifications or improvements that you are considering implementing, and the means by which you are going to evaluate your learner (using a benchmark, a validation technique, etc.). The goal of the proposal is to make sure that you have chosen a feasible project and that you address the important issues. Project proposals are due April 6.

**Final project:** Due **May 14**. Information on what should be in the final report is available here.

- The UC-Irvine ML Dataset Archive | The UC-Irvine KDD archive | more datasets
- ML Projects
- The WEKA Machine Learning Project (code for many ML algorithms, as well as some datasets)
- Really Cool AI Demos

- Journals
  - Journal of ML Research | Data Mining and Knowledge Discovery | Journal of AI Research | IEEE Neural Networks Council (several journals are connected to this page)

- Knowledge Discovery in Databases
- Other University ML classes
  - Wisc Madison | More External AI References

- Pointers to ML Courses
- Neural Network Resources
- Some ILP Stuff
- Some SVM Stuff
- Machine Learning Benchmarking
- AI Bibliography Server | Neural Networks Bibliography Server (Austrian AI Institute)
- AI Resources (Canadian NRC Server)
- Aha's ML Links
- Stuart Russell's AI on the Web (loads of links)
- Reinforcement Learning Repository