Spring 2007 Topic: Learning, Games, and Electronic Markets

Overview

In many contemporary applications of algorithms, it is necessary to design algorithms which make better and better decisions as information is revealed to them over time. The central theme of this course will be a family of algorithms for online learning and prediction, and the applications of these algorithms to game theory and the design of electronic markets. The emphasis will be on mathematical techniques for designing, analyzing, and applying these algorithms. Topics will include algorithms for predicting from expert advice, adaptive game playing, Nash equilibrium and correlated equilibrium, evolutionary game theory, multi-armed bandit problems, Markov decision processes, optimal stopping, pricing and auction theory, universal portfolios, collaborative learning and reputation mechanisms.

## InstructorRobert Kleinberg4138 Upson Phone: 255-9200 |
## Time and PlaceMWF 2:30-3:20165 Olin Hall |
## Office HoursWeds. 3:30-4:30Thurs. 3:00-4:00 ...or by appointment. |

Prerequisites

Familiarity with basic notions from algorithms and elementary probability theory, and competence with reading and writing mathematical proofs. If you have not taken at least one class in algorithms (at the level of CS 482) and at least one class in probability (at the level of Math 471) you should seek permission from the instructor.

Announcements

- Useful readings for weeks 13, 14:
- Near-optimal online auctions, A. Blum and J. Hartline, SODA 2005.
- The value of knowing a demand curve: bounds for regret for online posted-price auctions, R. Kleinberg and T. Leighton, FOCS 2003.
- Approximation algorithms going online, S. Kakade, A. Kalai, and K. Liggett, STOC 2007.

- Lecture notes for week 9 are now online.
- Homework #4 is now online. The homework is due at 17:00 on Monday, April 23.
- Useful readings for weeks 11, 12:
- Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches, B. Awerbuch and R. Kleinberg, STOC 2004.
- Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary , H. B. McMahan and A. Blum. Proc. 17th Annual Conference on Learning Theory (COLT 2004).
- Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary , V. Dani and T. P. Hayes. Proc. 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2006), 937-943.
- Anytime algorithms for multi-armed bandit problems, R. Kleinberg. Proc. 17th ACM-SIAM Symposium on Discrete Algorithms (SODA 2006), pp. 928-936.
- Competitive collaborative learning, B. Awerbuch and R. Kleinberg. Proc. 18th Annual Conference on Learning Theory (COLT 2005), pp. 233-248.

- Useful reading for week 10:
- On Playing Golf with Two Golf Balls, I. Dumitriu, P. Tetali, and P. Winkler, SIAM J. Discrete Math. 16 (2003), 604-615.

- Homework #3 is now online. The homework is due at 17:00 on April 6.
- Lecture notes for week 8 are now online.
- Useful readings for week 9:
- For basic facts about Kullback-Leibler divergence (a.k.a.
relative entropy), one of the best sources is Cover and Thomas's
classic textbook
*Elements of Information Theory*(sections 2.5 and 2.6, and Lemma 11.6.1). - The lower bound proof for multi-armed bandits given in class
parallels the argument given in Section 6.9 of
*Prediction, Learning, and Games*by Cesa-Bianchi and Lugosi. The original source for this proof is The non-stochastic multi-armed bandit problem, P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire, SIAM J. Computing 32(1):48-77. - For the Gittins index, recommended readings are On Playing Golf with Two Golf Balls, I. Dumitriu, P. Tetali, and P. Winkler, SIAM J. Discrete Math. 16 (2003), 604-615, and On the Gittins Index for Multiarmed Bandits, R. Weber, Annals of Applied Probability 2(4):1024-1033, 1992.

- The non-stochastic multi-armed bandit problem, P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire, SIAM J. Computing 32(1):48-77.
- Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches, B. Awerbuch and R. Kleinberg, STOC 2004.
- Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary, V. Dani and T. Hayes, SODA 2006.
- Chapter 6 (especially sections 6.4-6.7) of
*Prediction, Learning, and Games*(Cesa-Bianchi and Lugosi).

Suggested Reading

There is no required textbook for the class, but there are
two recommended texts:

*Prediction, Learning, and Games*, by Cesa-Bianchi and Lugosi*The Theory of Learning in Games*, by Fudenberg and Levine