Speaker: Michael Kearns
Affiliation: AT&T Labs
Date: 11/4/99 - Thursday
Time & Location: 4:15PM, 101 Phillips
Title: Sparse Sampling Methods for Learning and Planning
In the artificial intelligence community, Markov decision processes (MDPs) have become a popular model for planning and learning how to act in uncertain environments. While the basic algorithms and their convergence properties have been known for some time, a number of challenges remain, including the problems of exploration vs. exploitation, large state spaces, and partial observability.
In this talk, I will survey a series of recent results that apply methods of sparse sampling and uniform convergence to these problems.
The results include:
* A polynomial-time algorithm for the exploration-exploitation problem in MDPs that incorporates the mixing time of the optimal solution;
* The first finite-sample convergence rates for Q-learning;
* An on-line algorithm for near-optimal planning in MDPs whose per-step complexity has no dependence on the number of states;
* Algorithms for choosing the best strategy from a given class on the basis of only a small number of trajectories in a partially observable MDP.
Our methods highlight the strong connections between problems in standard supervised learning and reinforcement learning.
This talk describes joint work with Satinder Singh, Yishay Mansour, Andrew Ng, and Daphne Koller.