Abstract:
Probabilistic models remain a hugely popular class of techniques in modern machine learning, and their expressiveness has been extended by modern large-scale compute. While exciting, these generalizations almost always come with approximations, and researchers typically ignore the fundamental influence that those computational approximations have on the results. As a consequence, results from modern probabilistic methods become as much about the approximation method as they are about the data and the model, undermining both the Bayesian principle and the practical utility of inference in probabilistic models.
  
To expose this issue and to demonstrate how to do approximate inference correctly, this talk will focus on the Gaussian Process (GP) class of models. I will derive a new class of GP approximations that provide consistent estimation of the combined posterior arising from both the finite number of data observed *and* the finite amount of computation expended. The most common GP approximations, such as methods based on the Cholesky factorization, conjugate gradients, and inducing points, each map to an instance in this class. I will show the consequences of ignoring computational uncertainty, prove that implicitly modeling it improves generalization performance, and point to extensions of computational uncertainty beyond Gaussian Processes.

Bio:
John Cunningham is a Professor of Statistics at Columbia University and an investigator at the Zuckerman Mind Brain Behavior Institute and Center for Theoretical Neuroscience. Prior to Columbia he was a research fellow in the Machine Learning Group at Cambridge. He has a Ph.D. in electrical engineering from Stanford and an undergraduate degree in computer science from Dartmouth. His research interests include machine learning, in particular the combination of probabilistic models with deep learning and modern large-scale computation, and its application to science and industry, in particular using the tools of artificial intelligence to understand biological intelligence and other complex processes.