Abstract: Recent successes of deep neural networks in a large number of domains have spurred a renewed interest in both the theory and applications of these models. However, training and inference in such models at massive scale remain extremely challenging. In this talk, I will highlight a number of challenges related to both speed and quality in problems containing billions of outputs, which drive real-world relevance search and recommendation systems. I will describe advancements in fast matrix-vector products via structured matrices, provably convergent adaptive non-convex optimization, and the design of appropriate loss functions, making robust massive-scale learning feasible.
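To give a concrete flavor of the structured-matrix idea, below is a minimal illustrative sketch (not the method presented in the talk) using a circulant matrix, one simple structured family: its matrix-vector product can be computed in O(n log n) with the FFT instead of the O(n^2) cost of an explicit dense product. The helper name circulant_matvec is hypothetical; only standard NumPy routines are used.

    import numpy as np

    def circulant_matvec(c, x):
        """Compute C @ x, where C is the circulant matrix whose first column is c.

        A circulant matrix is diagonalized by the DFT, so the product reduces to
        a circular convolution: C @ x = ifft(fft(c) * fft(x)).  Cost: O(n log n).
        """
        return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

    # Sanity check against the explicit O(n^2) dense product.
    rng = np.random.default_rng(0)
    n = 512
    c = rng.standard_normal(n)
    x = rng.standard_normal(n)
    C = np.stack([np.roll(c, j) for j in range(n)], axis=1)  # column j is c shifted by j
    assert np.allclose(circulant_matvec(c, x), C @ x)

Beyond the speedup, such a matrix needs only O(n) parameters instead of O(n^2), which is what makes structured layers attractive when the output space has billions of entries.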

Bio: Sanjiv Kumar is a Distinguished Scientist at Google Research, NY, where he currently leads research in the theory and applications of large-scale machine learning. His research interests include massive-scale deep learning, fast training and inference in large output spaces, distributed and privacy-preserving learning, and data-dependent hashing. His work on the convergence properties of Adam received the Best Paper Award at ICLR 2018. He was an adjunct faculty member at Columbia University, where he taught a new course on large-scale machine learning. He is currently serving as an Action Editor of JMLR. Sanjiv holds a PhD from the School of Computer Science, Carnegie Mellon University.