Abstract: Recent successes of deep neural networks in a large number of domains have spurred renewed interest in both the theory and applications of these models. However, training and inference in such models at massive scale remain extremely challenging. In this talk, I will highlight a number of challenges related to both speed and quality in problems containing billions of outputs, which drive real-world relevance search and recommendation systems. I will describe advancements in fast matrix-vector products via structured matrices, provably convergent adaptive non-convex optimization, and the design of appropriate loss functions, which together make robust massive-scale learning feasible.
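As a minimal illustration of the structured-matrix idea mentioned in the abstract (not drawn from the talk itself, and the structured families used in the speaker's work may differ), the sketch below multiplies a circulant matrix by a vector in O(n log n) via the FFT instead of O(n^2) with a dense product:

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by the vector x.

    A circulant matrix-vector product is a circular convolution, so it can be
    computed with FFTs in O(n log n) rather than forming the dense n x n matrix.
    """
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

# Sanity check against the explicit dense circulant matrix (illustrative only).
rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)
x = rng.standard_normal(n)
C = np.column_stack([np.roll(c, j) for j in range(n)])  # dense circulant matrix
assert np.allclose(C @ x, circulant_matvec(c, x))
```

The same principle underlies many structured parameterizations: restricting a weight matrix to a structured family trades a small amount of expressiveness for products that scale near-linearly in the dimension, which matters when the output space has billions of entries.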
Bio: Sanjiv Kumar is a Distinguished Scientist at Google Research, NY, where he currently leads research in the theory and applications of large-scale machine learning. His research interests include massive-scale deep learning, fast training and inference in large output spaces, distributed and privacy-preserving learning, and data-dependent hashing. His work on the convergence properties of Adam received the Best Paper Award at ICLR 2018. He was an adjunct faculty member at Columbia University, where he taught a new course on large-scale machine learning. He is currently serving as an Action Editor of JMLR. Sanjiv holds a PhD from the School of Computer Science at Carnegie Mellon University.