- About
- Events
- Calendar
- Graduation Information
- Cornell Tech Colloquium
- Student Colloquium
- BOOM
- Fall 2023 Colloquium
- Conway-Walker Lecture Series
- Salton 2023 Lecture Series
- Seminars / Lectures
- Big Red Hacks
- Cornell University High School Programming Contests 2023
- Game Design Initiative
- CSMore: The Rising Sophomore Summer Program in Computer Science
- Explore CS Research
- ACSU Research Night
- Cornell Junior Theorists' Workshop
- People
- Courses
- Research
- Undergraduate
- M Eng
- MS
- PhD
- Admissions
- Current Students
- Computer Science Graduate Office Hours
- Business Card Policy
- Cornell Tech
- Curricular Practical Training
- Exam Scheduling Guidelines
- Fellowship Opportunities
- Field of Computer Science Ph.D. Student Handbook
- Graduate TA Handbook
- Field A Exam Summary Form
- Graduate School Forms
- Instructor / TA Application
- Ph.D. Requirements
- Ph.D. Student Financial Support
- Special Committee Selection
- Travel Funding Opportunities
- The Outside Minor Requirement
- Diversity and Inclusion
- Graduation Information
- CS Graduate Minor
- Outreach Opportunities
- Parental Accommodation Policy
- Special Masters
- Student Spotlights
- Contact PhD Office
Efficient Learning of Latent Variable Models
Abstract: We formulate a new geometric problem - of identifying a simplex K given data points obtained by perturbing latent points in K. It is a common generalization of several latent variable models including Topic Modeling (LDA), Mixed Membership Block models (MMBM), non-negative matrix factorization and others. We show that K is well approximated by a data-determined polytope K which admits a polynomial time optimization oracle. We then prove that successive vertices of K can be found by optimizing carefully chosen linear functions over K. The algorithm is simply stated and has running time matching the best previous algorithms. The proof of correctness involves new and existing tools from Numerical Analysis, Random Projections and convex geometry. In the last part of the talk, we describe how related techniques can be used to derive near-optimal sample complexity for LDA and MMBM.
Joint Work with Chiranjib Bhattacharyya.