Testing Noisy Linear Equations for Sparsity
Abstract: Consider the following basic problem in sparse linear regression: an algorithm gets labeled samples of the form (x, \langle w, x \rangle + \eps), where w is an unknown n-dimensional vector, x is drawn from a background distribution D, and \eps is independent noise. Given the promise that w is k-sparse, the breakthrough work of Candes, Romberg and Tao (2005) shows that w can be recovered with sample and time complexity scaling as O(k log n). This should be contrasted with general linear regression, where O(n) samples are information-theoretically necessary.
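To make the sample model concrete, here is a minimal sketch (purely illustrative, not code from the talk or the paper) that draws labeled pairs (x, \langle w, x \rangle + \eps) for a hidden k-sparse vector w, with an i.i.d. Gaussian choice of D and noise standing in for the unspecified distributions:

```python
import numpy as np

def make_sparse_regression_samples(n=1000, k=10, m=200, noise_std=0.1, seed=0):
    """Draw m samples (x, y) with y = <w, x> + eps for a hidden k-sparse w."""
    rng = np.random.default_rng(seed)
    w = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)    # k nonzero coordinates
    w[support] = rng.standard_normal(k)
    X = rng.standard_normal((m, n))                   # x ~ D, i.i.d. coordinates (illustrative choice)
    y = X @ w + noise_std * rng.standard_normal(m)    # independent noise eps
    return X, y, w
```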
In this talk, we look at this question from the vantage point of property testing and study its decision variant: namely, what is the complexity of deciding whether the unknown vector w is k-sparse (or, say, at least 0.01-far from k-sparse in \ell_2 distance)? We show that the decision version of the problem can be solved with a number of samples independent of n, as long as the background distribution D is i.i.d. across coordinates and non-Gaussian. We further show that weakening any of these conditions necessarily makes the complexity scale as log n (thus showing that our results are tight).
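The "far from k-sparse in \ell_2 distance" notion above can be made explicit: the \ell_2 distance from w to the nearest k-sparse vector is the norm of w outside its k largest-magnitude coordinates. A small helper (again only an illustration of the definition, not part of the tester from the paper):

```python
import numpy as np

def dist_to_k_sparse(w, k):
    """\ell_2 distance from w to the set of k-sparse vectors."""
    mags = np.sort(np.abs(w))[::-1]          # magnitudes in descending order
    return float(np.linalg.norm(mags[k:]))   # mass outside the k largest entries

# Example: a vector with three comparable entries is far from 2-sparse.
w = np.array([1.0, 1.0, 1.0, 0.0])
print(dist_to_k_sparse(w, k=2))              # 1.0
```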
Joint work with Xue Chen (Northwestern) and Rocco Servedio (Columbia).
Bio: Anindya De is an Assistant Professor at the University of Pennsylvania. Prior to Penn, he spent three years as an Assistant Professor at Northwestern University. He received his PhD from UC Berkeley in 2013, advised by Luca Trevisan, and was a Simons Research Fellow and a postdoctoral fellow at IAS and DIMACS. Anindya is interested in complexity theory, learning theory, and the harmonic analysis of Boolean functions.