Computational Threshold Phenomena for Average-Case Problems in Statistics, Machine Learning, and Combinatorial Optimization

STOC 2018 Workshop. June 29, 2018. Los Angeles, CA.

Speakers: Florent Krzakala, Nike Sun, Jacob Steinhardt, Sam Hopkins

Schedule (abstracts below)

1:00 -- 1:45: Florent Krzakala: Statistical-vs-computational gaps for a selection of problems

1:45 -- 2:30: Nike Sun: Phase transitions in random CSPs

2:30 -- 2:40: (coffee break)

2:40 -- 3:25: Jacob Steinhardt: Complexity of learning in partially specified and semirandom models

3:25 -- 4:00: Sam Hopkins: Sum of squares proofs, bounded (almost) independence, and thresholds for planted problems

Abstracts

Florent Krzakala

I will discuss approximate message passing and information-theoretic thresholds for matrix/tensor factorization and for learning two-layer neural networks.
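
As a concrete illustration of the algorithmic side, here is a minimal Python sketch (mine, not from the talk) of approximate message passing for the rank-one spiked Wigner model Y = (lambda/n) x x^T + W/sqrt(n) with a +/-1 signal; the model scaling, parameter values, and function names are illustrative choices. Above lambda = 1 the estimate's overlap with the planted signal becomes macroscopic; below it the overlap stays near zero.

```python
# Illustrative sketch: AMP for rank-one estimation in the spiked Wigner model.
import numpy as np

def spiked_wigner(n, lam, rng):
    """Y = (lam/n) x x^T + W/sqrt(n), x_i in {-1, +1}, W symmetric Gaussian."""
    x = rng.choice([-1.0, 1.0], size=n)
    W = rng.standard_normal((n, n))
    W = (W + W.T) / np.sqrt(2.0)                  # symmetric noise, unit variance off-diagonal
    return x, (lam / n) * np.outer(x, x) + W / np.sqrt(n)

def amp(Y, lam, iters=50, rng=None):
    """AMP with a tanh denoiser (Bayes posterior mean for a +/-1 prior) and Onsager correction."""
    n = Y.shape[0]
    rng = rng or np.random.default_rng()
    m = rng.standard_normal(n) / np.sqrt(n)       # weak random initialization
    xhat_old = np.zeros(n)
    for _ in range(iters):
        xhat = np.tanh(lam * m)                   # coordinate-wise denoiser
        onsager = lam * np.mean(1.0 - xhat ** 2)  # (1/n) * sum_i f'(m_i)
        m = Y @ xhat - onsager * xhat_old
        xhat_old = xhat
    return np.tanh(lam * m)

rng = np.random.default_rng(0)
n = 2000
for lam in [0.5, 1.5, 3.0]:
    x, Y = spiked_wigner(n, lam, rng)
    overlap = abs(amp(Y, lam, rng=rng) @ x) / n
    print(f"lambda = {lam}: overlap with planted signal ~ {overlap:.2f}")
```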

Nike Sun

Phase transitions in random CSPs

Constraint satisfaction problems in the setting of sparse random graphs are conjectured by physicists to exhibit a range of interesting phenomena. I will describe the physics predictions, survey some recent progress in rigorous results for specific models, and mention some open problems. This talk is based on joint works with Jian Ding, Allan Sly, and Yumeng Zhang.
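
As a toy illustration (not part of the talk), the sketch below estimates the probability that a random 3-SAT formula is satisfiable at several clause densities, by brute force on small instances; the instance sizes and densities are arbitrary choices. Even at 15 variables the transition is visible near the conjectured 3-SAT threshold of approximately 4.267 (the exact threshold is known rigorously for large clause size k by Ding-Sly-Sun).

```python
# Illustrative sketch: the satisfiability phase transition in random 3-SAT, small n.
import numpy as np

def random_3sat(n_vars, n_clauses, rng):
    """Each clause has 3 distinct variables; True marks a positive literal."""
    variables = np.array([rng.choice(n_vars, size=3, replace=False) for _ in range(n_clauses)])
    signs = rng.random((n_clauses, 3)) < 0.5
    return variables, signs

def is_satisfiable(n_vars, variables, signs):
    """Brute-force check over all 2^n_vars assignments (vectorized; small n_vars only)."""
    assign = ((np.arange(2 ** n_vars)[:, None] >> np.arange(n_vars)) & 1).astype(bool)
    alive = np.ones(2 ** n_vars, dtype=bool)
    for vs, ss in zip(variables, signs):
        alive &= (assign[:, vs] == ss).any(axis=1)  # clause needs one matching literal
    return bool(alive.any())

rng = np.random.default_rng(0)
n_vars, trials = 15, 20
for alpha in [3.0, 4.0, 4.27, 5.0, 6.0]:
    n_clauses = int(round(alpha * n_vars))
    sat = sum(is_satisfiable(n_vars, *random_3sat(n_vars, n_clauses, rng)) for _ in range(trials))
    print(f"alpha = {alpha}: estimated P(satisfiable) = {sat / trials:.2f}")
```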

Jacob Steinhardt

Complexity of learning in partially specified and semirandom models

In order to talk about average-case complexity, we need to specify a probability distribution over inputs to an algorithm. Relative to worst-case complexity, this is unsatisfying: any given problem of interest is unlikely to exactly match the specified distribution, and many learning algorithms are not robust to violations of their assumptions. At the same time, reductions for average-case hardness often fail to preserve distributional structure.

Partial specification and semi-random models are two tools for studying average-case complexity in a more robust manner. In both cases, distributional assumptions are made about some aspect of a problem, while the remaining aspects are allowed to be worst-case. This creates an intriguing middle ground that on the one hand encourages the design of more robust algorithms, and on the other hand opens up the possibility of more satisfying complexity-theoretic reductions for proving hardness.

In this talk I will survey instances of partial specification and semi-randomness in robust estimation, mixture modeling, and graph clustering. The focus will mostly be on open questions, but we will also provide a few answers---for instance, we will see that the sum-of-squares hierarchy has appealing average-case properties on any distribution satisfying an isoperimetric inequality, and we will also draw links between the Kesten-Stigum threshold and information-theoretic hardness in semi-random graph clustering.
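
As a small worked example (mine, not from the talk), the snippet below checks the Kesten-Stigum condition for the two-community stochastic block model with within- and across-community edge probabilities a/n and b/n: in this two-community case, efficient detection is known to be possible exactly when (a - b)^2 > 2(a + b).

```python
# Illustrative sketch: the Kesten-Stigum condition for the two-community block model.
def kesten_stigum_snr(a, b):
    """Signal-to-noise ratio (a - b)^2 / (2 (a + b)); detection is easy iff this exceeds 1."""
    return (a - b) ** 2 / (2.0 * (a + b))

for a, b in [(5.0, 1.0), (3.5, 2.5), (10.0, 4.0)]:
    snr = kesten_stigum_snr(a, b)
    print(f"a = {a}, b = {b}: SNR = {snr:.2f} -> {'above' if snr > 1 else 'below'} Kesten-Stigum")
```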

Sam Hopkins

Sum of squares proofs, bounded (almost) independence, and thresholds for planted problems

There has been a great deal of recent interest in using the Sum of Squares method to understand computationally challenging statistical inference problems. I will briefly survey how the method has been used. Then I will focus on an intriguing question suggested by recent Sum of Squares lower bounds for problems such as planted clique and sparse principal component analysis: does polylogarithmic-wise independence fool polynomial-time algorithms? I will describe some recent evidence for this conjecture (specialized to polynomial-time algorithms from the Sum of Squares family), from joint work with Kothari, Potechin, Raghavendra, Schramm, and Steurer.
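
To make the sqrt(n) threshold concrete, here is an illustrative sketch (not from the talk) comparing the top eigenvalue of the signed adjacency matrix of G(n, 1/2) with and without a planted clique; the parameters are arbitrary. The null top eigenvalue concentrates near 2 sqrt(n), while a planted clique of size well above sqrt(n) creates an outlier of size roughly the clique size; this simple spectral distinguisher is essentially degree-2 Sum of Squares, and the lower bounds mentioned above suggest that constant-degree Sum of Squares cannot do much better.

```python
# Illustrative sketch: spectral detection of a planted clique in G(n, 1/2).
import numpy as np

def signed_adjacency(n, clique_size, rng):
    """G(n, 1/2) as a symmetric +/-1 matrix (zero diagonal), clique planted on the first vertices."""
    A = rng.choice([-1.0, 1.0], size=(n, n))
    A = np.triu(A, 1)
    A = A + A.T
    if clique_size > 0:
        A[:clique_size, :clique_size] = 1.0
        np.fill_diagonal(A[:clique_size, :clique_size], 0.0)
    return A

rng = np.random.default_rng(0)
n = 1500
for k in [0, int(np.sqrt(n)), 4 * int(np.sqrt(n))]:
    top = np.linalg.eigvalsh(signed_adjacency(n, k, rng))[-1]
    print(f"clique size {k:3d}: top eigenvalue = {top:6.1f}   (2*sqrt(n) = {2 * np.sqrt(n):.1f})")
```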

Organizers: Sam Hopkins, Andrea Montanari, David Steurer

Questions? Send an email to Sam, via samhop at cs dot cornell dot edu.