Overview

Since 2023, Cornell CS has hosted an annual invited gathering of the world's strongest junior theoretical computer scientists.

[Register Here]

Speakers


Guy Blanc (Stanford)	Uma Girish (Princeton)
Max Hopkins (UCSD)	Rahul Ilango (MIT)
Bingbin Liu (CMU)	Peter Manohar (CMU)

Participation

Registration is free of charge. Please register here. All talks will be broadcast over Zoom.

Schedule

Thursday, May 9

Time	Event	Location
10am	Breakfast	Outside Gates G01
10:30am – 11:30am	Talk: Max Hopkins	Gates G01
11:30am	Break	Gates G01
11:45am – 12:45pm	Talk: Peter Manohar	Gates G01
1pm – 2pm	Group Lunch	Gates 122
2:15pm – 3:15pm	Talk: Uma Girish	Gates G01
3:30pm – 4:45pm	Social	Gates Plaza

Friday, May 10

Time	Event	Location
10am	Breakfast	Outside Gates G01
10:30am – 11:30am	Talk: Rahul Ilango	Gates G01
11:30am	Break	Gates G01
11:45am – 12:45pm	Talk: Bingbin Liu	Gates G01
1pm – 2pm	Lunch with students	Collegetown
2:15pm – 3:15pm	Talk: Guy Blanc	Gates G01
3:30pm – 5pm	One on ones w/faculty	Faculty Offices and G15 / G203

Location

Talk Details

GUY BLANC

Robust data analysis and the magic of subsampling

Abstract: Classic models of data analysis assume the data is drawn independently from the distribution of interest, but the real world is rarely so kind. Therefore, we desire robust algorithms to deviations from these classic assumptions and have correspondingly developed many models for how an algorithm can be "robust." This leads to a practical challenge - with many distinct models of robustness, it's difficult for the practitioner to determine which is most appropriate for their data. We make headway towards this challenge in two ways. First, we show that one technique, that of subsampling, leads to robustness in two very distinct settings. This suggests that even if the analyst does not exactly know which model is most appropriate for their setting, they should try subsampling. Second, we show that many distinct models of robustness are equivalent. This alleviates the challenge of deciding which model of robustness is most appropriate by greatly reducing the number of truly unique models. Based on joint works with Jane Lange, Ali Malik, Li-Yang Tan, and Greg Valiant.

UMA GIRISH

Quantum versus Classical Communication and Query Complexity

Abstract: Understanding the relative power of quantum versus classical computation is a fascinating and important problem in modern science on which there is likely to be much research and progress in the next few decades. While quantum algorithms are widely believed to exponentially outperform classical algorithms, there are very few settings in which this has been unconditionally proved. Communication complexity and query complexity are striking examples of such settings, where we can mathematically prove that quantum algorithms exponentially outperform classical algorithms. In this talk we will study the advantages of quantum over classical in these settings and present new and improved quantum speedups over classical. Among several results, we achieve state-of-the-art quantum speedups in communication complexity, using quantum protocols that are efficiently implementable, and using noisy quantum protocols where all qubits except for one are maximally noisy. We also study the power of entanglement and demonstrate in various ways that a slight increase in the entanglement can provide exponential savings in the communication complexity. We also study the power of adaptive queries in quantum query complexity and show that each successive round of quantum queries can yield super-exponential speedups in the number of quantum queries.

MAX HOPKINS

Chernoff Bounds on HDX and their Applications

Abstract: Recent years have seen the emergence of high dimensional expanders (HDX) as a powerful tool in theoretical computer science, with breakthrough applications in approximate sampling and property testing. A central force behind the success of HDX in application is their concentration of measure. Consider the following basic question: given a k-uniform hypergraph X and a subset A of its vertices, how likely is a random hyperedge to `see’ A in approximately the right proportion? When X is the complete hypergraph, this question was classically resolved by Chernoff and Hoeffding, who showed X has a subgaussian tail. In 2017, Dinur and Kaufman introduced spectral HDX and proved they satisfy Chebyshev-type (inverse-polynomial) concentration not only for this basic setting, but critically for the more general case when A sits on i-faces of the complex X. This led to a breakthrough line of work constructing sparse agreement testers among other applications, but left an exponential gap with known bounds for the complete hypergraph.

In this talk, we will revisit the basic notion of high dimensional expanders and prove they satisfy optimal concentration of measure, matching Chernoff-Hoeffding for the complete hypergraph. Leveraging this fact, we will further prove HDX are reverse hypercontractive, a powerful analytic inequality from boolean function analysis used in the construction of (combinatorial) PCPs with optimal soundness. In the second half of the talk, we overview several applications of concentration and reverse hypercontractivity on HDX, including new agreement tests in the list-decoding regime, the first explicit bounded-degree hypergraphs with optimal geometric overlap, and new degree lower bounds for HDX.

Based on joint work with Yotam Dikstein.

RAHUL ILANGO

Tales of Obfuscation: Spaghetti Code Meets Derandomization, Differential Privacy, Logic, and Metacomplexity

Abstract: A recent breakthrough in cryptography is the construction of "program obfuscators," which "scramble" code so that the behavior of an algorithm is unintelligible in some formal sense. In this talk, I will discuss this notion and overview three recent results that use obfuscation to answer open questions in various areas of theoretical computer science, including mathematical logic, derandomization, average-case complexity, and differential privacy. This talk will focus on giving a brief description of these results and connections, emphasizing intuition and ideas. No background in cryptography or complexity is required. This talk is based on joint works with Badih Ghazi, Yizhi Huang, Pritish Kamath, Ravi Kumar, Jiatu Li, Pasin Manurangsi, Hanlin Ren, and Ryan Williams.

BINGBIN LIU

Guiding machine learning design with insights from simple testbeds

Abstract: Machine learning systems are becoming increasingly powerful and complex, which poses challenges for performing diagnostics and making targeted improvements. In this talk, we will see how simple testbeds amenable to theoretical analyses can yield insights to guide practical advancements.

We will look at two language model-inspired setups. The first is masked prediction, inspired by the BERT model, where we investigate the impact of the masking rate. We study this through a parameter identifiability lens, assuming data from a hidden Markov model and revealing connection to the uniqueness of tensor rank decompositions. The second is algorithmic reasoning, which, despite being sequential in nature, has been successfully modeled by parallel architectures such as Transformers. By representing these reasoning tasks with finite-state automata and using tools from Krohn-Rhodes theory and formal languages, we show how Transformers can simulate reasoning using far fewer layers than the number of sequential reasoning steps, and discuss the gap between these theoretical constructions and practical considerations such as interpretability and generalization.

PETER MANOHAR

New Spectral Techniques in Algorithms and Coding Theory: the Kikuchi Matrix Method

Abstract: In this talk, I will present a recently developed method to solve combinatorial problems by (1) reducing them to establishing the unsatisfiability of systems of k-sparse linear equations mod 2 with a limited amount of randomness, and then (2) showing unsatisfiability of these equations by analyzing the spectral properties of appropriately chosen induced subgraphs of Cayley graphs on the hypercube (and related variants) called Kikuchi graphs.

We will discuss the following applications in this talk:

Designing an algorithm for refuting/solving semirandom and smoothed instances of constraint satisfaction problems (obtained by applying a small random perturbation to the literal patterns in a worst-case instance) that matches the best running-time vs constraint-density trade-offs for the special and significantly easier case of random CSPs;
Proving a cubic lower bound on the length of 3-query locally decodable codes, improving on the quadratic lower bound of Kerenidis and de Wolf (2004);
Proving an exponential lower bound on the length of linear 3-query locally correctable codes, improving on the cubic lower bound above;
Resolving Feige's conjecture from 2008, an extremal combinatorics conjecture about the girth vs. density trade-off for hypergraphs that generalizes the Moore bound for graphs.