Fast Decision Tree Learning Solves Hard Coding-Theoretic Problems (via Zoom)
Abstract: We connect the problem of properly PAC learning decision trees to the parameterized Nearest Codeword Problem (k-NCP). Despite significant effort by the respective communities, algorithmic progress on both problems has been stuck: the fastest known algorithm for the former runs in quasipolynomial time (Ehrenfeucht and Haussler 1989), and the best known approximation ratio for the latter is O(n/log n) (Berman and Karpinski 2002; Alon, Panigrahy, and Yekhanin 2009). Research on the two problems has thus far proceeded independently, with no known connections.
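For context, a standard formulation of the Nearest Codeword Problem is the following (the notation A, y, x below is illustrative and not taken from the talk):

\[
\text{NCP: given } A \in \mathbb{F}_2^{n \times m} \text{ and a target } y \in \mathbb{F}_2^{n}, \text{ find } x \in \mathbb{F}_2^{m} \text{ minimizing the Hamming distance } \|Ax - y\|_0.
\]

In the parameterized version k-NCP, the optimal distance k is the parameter, and a c-approximation algorithm must output a codeword within distance c·k of y.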
We show that any improvement on Ehrenfeucht and Haussler’s algorithm would yield an O(log n)-approximation algorithm for k-NCP, an exponential improvement over the current state of the art. This can be interpreted either as a new avenue for designing algorithms for k-NCP, or as one for establishing the optimality of Ehrenfeucht and Haussler’s algorithm. Furthermore, our reduction, together with existing constant-factor inapproximability results for k-NCP (Manurangsi 2020; Li, Lin, and Liu 2024), already rules out polynomial-time algorithms for properly learning decision trees. A notable aspect of our reduction is that it holds even in the setting of weak learning, where the learner need only achieve accuracy slightly better than random guessing, whereas prior results were limited to the setting of strong learning.
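To unpack "exponential improvement" (a standard reading, not a quote from the talk): the approximation ratio would drop from roughly 2^{log n} to 2^{log log n}, i.e. the exponent itself shrinks exponentially:

\[
\frac{n}{\log n} = 2^{\log n - \log\log n} \qquad \text{versus} \qquad \log n = 2^{\log\log n}.
\]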
Joint work with Carmen Strassle and Li-Yang Tan.
Bio: Caleb is a fourth-year PhD student at Stanford advised by Li-Yang Tan. He is broadly interested in problems at the intersection of complexity theory and learning theory.