Symmetric distribution properties such as support size, support coverage, entropy, and proximity to uniformity, arise in many applications. Specialized estimators and analysis tools were recently used to derive sample-optimal estimators for each of these properties. We show that a single, simple, plug-in estimator—profile maximum likelihood (PML)—is sample competitive for all symmetric properties, and in particular is asymptotically sample-optimal for all the properties above.
Our technical results include:
- A bound on the performance of general Maximum Likelihood Estimation as a function of the underlying domain size.
- Improved estimators for various symmetric properties with sharp phase transitions in the error probability.
Our results on symmetric properties follow from combining the above two results with Hardy-Ramanujan's bounds on partition numbers.
We will conclude with a number of open directions, both computational and statistical!
Joint work with Hirakendu Das, Alon Orlitsky, and Ananda Theertha Suresh.