A Learning Theory Approach to Non-Interactive Database Privacy



Katrina Ligett
Carnegie Mellon University, Department of Computer Science

Monday, September 1, 2008
4:00 PM, 5130 Upson Hall



Organizations frequently have databases of sensitive information that they would like to release to researchers or to the public, but legal or moral privacy constraints may prevent them from doing so. In this work we examine when it is possible to release information that preserves a large number of properties of the original data while satisfying a formal definition of privacy ("differential privacy").
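As a point of reference for the privacy notion above, the following is a minimal sketch of the standard Laplace mechanism for differential privacy (this is the usual textbook mechanism, not the algorithm presented in the talk; the database and query are hypothetical examples):

```python
# Minimal sketch of the Laplace mechanism for differential privacy.
# Standard technique; the database and query are illustrative only.
import random

def laplace_noise(scale: float) -> float:
    """Sample from Laplace(0, scale) as a difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(db, predicate, epsilon: float) -> float:
    """Answer a count query with epsilon-differential privacy.

    A count query has sensitivity 1: adding or removing one record
    changes the true answer by at most 1, so Laplace noise of scale
    1/epsilon suffices for epsilon-differential privacy.
    """
    true_count = sum(1 for row in db if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical database of ages; query: how many people are over 40?
db = [23, 45, 67, 31, 52, 48, 19, 60]
noisy_answer = private_count(db, lambda age: age > 40, epsilon=1.0)
```

The noisy answer is centered at the true count (here, 5), with noise whose scale grows as the privacy parameter epsilon shrinks.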

We demonstrate that, ignoring computational constraints, it is possible to release privacy-preserving databases that are useful for all queries over a discretized domain from any concept class with polynomial VC-dimension. We also show a new lower bound for releasing databases that are useful for halfspace queries over a continuous domain. Despite this lower bound, we give a privacy-preserving polynomial-time algorithm that releases information useful for all halfspace queries, under a slightly relaxed definition of usefulness.

These results highlight a number of interesting open questions in efficient non-interactive data privacy. Time permitting, I will also mention some related work in progress.

Joint work with Avrim Blum and Aaron Roth.