A Learning Theory Approach to Non-Interactive Database Privacy
Katrina Ligett
Carnegie Mellon University, Department of Computer Science
Abstract:
Organizations frequently have databases of sensitive information that
they would like to release to researchers or to the public, but legal or
moral privacy constraints may prevent them from doing so. In this work
we examine when it is possible to release information that preserves
many properties of the original data, while satisfying a formal
definition of privacy ("differential privacy").
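As a point of reference for the privacy definition, the standard Laplace mechanism (not the algorithm of this work) answers a single count query with epsilon-differential privacy by adding noise calibrated to the query's sensitivity. The sketch below is illustrative; the names `private_count`, `predicate`, and `epsilon` are ours, not the paper's.

```python
import math
import random

def laplace_noise(scale):
    # Sample from Laplace(0, scale) via the inverse CDF, with u
    # drawn uniformly from (-0.5, 0.5).
    u = random.random() - 0.5
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(max(1e-300, 1.0 - 2.0 * abs(u)))

def private_count(database, predicate, epsilon):
    # A count query has sensitivity 1: adding or removing one record
    # changes the true answer by at most 1, so Laplace noise with
    # scale 1/epsilon gives epsilon-differential privacy.
    true_count = sum(1 for row in database if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon)
```

This interactive, per-query mechanism is the baseline that the non-interactive release studied here improves on: instead of answering queries one at a time, the goal is to publish a single sanitized database good for an entire class of queries at once.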
We demonstrate that, ignoring computational constraints, it is possible
to release privacy-preserving databases that are useful for all queries
over a discretized domain from any concept class with polynomial
VC-dimension. We show a new lower bound for releasing databases that
are useful for halfspace queries over a continuous domain. Despite
this, we give a privacy-preserving polynomial-time algorithm that
releases information useful for all halfspace queries, for a slightly
relaxed definition of usefulness.
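One way to picture the inefficient release of a useful synthetic database is via the exponential mechanism: score each small candidate database by its worst-case query error against the true data and sample with probability weighted toward low error. The brute-force toy below is our own sketch under that framing (exponential time, tiny domain), not the algorithm as presented in the paper; the scoring and scaling choices are illustrative assumptions.

```python
import itertools
import math
import random

def exponential_mechanism_release(database, queries, domain, m, epsilon):
    # Score every candidate synthetic database of size m by its worst
    # query error (fraction of records satisfying each query), then
    # sample a candidate with probability proportional to
    # exp(-epsilon * n * score / 2).  Assumes score sensitivity ~1/n.
    n = len(database)

    def frac(db, q):
        return sum(1 for x in db if q(x)) / len(db)

    candidates = list(itertools.combinations_with_replacement(domain, m))
    scores = [max(abs(frac(c, q) - frac(database, q)) for q in queries)
              for c in candidates]
    weights = [math.exp(-epsilon * n * s / 2.0) for s in scores]

    # Sample one candidate in proportion to its weight.
    r = random.random() * sum(weights)
    for c, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return list(c)
    return list(candidates[-1])
```

Enumerating all size-m databases over the domain is what makes this exponential-time; the computational barriers and partial workarounds (such as the halfspace result above) are exactly the open questions the abstract points to.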
These results highlight a number of interesting open questions for
efficient non-interactive data privacy. If time permits, I may also mention
some related work in progress.
Joint work with Avrim Blum and Aaron Roth.