Department of Computer Science
Cornell University

## Probabilistic Assertions

June 23, 2014

At PLDI 2014 this month, I presented our recent work on probabilistic assertions, a new mechanism for reasoning about programs that behave randomly. I’m interested in this kind of program from the perspective of approximate computing, my research bread and butter, but the idea has applications for a wide variety of important software.

Say, for example, you need to publish some aggregate statistics for a large group of users without requiring them to divulge any sensitive secrets individually. For example, you might want the average salary for employees in a given field who don’t want to reveal their exact salary to you.

One sound strategy is differential privacy: you can ask each employee to add some random noise to their salary before forking it over. This way, no individual person has to divulge their sensitive information.

But you still want the average salary to be useful. You might write, for example, that the obfuscated average is within \$10,000 of the true average:

true_avg = average(salaries)
private_avg = average(obfuscate(salaries))
assert true_avg - private_avg <= 10000

That assert seems like a good idea, since it expresses a usefulness property you want to test or verify. But there’s one tiny problem: the assertion is false in any correct implementation of the algorithm. Differential privacy requires that you take the risk of adding very large noise—since this is statistically unlikely, the data is still useful with high probability, but it must be allowed to fail sometimes.

In this situation, what you need is a kind of assertion that is sensitive to probabilities. We propose a probabilistic assertion, which lets you specify a probability and a confidence level for verification. You write:

passert e, p, c

to indicate that you want the expression e to be true with at least probability p, with verification at the confidence level c.

Our PLDI 2014 paper, Expressing and Verifying Probabilistic Assertions, shows how to efficiently and accurately check this kind of assertion. The paper owes everything to my incredible collaborators: Pavel Panchekha here at UW, Todd Mytkowicz and Kathryn McKinley at MSR, and my advisors, Dan Grossman and Luis Ceze.

I’ve got a talking-head video of the PLDI talk below. You can also check out the paper and the slides.