CS Professor and CTC Associate Director Johannes Gehrke was invited to give two keynotes at fall conferences this year. He delivered the keynote at the 2005 International Conference on Machine Learning in August (http://icml.ais.fraunhofer.de/home.php) as well as the keynote at the SAS Data Mining Technology Conference (http://www.sas.com/events/dmconf/) in October.
In his presentation, Gehrke noted that the digitization of our daily lives has led to an explosion in the collection of data by governments, corporations, and individuals. Protecting the confidentiality of this data is of utmost importance. However, knowledge of the statistical properties of private data can have significant societal benefit, for example, in decisions about the allocation of public funds based on Census data, or in the analysis of medical data from different hospitals to understand drug interactions.
Gehrke started his talk by introducing two application scenarios, privacy-preserving data analysis and privacy-preserving data publishing. He showed how, in simple models, background knowledge can lead to severe breaches of privacy in both applications, and he described how proper modeling of background knowledge can avoid privacy breaches. He outlined first algorithmic steps toward privacy-preserving data analysis and data publishing with background knowledge, and he concluded with open problems.
Gehrke's talk surveyed recent research that builds bridges between two seemingly conflicting goals: sharing data and preserving data privacy and confidentiality. The presentation covered definitions of privacy and disclosure, and the associated methods for enforcing them.
Gehrke also gave this talk as an invited tutorial at PODS 2005 (the 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems) in June 2005 on "Models and Methods for Privacy-Preserving Data Publishing and Analysis." For more information on this conference, visit http://cimic.rutgers.edu/sigmodpods05/.