I will join UMass Amherst CICS as an Assistant Professor in Fall 2024. I am spending this year at Google Research with Peter Kairouz. I recently completed my PhD at Cornell Tech, advised by Vitaly Shmatikov and Deborah Estrin. My research was recognized by the Apple Scholars in AI/ML and Digital Life Initiative fellowships.
My work focuses on security and privacy in emerging AI-based systems under real-life conditions and attacks. During my PhD I worked on backdoor attacks in federated learning and on the fairness trade-offs of differential privacy. We also proposed two new frameworks, Backdoors101 and Mithridates, to advance research on backdoors, and a new attack on generative language models that was covered by VentureBeat and The Economist. Recently, I have studied vulnerabilities in multi-modal systems: instruction injections and adversarial illusions.
A big focus of my work is data privacy. I worked on Ancile, a framework for language-level control over data usage. At Google, I designed a new algorithm for building private heatmaps. At Apple, I developed a novel way to obtain good tokenizers for private federated learning. Before starting my PhD, I received an engineering degree from Bauman Moscow State Technical University and worked at Cisco as a software engineer.
I grew up in Tashkent. In my free time, I play water polo and spend time with my family.
For prospective students: please consider applying to the UMass CS PhD program. I am looking for students for Fall 2024, so please reach out, and include some concrete ideas on security and privacy topics that interest you.

Tokenization is an important part of training a good language model; however, in private federated learning, where user data are not available, generic tokenization methods reduce performance. We show how to obtain a good tokenizer without spending additional privacy budget. Work done at Apple. Best paper runner-up award. [PDF]
We introduce a constrain-and-scale attack, a form of data poisoning that can stealthily inject a backdoor into one of the participating models during a single round of federated learning. The attack evades proposed defenses and propagates the backdoor to the global server, which then distributes the compromised model to the other participants. [PDF] [Code]
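To give a flavor of the idea, here is a minimal NumPy sketch of the model-replacement scaling behind the attack (an illustration under simplifying assumptions, not the paper's code; it omits the "constrain" objective that evades anomaly detection). The attacker scales its submitted update by the number of participants so that, after the server averages, the global model lands on the backdoored weights.

```python
import numpy as np

def fedavg(global_model, client_deltas, server_lr=1.0):
    # Server step: average the clients' submitted model deltas.
    return global_model + server_lr * np.mean(client_deltas, axis=0)

def scaled_backdoor_delta(global_model, backdoored_model, n_clients, server_lr=1.0):
    # Model replacement: the attacker wants the post-averaging global model
    # to equal its backdoored weights X. Since the server computes
    #   G_new = G + server_lr * mean(deltas),
    # submitting gamma * (X - G) with gamma = n_clients / server_lr cancels
    # the averaging (assuming honest deltas are small near convergence).
    gamma = n_clients / server_lr
    return gamma * (backdoored_model - global_model)

# Toy demo; all weights and counts below are illustrative assumptions.
rng = np.random.default_rng(0)
G = np.zeros(4)                                       # current global weights
honest = [rng.normal(0, 0.01, 4) for _ in range(9)]  # small honest deltas
X_backdoored = np.full(4, 5.0)                        # attacker's target weights
deltas = honest + [scaled_backdoor_delta(G, X_backdoored, n_clients=10)]
print(fedavg(G, deltas))  # ~[5. 5. 5. 5.]: one client replaced the global model
```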
This project identifies a new trade-off between privacy and fairness: we observe that training a machine learning model with differential privacy reduces accuracy on underrepresented groups. [NeurIPS, 2019], [Code].
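The mechanism is easiest to see in the DP-SGD training loop itself. Below is a minimal NumPy sketch (an illustration with made-up data and hyperparameters, not the paper's setup): each per-example gradient is clipped and Gaussian noise is added, which disproportionately suppresses the rarer gradient signal coming from an underrepresented group; the script then reports per-group accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical imbalanced data: the minority group's label signal lives in a
# feature the majority group does not use, so its gradients are rarer.
def make_data(n_major, n_minor):
    n = n_major + n_minor
    y = rng.integers(0, 2, n).astype(float)
    X = rng.normal(0, 0.1, size=(n, 2))
    s = 2 * y - 1                      # labels as +/-1
    X[:n_major, 0] += s[:n_major]      # majority signal in feature 0
    X[n_major:, 1] += s[n_major:]      # minority signal in feature 1
    group = np.array([0] * n_major + [1] * n_minor)
    return X, y, group

def train_logreg(X, y, dp, clip=1.0, noise_mult=2.0, lr=0.5, steps=500, batch=100):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        idx = rng.choice(len(X), batch, replace=False)
        xb, yb = X[idx], y[idx]
        p = 1.0 / (1.0 + np.exp(-(xb @ w)))
        per_ex = (p - yb)[:, None] * xb            # per-example gradients
        if dp:
            # DP-SGD: clip each per-example gradient to norm <= clip,
            # then add Gaussian noise calibrated to the clipping bound.
            norms = np.linalg.norm(per_ex, axis=1, keepdims=True)
            per_ex = per_ex / np.maximum(1.0, norms / clip)
            g = (per_ex.sum(axis=0)
                 + rng.normal(0, noise_mult * clip, size=w.shape)) / batch
        else:
            g = per_ex.mean(axis=0)
        w -= lr * g
    return w

def group_accuracy(w, X, y, group):
    pred = (X @ w > 0).astype(float)
    return {g: float((pred[group == g] == y[group == g]).mean()) for g in (0, 1)}

X, y, group = make_data(n_major=950, n_minor=50)
for dp in (False, True):
    acc = group_accuracy(train_logreg(X, y, dp), X, y, group)
    print(f"dp={dp}: majority={acc[0]:.2f} minority={acc[1]:.2f}")
```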
A fast and compact cloud-native implementation of containers. [PDF].