Eugene Bagdasaryan

I am on the job market this year; reach out to me at eugene@cs.cornell.edu

Upcoming and past seminar talks: Michigan CSE (Apr'23), Columbia CS (Apr'23), Boston University CDS (Apr'23), UW Allen School CSE (Mar'23), McGill CS (Mar'23), CISPA (Feb'23), UMass Manning CICS (Feb'23), UCLA Samueli CS (Jan'23).

Bio

I am a CS PhD candidate at Cornell Tech and an Apple AI/ML PhD Scholar advised by Vitaly Shmatikov and Deborah Estrin. I study security and privacy in emerging AI-based systems under real-life conditions and attacks.

My research goal is to build ethical, safe, and private machine learning systems – while keeping these systems practical and useful. Recently, we demonstrated security drawbacks of Federated Learning (AISTATS'20) and fairness implications of Differentially Private Deep Learning (NeurIPS'19). We also proposed a framework for backdoor attacks and defenses (USENIX'21) and a new attack on generative language models (S&P'22) that modifies LLMs and spins the output for Propaganda-as-a-Service.

A big focus of my work is data privacy – I study methods that enable new applications while protecting users. We proposed Ancile – a framework for language-level control over data usage. At Google, I worked on a new algorithm for building private heatmaps (PETS'22). At Apple, I developed a novel way to obtain good tokenizers for Private Federated Learning (FL4NLP@ACL'22). Before starting my PhD, I received an engineer specialist degree from Baumanka and worked at Cisco on OpenStack networking as a QA Engineer.

I grew up in Tashkent, Uzbekistan. In my free time I play water polo and spend time with family.

Research papers
  • Towards Sparse Federated Analytics: Location Heatmaps under Distributed Differential Privacy with Secure Aggregation PETS'22
    Eugene Bagdasaryan, Peter Kairouz, Stefan Mellem, Adrià Gascón, Kallista Bonawitz, Deborah Estrin, and Marco Gruteser

    A new algorithm for building heatmaps with local-like differential privacy.

    Work done at Google. [PDF], [Code].
  • Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures S&P'22
    Eugene Bagdasaryan and Vitaly Shmatikov

    We discover new capabilities of large language models to express attacker-chosen opinions on certain topics while performing tasks like summarization, translation, and language generation.

    [PDF], [Code].
  • Training a Tokenizer for Free with Private Federated Learning FL4NLP@ACL'22
    Eugene Bagdasaryan, Congzheng Song, Rogier van Dalen, Matt Seigel, and Áine Cahill

    Tokenization is an important part of training a good language model. However, in private federated learning, where user data are not available, generic tokenization methods reduce performance. We show how to obtain a good tokenizer without spending additional privacy budget.

    Work done at Apple. Best paper runner-up award. [PDF].
  • Blind Backdoors in Deep Learning Models USENIX'21
    Eugene Bagdasaryan and Vitaly Shmatikov

    We propose a novel attack that injects complex and semantic backdoors without access to the training data or the model and evades all known defenses.

    [PDF], [Code].
  • How To Backdoor Federated Learning AISTATS'20
    Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov

    We introduce a constrain-and-scale attack, a form of data poisoning, that can stealthily inject a backdoor into one of the participating models during a single round of Federated Learning training. This attack can avoid proposed defenses and propagate the backdoor to a global server that will distribute the compromised model to other participants.

    [PDF], [Code].
  • Salvaging Federated Learning by Local Adaptation Preprint
    Tao Yu, Eugene Bagdasaryan, and Vitaly Shmatikov

    Recovering participants' performance on their data when using federated learning with robustness and privacy techniques.

    [Paper], [Code].
  • Ancile: Enhancing Privacy for Ubiquitous Computing with Use-Based Privacy WPES'19
    Eugene Bagdasaryan, Griffin Berlstein, Jason Waterman, Eleanor Birrell, Nate Foster, Fred B. Schneider, and Deborah Estrin

    A novel platform that enables control over an application's data usage through language-level policies, implementing use-based privacy.

    [PDF], [Code], [Slides].
  • Differential Privacy Has Disparate Impact on Model Accuracy NeurIPS'19
    Eugene Bagdasaryan and Vitaly Shmatikov

    This project discusses a new trade-off between privacy and fairness. We observe that training a Machine Learning model with Differential Privacy reduces accuracy on underrepresented groups.

    [NeurIPS, 2019], [Code].
  • X-containers: Breaking down barriers to improve performance and isolation of cloud-native containers ASPLOS'19
    Zhiming Shen, Zhen Sun, Gur-Eyal Sela, Eugene Bagdasaryan, Christina Delimitrou, Robbert Van Renesse, and Hakim Weatherspoon

    A fast and compact cloud-native implementation of containers.

    [PDF].
  • Openrec: A modular framework for extensible and adaptable recommendation algorithms WSDM'18
    Longqi Yang, Eugene Bagdasaryan, Joshua Gruenstein, Cheng-Kang Hsieh, and Deborah Estrin

    An open and modular Python framework that supports extensible and adaptable research in recommender systems.

    [PDF], [Code].
Recent news
  • Apr 2023, interviewed by The Economist on our work studying language models.
  • Oct 2022, Cory Doctorow and Bruce Schneier wrote about our research on model spinning.
  • May 2022, a paper on location heatmaps was accepted to PETS'22.
  • Apr 2022, a Propaganda-as-a-Service paper was accepted to S&P'22.
  • Mar 2022, a paper on tokenizers in FL was accepted to the FL4NLP workshop at ACL'22 and received the best paper runner-up award.
  • Dec 2021, our research on Propaganda-as-a-Service was covered by VentureBeat.
  • Aug 2021, our blind backdoors paper was covered on ZDNet.
  • Summer 2021, interned at Apple with Rogier van Dalen working on Private Federated Learning for Large Language Models.
  • Apr 2021, received Apple fellowship. Mentors: Rogier van Dalen and Kunal Talwar.
  • Feb 2021, our paper on blind backdoors was accepted to USENIX Security'21.
  • Jan 2021, presented our work at Microsoft Research.
  • Nov 2020, open sourced our new framework for research on backdoors in deep learning.
  • Jul 2020, presented our work on local adaptation for Federated Learning at Google.
  • Jun 2020, Ancile project was discussed in Cornell Chronicle.
  • Summer 2020, interned at Google Research with Marco Gruteser, Peter Kairouz, and Kaylee Bonawitz, focused on Federated Learning and Analytics.
  • Jan 2020, our attack on federated learning was accepted to AISTATS'20!
  • Nov 2019, passed A exam (pre-candidacy): "Evaluating privacy preserving techniques in machine learning."
  • Sep 2019, our paper about differential privacy impact on model fairness was accepted to NeurIPS'19.
  • Aug 2019, our work on the use-based privacy system Ancile was accepted to CCS WPES'19.
  • Aug 2019, presented at Contextual Integrity Symposium on contextual recommendation sharing.
  • June 2019, Digital Life Initiative fellow 2019-2020.
  • Summer 2018, interned at Amazon Research with Pawel Matykiewicz and Amber Roy Chowdhury.
  • Sep 2017, Bloomberg Data for Good fellow 2017.