
Eugene Bagdasaryan

Bio

I am a CS PhD candidate at Cornell Tech and an Apple AI/ML Scholar, working on privacy and security in machine learning, advised by Vitaly Shmatikov and Deborah Estrin.

My research goal is to build ethical, safe, and private machine learning systems. In our work, we demonstrated security drawbacks of Federated Learning (AISTATS'20) and fairness implications of Differentially Private Deep Learning (NeurIPS'19). We also proposed a framework for backdoor attacks and defenses (USENIX'21) and a new attack (S&P'22) that modifies large language models to spin their output for Propaganda-as-a-Service.

Earlier, I worked on Ancile – a framework for language-level control over data usage, and OpenRec – a modular library for deep recommender systems. Amazon, Apple, and Google hosted me for summer internships. At Google, I worked on a new algorithm for building private heatmaps (PETS'22). At Apple, I developed a novel way to obtain good tokenizers for Private Federated Learning (FL4NLP@ACL'22). Before starting my PhD, I received an engineering degree from Baumanka and worked at Cisco on OpenStack networking.

In my free time I play water polo and (used to...) travel.

E-mail: eugene@cs.cornell.edu
Research papers
  • Towards Sparse Federated Analytics: Location Heatmaps under Distributed Differential Privacy with Secure Aggregation PETS'22
    Eugene Bagdasaryan, Peter Kairouz, Stefan Mellem, Adrià Gascón, Kallista Bonawitz, Deborah Estrin, and Marco Gruteser

    A new algorithm for building heatmaps with local-like differential privacy.

    Work done at Google. [PDF], [Code].
  • Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures S&P'22
    Eugene Bagdasaryan and Vitaly Shmatikov

    We show that an adversary can train large language models to express attacker-chosen opinions on certain topics while still performing tasks such as summarization, translation, and language generation.

    [PDF], [Code].
  • Training a Tokenizer for Free with Private Federated Learning FL4NLP@ACL'22
    Eugene Bagdasaryan, Congzheng Song, Rogier van Dalen, Matt Seigel, and Áine Cahill

    Tokenization is an important part of training a good language model; however, in private federated learning, where user data are not available, generic tokenization methods reduce performance. We show how to obtain a good tokenizer without spending additional privacy budget.

    Work done at Apple. [PDF].
  • Blind Backdoors in Deep Learning Models USENIX'21
    Eugene Bagdasaryan and Vitaly Shmatikov

    We propose a novel attack that injects complex and semantic backdoors without access to the training data or the model and evades all known defenses.

    [PDF], [Code].
  • How To Backdoor Federated Learning AISTATS'20
    Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov

    We introduce a constrain-and-scale attack, a form of model poisoning, that can stealthily inject a backdoor into one of the participating models during a single round of Federated Learning training. The attack evades proposed defenses and propagates the backdoor to the global server, which then distributes the compromised model to other participants (a minimal sketch of the model-replacement step appears at the end of this paper list).

    [PDF], [Code].
  • Salvaging Federated Learning by Local Adaptation Preprint
    Tao Yu, Eugene Bagdasaryan, and Vitaly Shmatikov

    We show how local adaptation recovers participants' accuracy on their own data when federated learning is combined with robustness and privacy techniques.

    [Paper], [Code].
  • Ancile: Enhancing Privacy for Ubiquitous Computing with Use-Based Privacy WPES'19
    Eugene Bagdasaryan, Griffin Berlstein, Jason Waterman, Eleanor Birrell, Nate Foster, Fred B. Schneider, and Deborah Estrin

    A novel platform that enables control over applications' data usage with language-level policies, implementing use-based privacy.

    [PDF], [Code], [Slides].
  • Differential Privacy Has Disparate Impact on Model Accuracy NeurIPS'19
    Eugene Bagdasaryan and Vitaly Shmatikov

    This project identifies a new trade-off between privacy and fairness: we observe that training a machine learning model with Differential Privacy disproportionately reduces accuracy on underrepresented groups (a small DP-SGD sketch illustrating this measurement appears at the end of this paper list).

    [NeurIPS, 2019], [Code].
  • X-Containers: Breaking Down Barriers to Improve Performance and Isolation of Cloud-Native Containers ASPLOS'19
    Zhiming Shen, Zhen Sun, Gur-Eyal Sela, Eugene Bagdasaryan, Christina Delimitrou, Robbert Van Renesse, and Hakim Weatherspoon

    A fast and compact cloud-native implementation of containers.

    [PDF].
  • OpenRec: A Modular Framework for Extensible and Adaptable Recommendation Algorithms WSDM'18
    Longqi Yang, Eugene Bagdasaryan, Joshua Gruenstein, Cheng-Kang Hsieh, and Deborah Estrin

    An open and modular Python framework that supports extensible and adaptable research in recommender systems.

    [PDF], [Code].
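
The federated-learning backdoor entry above refers to this sketch. It illustrates only the model-replacement arithmetic behind constrain-and-scale: a single attacker scales its submitted model so that federated averaging lands on the backdoored weights. This is a toy numpy illustration with made-up variable names and hyperparameters, not the paper's released code (linked above).

    import numpy as np

    rng = np.random.default_rng(0)

    n = 100       # participants selected in this round
    eta = 1.0     # server learning rate for federated averaging
    d = 10        # number of model parameters (toy model = a weight vector)

    global_model = rng.normal(size=d)                           # current global weights G
    backdoored = global_model + rng.normal(scale=0.5, size=d)   # attacker's target weights X

    # Benign participants submit models that differ only slightly from G.
    benign = [global_model + rng.normal(scale=0.01, size=d) for _ in range(n - 1)]

    # Model replacement: scale the malicious update by n/eta so that after
    # averaging, the new global model is (approximately) the backdoored model X.
    gamma = n / eta
    malicious = global_model + gamma * (backdoored - global_model)

    # Federated averaging: G' = G + (eta / n) * sum_i (L_i - G)
    updates = benign + [malicious]
    new_global = global_model + (eta / n) * sum(u - global_model for u in updates)

    # Distance is near zero: the attacker's model has effectively replaced G.
    print("distance to backdoored model:", np.linalg.norm(new_global - backdoored))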
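
Similarly, the disparate-impact entry above refers to this sketch: logistic regression trained with and without a DP-SGD-style step (per-example gradient clipping plus Gaussian noise), with accuracy reported separately for a well-represented and an underrepresented class. The dataset, hyperparameters, and names are invented for illustration; the paper's experiments (linked above) use real models and datasets.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy imbalanced dataset: class 0 is the well-represented group,
    # class 1 is the underrepresented group.
    x0 = rng.normal(loc=[0.0, 0.0], size=(9600, 2))
    x1 = rng.normal(loc=[2.0, 2.0], size=(400, 2))
    X = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(9600), np.ones(400)])

    def train(dp, steps=300, lr=0.2, batch=256, clip=1.0, sigma=1.0):
        w, b = np.zeros(2), 0.0
        for _ in range(steps):
            idx = rng.choice(len(X), size=batch, replace=False)
            xb, yb = X[idx], y[idx]
            p = 1.0 / (1.0 + np.exp(-(xb @ w + b)))
            gw = (p - yb)[:, None] * xb      # per-example gradients (logistic loss)
            gb = p - yb
            if dp:
                # DP-SGD step: clip each per-example gradient, sum, add Gaussian noise.
                norms = np.sqrt((gw ** 2).sum(axis=1) + gb ** 2)
                scale = np.minimum(1.0, clip / (norms + 1e-12))
                gw_sum = (gw * scale[:, None]).sum(axis=0) + rng.normal(scale=sigma * clip, size=2)
                gb_sum = (gb * scale).sum() + rng.normal(scale=sigma * clip)
            else:
                gw_sum, gb_sum = gw.sum(axis=0), gb.sum()
            w -= lr * gw_sum / batch
            b -= lr * gb_sum / batch
        return w, b

    def per_class_accuracy(w, b):
        pred = (X @ w + b > 0).astype(float)
        return {c: float((pred[y == c] == c).mean()) for c in (0.0, 1.0)}

    print("non-private:", per_class_accuracy(*train(dp=False)))
    print("dp-sgd:     ", per_class_accuracy(*train(dp=True)))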
Recent news
  • May 2022, a paper with Google accepted to PETS'22.
  • Apr 2022, a Propaganda-as-a-Service paper accepted to S&P'22.
  • Mar 2022, a tokenization paper with Apple accepted to FL4NLP workshop at ACL'22.
  • Dec 2021, our research on Propaganda-as-a-Service was covered by VentureBeat.
  • Aug 2021, our blind backdoors paper was covered on ZDNet.
  • Summer 2021, interned at Apple with Rogier van Dalen working on Private Federated Learning for large language models.
  • Apr 2021, received the Apple AI/ML Scholars fellowship.
  • Feb 2021, our paper on blind backdoors accepted to USENIX Security'21.
  • Jan 2021, presented our work at Microsoft Research.
  • Nov 2020, open sourced our new framework for research on backdoors in deep learning.
  • Jul 2020, presented our work on local adaptation for Federated Learning at Google.
  • Jun 2020, the Ancile project was discussed in the Cornell Chronicle.
  • Summer 2020, interned at Google Research with Marco Gruteser, Peter Kairouz, and Kaylee Bonawitz, focused on Federated Learning and Analytics.
  • Jan 2020, our attack on federated learning was accepted to AISTATS'20!
  • Nov 2019, passed the A exam (pre-candidacy): "Evaluating privacy preserving techniques in machine learning."
  • Sep 2019, our paper about differential privacy impact on model fairness was accepted to NeurIPS'19.
  • Aug 2019, our work on the use-based privacy system Ancile was accepted to CCS WPES'19.
  • Aug 2019, presented at Contextual Integrity Symposium on contextual recommendation sharing.
  • June 2019, Digital Life Initiative fellow 2019-2020.
  • Summer 2018, interned at Amazon Research with Pawel Matykiewicz and Amber Roy Chowdhury.
  • Sep 2017, Bloomberg Data for Good fellow 2017.