Date: October 17, 2025

Title: PAPAYA Federated Analytics Stack: Engineering Privacy, Scalability and Practicality

Speaker: Salman Abid, Ph.D. student, Cornell Bowers

Abstract: Cross-device Federated Analytics (FA) is a distributed computation paradigm designed to answer analytics queries about, and derive insights from, data held locally on users’ devices. On-device computations combined with other privacy and security measures ensure that only minimal data is transmitted off-device, achieving a high standard of data protection. Despite FA’s broad adoption, the applicability of existing FA systems is limited by compromised accuracy, a lack of flexibility for data analytics, and an inability to scale effectively. In this paper, we describe our approach to combining privacy, scalability, and practicality to build a system that overcomes these limitations. The PAPAYA system at Meta leverages trusted execution environments (TEEs) and optimizes the use of on-device computing resources to facilitate federated data processing across large fleets of devices, while ensuring robust, defensible, and verifiable privacy safeguards. We focus on federated analytics (statistics and monitoring), in contrast to systems for federated learning (ML workloads), and we flag the key differences.

Bio: Salman is a second-year Ph.D. student working with Hakim Weatherspoon in Systems. His current research is on Distributed Systems for Digital Agriculture, building a secure framework to use ML on the farm.