Online Convex Optimization with Unbounded Memory
Abstract: Online convex optimization (OCO) is a widely used framework in online learning. In each round, the learner chooses a decision in some convex set and an adversary chooses a convex loss function, and then the learner suffers the loss associated with their chosen decision. However, in many of the motivating applications the loss of the learner depends not only on the current decision but also on the entire history of decisions up to that point. The OCO framework and existing generalizations thereof only allow for losses that depend on a bounded-length subsequence of past decisions. In this work we introduce a generalization of the OCO framework, “Online Convex Optimization with Unbounded Memory”, that captures long-term dependence on past decisions.
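To make the round-by-round protocol concrete, here is a minimal sketch of the classical memoryless OCO setting that the abstract contrasts with, using online projected gradient descent (Zinkevich, 2003). This is illustrative only and not the talk's algorithm; the Euclidean-ball decision set, the step-size schedule, and the `grad_oracle` interface are assumptions made for the example.

```python
import numpy as np

def project_l2_ball(x, radius=1.0):
    """Project x onto the Euclidean ball of the given radius (the convex decision set)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def online_gradient_descent(grad_oracle, dim, T, radius=1.0):
    """Memoryless OCO via online projected gradient descent.

    grad_oracle(t, x) returns a (sub)gradient of the adversary's round-t loss at x.
    With step sizes eta_t proportional to 1/sqrt(t), this achieves O(sqrt(T))
    regret against the best fixed decision in hindsight.
    """
    x = np.zeros(dim)
    decisions = []
    for t in range(1, T + 1):
        decisions.append(x.copy())           # learner commits to x_t
        g = grad_oracle(t, x)                # adversary reveals the loss; learner queries its gradient
        eta = radius / np.sqrt(t)            # standard decaying step size
        x = project_l2_ball(x - eta * g, radius)
    return decisions
```

For example, `grad_oracle = lambda t, x: 2 * (x - targets[t])` (for a hypothetical sequence `targets`) recovers an online least-squares tracking problem. In the talk's setting, by contrast, the round-t loss may depend on the entire prefix of decisions, not just x_t.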
We introduce the notion of p-effective memory capacity, H_p, which quantifies the maximum influence of past decisions on current losses. We prove an O(\sqrt{H_1 T}) upper bound on the policy regret. Under mild additional assumptions we prove a stronger O(\sqrt{H_p T}) upper bound. We also prove lower bounds on the policy regret for this problem. We show the broad applicability of our framework by using it to derive regret bounds, and to simplify existing regret bound derivations, for a variety of online learning problems, including an online variant of performative prediction and online linear control.
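For orientation (not taken verbatim from the talk), policy regret in online learning with memory is commonly defined by comparing the learner's losses, evaluated on the actual history of play, against committing to a single fixed decision throughout; the paper's unbounded-memory formulation generalizes this idea:

```latex
% Policy regret with memory: the comparator plays one fixed decision x in
% every round, and losses are evaluated on the full history of decisions.
\[
  \mathrm{Regret}_T
    = \sum_{t=1}^{T} f_t(x_1, \dots, x_t)
    - \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x, \dots, x)
\]
```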
This is based on joint work with Sarah Dean and Robert D. Kleinberg.
Bio: Raunak is a PhD candidate in the Computer Science department at Cornell, where he is advised by Bobby Kleinberg and Sarah Dean. He is interested in online learning, bandits, optimization, and machine learning with graphs. Previously, he received his undergraduate degree from The University of British Columbia, where he was advised by Mark Schmidt. He has also interned at Microsoft over multiple summers, working closely with Jennifer Neville.