Date: March 20, 2026

Title: Reconfigurable Torus Fabrics for Multi-tenant ML

Speaker: Abhishek Vijaya Kumar, Ph.D. candidate in Computer Science, Cornell Bowers


Abstract: We develop Morphlux, a server-scale programmable photonic fabric that interconnects accelerators within servers. We show that augmenting state-of-the-art torus-based ML datacenters with Morphlux can improve the bandwidth of tenant compute allocations by up to 66%, reduce compute fragmentation by up to 70%, and minimize the blast radius of accelerator failures. We build a novel end-to-end hardware prototype of Morphlux to demonstrate these performance benefits, which translate to a 1.72x improvement in the fine-tuning throughput of ML models. By rapidly programming the server-scale fabric in our hardware testbed, Morphlux can replace a failed accelerator with a healthy one in 1.2 seconds.

Bio: Abhishek is a Ph.D. candidate in Computer Science at Cornell University, advised by Rachee Singh. He works on infrastructure and systems for machine learning, with a focus on communication optimization for distributed machine learning. His work covers both electrical interconnects like NVLink and emerging silicon photonics interconnects like this