Professor of Computer Science
Cornell University & ASAPP Inc.
ACM Fellow | AAAI Fellow | Jolly Good Fellow
Biography
Kilian Q. Weinberger is a Professor in the Department of Computer Science at Cornell University. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul, and his undergraduate degree in Mathematics and Computing from the University of Oxford.
During his career he has won several best paper awards at ICML (2004), CVPR (2004, 2017), AISTATS (2005), and KDD (2014, runner-up award). In 2011 he was awarded the Outstanding AAAI Senior Program Chair Award, and in 2012 he received an NSF CAREER award. He is the recipient of the Daniel M. Lazar '29 Excellence in Teaching Award (2016) and the Ann S. Bowers Teaching and Advising Excellence Award (2024).
As of 2024 he is an ACM and AAAI Fellow, and in 2021 he became a Blavatnik National Awards Finalist. He was elected co-Program Chair for ICML 2016 and AAAI 2018 and has been on the ICML board since 2016. He served as the 7th president of ICML from 2023 until 2025. Since 2024 he has been a member of the Sloan Research Fellowships Selection Committee.
Kilian Weinberger's research focuses on Machine Learning and its applications. In particular, he has worked on learning under resource constraints, metric learning, AI in Science, computer vision, autonomous vehicles, Gaussian Processes, and deep learning. Before joining Cornell University, he was an Associate Professor at Washington University in St. Louis and before that he worked as a research scientist at Yahoo! Research in Santa Clara.
About Me
I am married to (the amazing) Anne Bracy. Together we have three children, Timo, Koby, and Nika. In my spare time I like running (when I am not injured), reading, biking, or boating on Cayuga Lake. Some books that I really enjoyed are The Three-Body Problem Trilogy, A Gentleman in Moscow, The Things They Carried, and Never Let Me Go. In 2025 I particularly enjoyed reading Source Code (by Bill Gates).
My Lab
I have been lucky to work with incredibly talented and fun PhD, Masters, and Undergraduate students and Postdocs.
Current Lab Members
Former Lab Members
I am looking for strong PhD students most of the time. The most important prerequisites for success in machine learning are a strong mathematical background and solid coding skills.
If you are interested, please do not apply to me directly, as all applications are centralized through the department. Please indicate on your application that you are interested in working on machine learning and that you are interested in joining my group. As we receive thousands of applicants, I sometimes apply a filter and first look at applications that mention my name.
All students who are accepted will receive a fully funded fellowship that covers tuition, 12 months' salary, and health insurance. (Please don't send me any emails with questions about the application process, as I am not involved in it.)
A few years ago I summarized my research philosophy in a NeurIPS workshop talk. My research focuses on algorithm design for machine learning, with a specific emphasis on representation learning. Over the years my work has spanned several interconnected research directions, from fundamental questions about how to compare data to practical applications in autonomous driving.
Metric Learning
One of the fundamental challenges of machine learning is how to compare examples. My early work introduced Large Margin Nearest Neighbor (LMNN), which learns distance metrics by pulling similarly labeled inputs close while pushing dissimilarly labeled inputs apart. This framework popularized the triplet loss objective that is now widely used in computer vision.
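To make the objective concrete, here is a minimal sketch of the triplet-style hinge loss described above (function name and toy vectors are illustrative, not from the original LMNN implementation): it penalizes a positive example that is not at least a margin closer to the anchor than a negative example.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that pulls a similarly labeled point within `margin` of
    the anchor relative to a dissimilarly labeled point (squared
    Euclidean distances, in the spirit of LMNN)."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# The loss is zero once the negative is pushed sufficiently far away.
a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same class, nearby
n = np.array([3.0, 0.0])   # different class, far away
print(triplet_loss(a, p, n))  # 0.0, since d_pos + margin < d_neg
```

In practice the loss is summed over many triplets and minimized with respect to the embedding (or metric) that produces the distances.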
When word embeddings rose to prominence, we introduced the Word Mover's Distance (WMD), a novel approach to measuring similarity between text documents. WMD elegantly incorporates the fact that different words can have similar meanings by casting document comparison as a transportation problem over word embeddings. We later extended this idea to contextual embeddings with BertScore, which has been widely adopted for evaluating machine translation and text generation systems.
Resource Efficient Learning
In industrial applications, all resources are limited and must be accounted for. My group was among the first to formally integrate resource constraints into learning algorithms, treating feature extraction cost as a natural trade-off with accuracy. This work introduced feature hashing, now widely known as the "hashing trick," which allows learning tasks to operate within fixed memory budgets.
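A minimal sketch of the hashing trick (function name and hash choice are illustrative; the original uses a signed hash in the same way): each feature is hashed into a fixed number of buckets, with a second hash bit choosing a sign so that collisions cancel in expectation.

```python
import numpy as np
from zlib import crc32

def hash_features(tokens, dim=16):
    """The hashing trick: map arbitrary token counts into a fixed-size
    vector, regardless of vocabulary size. The top hash bit picks a sign
    so that colliding features cancel in expectation."""
    x = np.zeros(dim)
    for tok in tokens:
        h = crc32(tok.encode("utf-8"))        # deterministic 32-bit hash
        sign = 1.0 if (h >> 31) & 1 == 0 else -1.0
        x[h % dim] += sign
    return x

# Memory stays fixed no matter how large the vocabulary grows.
v = hash_features(["the", "hashing", "trick"], dim=16)
print(v.shape)  # (16,)
```

Because no dictionary is stored, the memory footprint is exactly `dim` floats, which is what makes the trick attractive under hard resource budgets.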
During the rise of deep learning, we extended these ideas to neural networks, showing that network parameters are highly redundant and can be compressed by orders of magnitude without significant accuracy loss. This contributed to the now-vibrant subfield of neural network compression and the ongoing discussion about over-parameterization.
Deep Network Architectures
Our work on network compression revealed surprising redundancy in deep networks, raising questions about whether this redundancy is necessary or avoidable. We introduced stochastic depth, showing that deliberately increasing redundancy can substantially improve generalization.
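The mechanism can be sketched in a few lines (a toy forward pass, not the original implementation; the block functions and survival probability are illustrative): during training each residual block is skipped entirely with some probability, while at test time every block runs, scaled by its survival probability.

```python
import numpy as np

def residual_forward(x, blocks, p_survive=0.8, train=True, rng=None):
    """Stochastic depth over a stack of residual blocks: in training each
    block is dropped with probability 1 - p_survive (identity shortcut
    only); at test time all blocks run, scaled by p_survive."""
    if rng is None:
        rng = np.random.default_rng()
    for f in blocks:
        if train:
            if rng.random() < p_survive:
                x = x + f(x)           # block survives this pass
            # else: skip the block entirely
        else:
            x = x + p_survive * f(x)   # expected contribution at test time
    return x

blocks = [lambda x: 0.1 * x for _ in range(3)]
print(residual_forward(np.ones(2), blocks, train=False))  # 1.08**3 per entry
```

Randomly shortening the network during training acts as a strong regularizer and also reduces expected training cost, since skipped blocks are never evaluated.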
To understand the purpose of this redundancy, we designed DenseNet, which introduces direct skip connections between all layers of the same size. This architecture drastically improves both generalization performance and parameter efficiency, winning the CVPR 2017 best paper award and establishing itself as one of the most widely used neural network architectures.
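The connectivity pattern can be illustrated with a toy dense block (numpy stand-ins for convolutional layers; the layer functions and growth rate are illustrative): each layer consumes the concatenation of all preceding feature maps and contributes a small number of new ones.

```python
import numpy as np

def dense_block(x, layers):
    """DenseNet-style connectivity: every layer receives the concatenated
    outputs of all preceding layers, so features are reused rather than
    re-learned."""
    features = [x]
    for f in layers:
        out = f(np.concatenate(features, axis=-1))
        features.append(out)
    return np.concatenate(features, axis=-1)

# Each toy "layer" produces growth_rate = 2 new features from its input.
growth = 2
layers = [lambda h: 0.5 * h[..., :growth] for _ in range(3)]
x = np.ones(4)
print(dense_block(x, layers).shape)  # (10,): 4 input + 3 layers x 2 new
```

Because every layer adds only a few feature maps on top of everything already computed, the block stays parameter-efficient while each layer sees the full feature history.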
Recognizing that neural networks are increasingly used in high-stakes medical decisions, we investigated why network probability outputs are poorly calibrated. Our work on calibration showed that temperature scaling is highly effective for reliable probability estimates, and this approach has become the standard method for network calibration.
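Temperature scaling itself is a one-parameter fix, sketched below (function name and toy logits are illustrative): divide the logits by a scalar T, tuned on a validation set, before the softmax.

```python
import numpy as np

def temperature_softmax(logits, T=1.0):
    """Temperature scaling: divide logits by a single scalar T (fit on a
    validation set) before the softmax. T > 1 softens over-confident
    probabilities without changing the predicted class."""
    z = logits / T
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([4.0, 1.0, 0.0])
print(temperature_softmax(logits, T=1.0).round(3))
print(temperature_softmax(logits, T=2.5).round(3))  # softer, same argmax
```

Because T rescales all logits uniformly, accuracy is untouched; only the confidence of the probability estimates changes, which is exactly what calibration requires.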
More recently, we studied graph convolutional neural networks and discovered that much of their complexity was unnecessary. Our Simplifying Graph Convolutional Networks paper showed that a simple closed-form preprocessing step paired with logistic regression can match the performance of complex GCNs while being orders of magnitude faster.
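The preprocessing step can be sketched as follows (an illustrative numpy version; in the paper the propagated features are then fed to plain logistic regression): K rounds of feature smoothing with the symmetrically normalized adjacency matrix, with no learned weights or nonlinearities in between.

```python
import numpy as np

def sgc_features(A, X, K=2):
    """Simplified Graph Convolution: K rounds of feature smoothing with
    the normalized adjacency matrix (self-loops added). The propagation
    is a fixed linear map -- all learning happens in the classifier."""
    A_hat = A + np.eye(len(A))                 # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ A_hat @ D_inv_sqrt        # symmetric normalization
    for _ in range(K):
        X = S @ X                              # one smoothing step
    return X

# A 3-node path graph: smoothing mixes each node's features with its neighbors'.
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
X = np.eye(3)
print(sgc_features(A, X, K=2).round(2))
```

Since S and X can be multiplied once in closed form before training, the expensive per-epoch message passing of a full GCN disappears entirely.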
Efficient Inference for Gaussian Processes
To make Gaussian Processes as accessible as deep learning, my students, Andrew G. Wilson, and I (but really, mostly Geoff Pleiss and Jake Gardner) developed GPyTorch, a highly modular library that leverages GPU-optimized matrix operations. This platform has become one of the most popular GP coding frameworks, with contributors from universities and companies worldwide.
Perception for Autonomous Driving
In collaboration with colleagues in Mechanical Engineering and Computer Science, we investigated whether 3D object detection for self-driving cars could be performed with passive cameras instead of expensive LiDAR sensors. Our pseudo-LiDAR approach converts stereo camera depth estimates into a LiDAR-like 3D point cloud, dramatically improving camera-based detection accuracy.
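The core conversion is standard pinhole-camera geometry, sketched here (an illustrative back-projection, not the pseudo-LiDAR pipeline itself; intrinsics and the toy depth map are made up): every pixel with an estimated depth becomes a 3D point.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a (stereo-estimated) depth map into a 3D point cloud
    with the pinhole camera model: pixel (u, v) at depth z maps to
    ((u - cx) * z / fx, (v - cy) * z / fy, z)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# A flat wall 5 m away; the principal-point pixel lies on the optical axis.
depth = np.full((4, 4), 5.0)
pts = depth_to_point_cloud(depth, fx=100., fy=100., cx=2., cy=2.)
print(pts[2 * 4 + 2])  # [0. 0. 5.]: pixel (2, 2) back-projects to the axis
```

Once the depth map is in this point-cloud form, existing LiDAR-based 3D detectors can be applied to camera data essentially unchanged, which is what makes the representation change so effective.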
Teaching
One of my favorite parts of my job is teaching. I have mostly taught Machine Learning, Deep Learning, and AI. Some of my lectures on Machine Learning are available on YouTube.
At Cornell University
CS4782 - Introduction to Deep Learning
Spring 2026, Spring 2025
CS6784 - Advanced Topics in Machine Learning
Fall 2025, Fall 2024, Fall 2023, Fall 2022, Fall 2017, Fall 2016, Spring 2016
CS3780/4780/5780 - Machine Learning
Spring 2024, Spring 2023, Spring 2022, Fall 2021 (co-taught with Anil Damle), Fall 2018, Spring 2018 (co-taught with Chris de Sa), Spring 2017, Fall 2015
At Washington University in St. Louis
CSE517a - Machine Learning
Spring 2015, Spring 2014, Spring 2010
CSE519T - Advanced Machine Learning
Fall 2014, Fall 2012
CSE 511a - Artificial Intelligence
Fall 2013, Spring 2012, Fall 2010
Contact
Office
Professor of Computer Science
Cornell University
Bowers, Room 475
Ithaca, NY 14853-7501
kilian () cornell.edu
Phone
(607) 255 4845