Volodymyr Kuleshov

Joan Eliasoph, M.D. Assistant Professor
Department of Computer Science
Cornell Tech and Cornell University

My research focuses on machine learning and its applications in science, health, and sustainability. It involves two high-level directions:

Core research in machine learning, specifically: generative models, probabilistic methods, diffusion-based language models, decision-making under uncertainty. ICML18 NeurIPS24 ICLR25 NeurIPS25
The development of machine learning-based technologies that improve human and environmental health. Previous projects focused on genome sequencing, machine reading, and reducing food waste. Nature Biotech. 14 Nature Medicine 19 PNAS 2025

I am involved in commercializing my research on diffusion language models via Inception. Previously, I co-founded Afresh, a startup that uses AI to significantly drive down food waste—a major environmental problem. Afresh is now deployed in about 10% of US supermarkets. My earlier work on genome sequencing was commercialized by the Stanford spin-off Moleculo, and became part of Illumina's genome phasing service.

I obtained my PhD from Stanford, where I was the recipient of the Arthur Samuel Best Thesis Award. I worked with Stefano Ermon, Serafim Batzoglou, Michael Snyder, Christopher Re, and Percy Liang.

Teaching

Cornell and Cornell Tech

At Cornell, I teach an introductory graduate-level course on machine learning, and a PhD-level course on deep generative models.

CS 5785: Applied Machine Learning
Introductory course that covers both modern and classical algorithms: SVMs, deep learning, boosting, GMMs, etc.

Fall 2025 Fall 2024 Fall 2023 Fall 2022 Fall 2021 Fall 2020

CS 6785: Advanced Topics in Machine Learning: Deep Probabilistic and Generative Models.
Advanced course on variational auto-encoders, generative adversarial networks, probabilistic deep learning, etc.

Spring 2024 Spring 2023 Spring 2022 Spring 2021 Spring 2020

Open Online Courses

I am also the creator of several online courses based on materials I co-authored at Stanford and Cornell. My videos and materials have received 150,000 YouTube views and over 450,000 website visits.

Applied Machine Learning
Overview of modern and classical methods, with a focus on applications, implementation, and iterative development.
Based on Cornell CS 5785.

Website Lecture Videos Lecture Notes Slides and Notebooks

Probabilistic Graphical Models
Graphical representations of probabilities (Bayes Nets, Markov Fields), inference algorithms (variational, MCMC), and more.
Lecture notes for Stanford CS 228, with Stefano Ermon.

Lecture Notes Open Source Repository

Deep Generative Models
Foundations of generative AI algorithms. Overview of main generative modeling families: VAEs, GANs, flows, diffusion models, etc.

Website (Under Construction) Lecture Videos

Selected Awards

Google Research Scholar Award, 2025
Outstanding Paper Award, EMNLP 2023
NIH MIRA Award, 2023
NSF CAREER Award, 2022
Best Paper Honorable Mention, AI for Climate Change at ICML 2019
Arthur Samuel Best Thesis Award in Computer Science, Stanford, 2018

Students and Student Collaborators

PhD Students

Edgar Marroquin
Shachi Deshpande
Yair Schiff
Marianne Arriola
Guanghan Wang
Jacqueline Maasch (co-advised with Fei Wang)

Student Collaborators

Yingheng Wang (CS PhD at Cornell, with Carla Gomes)
Charlie Marx (CS PhD at Stanford, with Stefano Ermon)
Gilad Turok (1st year CS PhD at Cornell Tech)

Alumni and Former Advisees

Aaron Gokaslan (Cornell Tech CS PhD, now at Mohamed bin Zayed University of Artificial Intelligence)
Shachi Deshpande (Cornell Tech CS PhD, now at Microsoft)
Hangyu Zhou (Cornell CS Masters, now PhD Student at Georgia Tech)
Phillip Si (Cornell CS Undergrad, now PhD Student at Georgia Tech)
Hongjun Wu (Masters at Cornell Tech, now founder at stealth startup)
Allan Bishop (Cornell CS masters, now AI Engineer at Bloomberg)
Yong Huang (Cornell Tech CS masters, now PhD Student at UCI)
Evgenii Nikishin (Cornell Tech ORIE masters, now PhD Student at University of Montreal and MILA)

Papers

Selected Papers

Block diffusion: Interpolating between autoregressive and diffusion language models.
Marianne Arriola, Aaron Gokaslan, Justin T Chiu, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Subham Sekhar Sahoo, Volodymyr Kuleshov
International Conference on Learning Representations, 2025 (Oral)

Article Website Code Models

Simple Guidance Mechanisms for Discrete Diffusion Models.
Yair Schiff, Subham Sekhar Sahoo, Hao Phung, Guanghan Wang, Sam Boshar, Hugo Dalla-torre, Bernardo P de Almeida, Alexander Rush, Thomas Pierrot, Volodymyr Kuleshov
International Conference on Learning Representations, 2025

Article Website Code Models Video

Simple and Effective Masked Diffusion Language Models.
Subham Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin Chiu, Alexander Rush, Volodymyr Kuleshov
Neural Information Processing Systems, 2024

Article Website Code Models Video Colab

Cross-Species Modeling of Plant Genomes at Single-Nucleotide Resolution Using a Pretrained DNA Language Model.
Jingjing Zhai, Aaron Gokaslan, Yair Schiff, Ana Berthel, Zong-Yan Liu, Wei-Yun Lai, Zachary Miller, Armin Scheben, Michelle Stitzer, Cinta Romay, Edward Buckler, Volodymyr Kuleshov
Proceedings of the National Academy of Sciences, 2025

Article

2025

Probabilistic Graphical Models: A Concise Tutorial
Jacqueline Maasch, Willie Neiswanger, Stefano Ermon, Volodymyr Kuleshov

Article Lecture Notes Open Source Repository

Encoder-Decoder Block Diffusion Language Models for Efficient Training and Inference.
Marianne Arriola, Yair Schiff, Hao Phung, Aaron Gokaslan, Volodymyr Kuleshov
Neural Information Processing Systems, 2025

Article

Remasking Discrete Diffusion Models with Inference-Time Scaling.
Guanghan Wang, Yair Schiff, Subham Sekhar Sahoo, Volodymyr Kuleshov
Neural Information Processing Systems, 2025

Article Website Code Video Colab

The Diffusion Duality
Subham Sekhar Sahoo, Justin Deschenaux, Aaron Gokaslan, Guanghan Wang, Justin Chiu, Volodymyr Kuleshov
International Conference on Machine Learning, 2025

Article

Calibrated Regression Against An Adversary Without Regret
Shachi Deshpande, Charles Marx, Volodymyr Kuleshov
Uncertainty in Artificial Intelligence, 2025

Article

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models.
Marianne Arriola, Aaron Gokaslan, Justin T Chiu, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Subham Sekhar Sahoo, Volodymyr Kuleshov
International Conference on Learning Representations, 2025 (Oral)

Article Website Code Models

Denoising Diffusion Variational Inference: Diffusion Models as Expressive Variational Posteriors
Wasu (Top) Piriyakulkij, Yingheng Wang, Volodymyr Kuleshov
Association for the Advancement of Artificial Intelligence, 2025

Article

Calibrated Probabilistic Forecasts for Arbitrary Sequences
Charles Marx, Volodymyr Kuleshov, Stefano Ermon
Transactions on Machine Learning Research, 2025

Article

2024

The GAN is dead; long live the GAN! A Modern Baseline GAN
Nick Huang, Aaron Gokaslan, Volodymyr Kuleshov, James Tompkin
Neural Information Processing Systems, 2024

Article

Diffusion Models With Learned Adaptive Noise.
Subham Sekhar Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov.
Neural Information Processing Systems, 2024 (Spotlight)

Article Website Code

CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images.
Aaron Gokaslan, A Feder Cooper, Jasmine Collins, Landan Seguin, Austin Jacobson, Mihir Patel, Jonathan Frankle, Cory Stephenson, Volodymyr Kuleshov
Computer Vision and Pattern Recognition, 2024

Article Code

Caduceus: Bi-directional equivariant long-range DNA sequence modeling.
Yair Schiff, Chia-Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov
International Conference on Machine Learning, 2024

Article Blog Post Code Models

QuIP#: Even better LLM quantization with Hadamard incoherence and lattice codebooks.
Albert Tseng, Jerry Chee, Qingyao Sun, Volodymyr Kuleshov, Christopher De Sa
International Conference on Machine Learning, 2024

Article Blog Post Code

DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems.
Yair Schiff, Zhong Yi Wan, Jeffrey B Parker, Stephan Hoyer, Volodymyr Kuleshov, Fei Sha, Leonardo Zepeda-Núñez
International Conference on Machine Learning, 2024

Article

Local Discovery by Partitioning: Polynomial-Time Causal Discovery Around Exposure-Outcome Pairs.
Jacqueline Maasch, Weishen Pan, Shantanu Gupta, Volodymyr Kuleshov, Kyra Gan, Fei Wang
Uncertainty in Artificial Intelligence, 2024

Article

Calibrated and Conformal Propensity Scores for Causal Effect Estimation.
Shachi Deshpande, Charles Marx, Volodymyr Kuleshov
Uncertainty in Artificial Intelligence, 2024

Article

Online Calibrated and Conformal Prediction Improves Bayesian Optimization.
Shachi Deshpande, Charles Marx, Volodymyr Kuleshov
Artificial Intelligence and Statistics, 2024

Article

2023

ModuLoRA: Finetuning 2-bit LLMs on Consumer GPUs by Integrating with Modular Quantizers.
Junjie Yin, Jiahao Dong, Yingheng Wang, Christopher De Sa, Volodymyr Kuleshov.
Transactions on Machine Learning Research, 2023 (Featured Paper)
Selected for presentation at ICLR 2024

Article System Blog Post

Text embeddings reveal (almost) as much as text.
Jack Morris, Volodymyr Kuleshov, Vitaly Shmatikov, Sasha Rush.
Empirical Methods in Natural Language Processing, 2023 (Outstanding Paper)

Article

QuIP: 2-bit quantization of large language models with guarantees.
Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Christopher De Sa.
Neural Information Processing Systems, 2023 (Spotlight)

Article

InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models.
Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov.
International Conference on Machine Learning, 2023

Article

Semi-Autoregressive Energy Flows: Exploring Determinant-Free Training of Normalizing Flows.
Philip Si, Zeyi Chen, Subham Sahoo, Yair Schiff, Volodymyr Kuleshov.
International Conference on Machine Learning, 2023

Article

Regularized Data Programming with Automated Bayesian Prior Selection.
Jacqueline Maasch, Hao Zhang, Qian Yang, Fei Wang, Volodymyr Kuleshov.
ICML 2023 Workshop on Structured Probabilistic Inference and Generative Modeling, 2023

Article

Semi Parametric Inducing Point Networks and Neural Processes.
Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian Deng, Mert R. Sabuncu, Volodymyr Kuleshov.
International Conference on Learning Representations, 2023

Article

Backpropagation through Combinatorial Algorithms: Identity with Projection Works.
Subham Sahoo, Anselm Paulus, Marin Vlastelica, Vít Musil, Volodymyr Kuleshov, Georg Martius.
International Conference on Learning Representations, 2023

Article

Harnessing Biomedical Literature to Calibrate Clinicians’ Trust in AI Decision Support Systems.
Qian Yang, Yuexing Hao, Kexin Quan, Stephen Yang, Yiran Zhao, Volodymyr Kuleshov, Fei Wang
Conference on Human Factors in Computing Systems, 2023

Article

2022

Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies.
Shachi Deshpande, Kaiwen Wang, Dhruv Sreenivas, Zheng Li, Volodymyr Kuleshov.
Neural Information Processing Systems, 2022

Article

Model Criticism for Long-Form Text Generation.
Yuntian Deng, Volodymyr Kuleshov, Sasha Rush.
Empirical Methods in Natural Language Processing, 2022

Article

Calibrated and Sharp Uncertainties in Deep Learning via Density Estimation.
Volodymyr Kuleshov and Shachi Deshpande.
International Conference on Machine Learning, 2022

Article

Autoregressive Quantile Flows for Predictive Uncertainty Estimation.
Philip Si, Allan Bishop, and Volodymyr Kuleshov.
International Conference on Learning Representations, 2022 (Spotlight)

Article

Pre-2022 (Machine Learning)

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.
Sawyer Birnbaum*, Volodymyr Kuleshov*, Zayd Enam, Pang Wei Koh, Stefano Ermon.
Neural Information Processing Systems, 2019

Article Code

Calibrated Model-Based Deep Reinforcement Learning.
Ali Malik*, Volodymyr Kuleshov*, Jiaming Song, Danny Nemer, Harlan Seymour, Stefano Ermon.
International Conference on Machine Learning, 2019 (Oral)

Article

Adversarial examples for natural language classification problems.
Volodymyr Kuleshov, Evgenii Nikishin, Shantanu Thakoor, Tingfung Lau, Stefano Ermon.
Manuscript, 2018

Article

Accurate uncertainties for deep learning using calibrated regression.
Volodymyr Kuleshov, Nathan Fenner, Stefano Ermon.
International Conference on Machine Learning, 2018

Article

Adversarial constraint learning for structured prediction.
Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov, Stefano Ermon.
International Joint Conference on Artificial Intelligence, 2018

Article

Learning with weak supervision from physics and data-driven constraints.
Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov, Stefano Ermon.
AI Magazine, 2018

Article

Neural variational inference and learning in undirected graphical models.
Volodymyr Kuleshov and Stefano Ermon.
Neural Information Processing Systems, 2017

Article Code

Deep hybrid models: bridging discriminative and generative approaches.
Volodymyr Kuleshov and Stefano Ermon.
Uncertainty in Artificial Intelligence, 2017 (Oral)

Article Slides Poster Code

Audio super-resolution with neural networks.
Volodymyr Kuleshov and Stefano Ermon.
International Conference on Learning Representations (Workshop track), 2017

Article Website Code

Estimating uncertainty online against an adversary.
Volodymyr Kuleshov and Stefano Ermon.
Association for the Advancement of Artificial Intelligence, 2017 (Oral)

Article Poster Code

Calibrated structured prediction.
Volodymyr Kuleshov and Percy Liang.
Neural Information Processing Systems, 2015

Article Poster Codalab

Tensor factorization via matrix factorization.
Volodymyr Kuleshov*, Arun Chaganty*, Percy Liang.
Artificial Intelligence and Statistics, 2015 (Oral)

Article Slides Code Codalab

Inverse game theory: learning utilities in succinct games.
Volodymyr Kuleshov and Okke Schrijvers.
Web and Internet Economics, 2015
World Congress of the Game Theory Society (Contributed Talk), 2016

Article

Algorithms for multi-armed bandit problems.
Volodymyr Kuleshov and Doina Precup.
Manuscript, 2014

Article

Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration.
Volodymyr Kuleshov.
International Conference on Machine Learning, 2013

Article Code Poster Slides

On the efficiency of the simplest market mechanisms.
Volodymyr Kuleshov and Gordon Wilfong.
Web and Internet Economics, 2012

Article Slides

On the efficiency of markets with two-sided proportional allocation mechanisms.
Volodymyr Kuleshov and Adrian Vetta.
Algorithmic Game Theory, 2010

Article Poster

Pre-2022 (Applications in Science, Health, Sustainability)

A machine-compiled database of genome-wide association studies.
Volodymyr Kuleshov, Jialin Ding, Christopher Vo, Braden Hancock, Alexander Ratner, Yang Li, Christopher Re, Serafim Batzoglou, Michael Snyder
Nature Communications, 2019
Intelligent Systems for Molecular Biology (Bio-Ontologies Track), 2017

Article Code

A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar, Volodymyr Kuleshov, Mark DePristo, Katherine Chou, Claire Cui, Greg Corrado, Sebastian Thrun, Jeff Dean
Nature Medicine, 2019

Article

Towards a sustainable food supply chain powered by artificial intelligence.
Volodymyr Kuleshov, Harlan Seymour, Danny Nemer, Sean Meador, Nathan Fenner, and Matthew Schwartz.
AI for Climate Change Workshop at ICML, 2019, Honorable Mention for Best Presentation.

Article Video

Lightweight metagenomic species deconvolution using locality-sensitive hashing and Bayesian mixture models.
Victoria Popic, Volodymyr Kuleshov, Serafim Batzoglou, Michael Snyder.
Research in Computational Molecular Biology, 2017

Article

Genome assembly from synthetic long read clouds.
Volodymyr Kuleshov, Serafim Batzoglou, Michael Snyder.
Intelligent Systems for Molecular Biology, 2016

Article Code

High-resolution structure of the human microbiome revealed with synthetic long reads.
Volodymyr Kuleshov, Chao Jiang, Wenyu Zhou, Fereshteh Jahanbani, Serafim Batzoglou, Michael Snyder.
Nature Biotechnology, 2015 (Advance Online Publication)

Article Code

Probabilistic single-individual haplotyping.
Volodymyr Kuleshov.
European Conference on Computational Biology, 2014.

Article Code

Whole-genome haplotyping using long reads and statistical methods.
Volodymyr Kuleshov, Dan Xie, Rui Chen, Dmitry Pushkarev, et al.
Nature Biotechnology, 2014

Article Code Tutorial Slides Video

Contact

Volodymyr Kuleshov
Bloomberg Center, Room 366
2 West Loop Road
New York, NY 10044
E: [last name]@cornell.edu