Volodymyr Kuleshov

Joan Eliasoph, M.D. Assistant Professor
Department of Computer Science
Cornell Tech and Cornell University


My research focuses on machine learning and its applications in science, health, and sustainability. It involves two high-level directions:

I am also involved in commercializing my research. I co-founded Afresh, a startup that uses AI to significantly drive down food waste—a major environmental problem. Afresh is now deployed in about 10% of US supermarkets. My earlier work on genome sequencing was commercialized by the Stanford spin-off Moleculo, and became part of Illumina's genome phasing service.

I obtained my PhD from Stanford, where I was the recipient of the Arthur Samuel Best Thesis Award. I worked with Stefano Ermon, Serafim Batzoglou, Michael Snyder, Christopher Re, and Percy Liang.

Teaching


Cornell and Cornell Tech

At Cornell, I teach an introductory graduate-level course on machine learning, and a PhD-level course on deep generative models.

CS 5785: Applied Machine Learning
Introductory course that covers both modern and classical algorithms: SVMs, deep learning, boosting, GMMs, etc.


CS 6785: Advanced Topics in Machine Learning: Deep Probabilistic and Generative Models.
Advanced course on variational auto-encoders, generative adversarial networks, probabilistic deep learning, etc.


Open Online Courses

I am also the creator of several online courses based on materials I co-authored at Stanford and Cornell. My videos and materials have received 150,000 YouTube views and over 450,000 website visits.

Applied Machine Learning
Overview of modern and classical methods, with a focus on applications, implementation, and iterative development.
Based on Cornell CS 5785.


Probabilistic Graphical Models
Graphical representations of probabilities (Bayes Nets, Markov Fields), inference algorithms (variational, MCMC), and more.
Lecture notes for Stanford CS 228, with Stefano Ermon.


Deep Generative Models
Foundations of generative AI algorithms. Overview of main generative modeling families: VAEs, GANs, flows, diffusion models, etc.

Selected Awards


  • Outstanding Paper Award, EMNLP 2023
  • NIH MIRA Award, 2023
  • NSF CAREER Award, 2022
  • Cornell Initiative for Digital Agriculture Research Innovation Award, 2022
  • Best Paper Honorable Mention, AI for Climate Change at ICML 2019
  • Arthur Samuel Best Thesis Award in Computer Science, Stanford, 2018
  • Stanford Graduate Fellowship, 2016

Students and Student Collaborators


PhD Students

  • Edgar Marroquin
  • Shachi Deshpande
  • Yair Schiff
  • Subham Sahoo
  • Aaron Gokaslan
  • Jacqueline Maasch (co-advised with Fei Wang)

Student Collaborators

  • Yingheng Wang (CS PhD at Cornell, with Carla Gomes)
  • Marianne Arriola (1st year CS PhD at Cornell)
  • Charlie Marx (CS PhD at Stanford, with Stefano Ermon)

Alumni and Former Advisees

  • Allan Bishop (now AI Engineer at Bloomberg)
  • Yong Huang (now PhD Student at UCI)
  • Evgenii Nikishin (now PhD Student at University of Montreal and MILA)
  • Ali Malik (Stanford Undergrad, now PhD Student at Stanford)
  • Jialin Ding (Stanford Undergrad, now PhD Student at MIT)
  • Hongyu Ren (Tsinghua Undergrad, now PhD Student at Stanford)
  • Tingfung Lau (Tsinghua Undergrad, now Masters Student at CMU)
  • Sawyer Birnbaum (Stanford Undergrad, now ML Product Engineer at Afresh)
  • Shantanu Thakoor (Stanford Masters, now Research Engineer at DeepMind)
  • Phillip Si (Cornell Undergrad, now PhD Student at Georgia Tech)
  • Hongjun Wu (Masters at Cornell Tech, now founder at stealth startup)

Papers


Selected Papers

Caduceus: Bi-directional equivariant long-range DNA sequence modeling.
Yair Schiff, Chia-Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov
International Conference on Machine Learning, 2024


A machine-compiled database of genome-wide association studies.
Volodymyr Kuleshov, Jialin Ding, Christopher Vo, Braden Hancock, Alexander Ratner, Yang Li, Christopher Re, Serafim Batzoglou, Michael Snyder
Nature Communications, 2019
Intelligent Systems for Molecular Biology (Bio-Ontologies Track), 2017


A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar, Volodymyr Kuleshov, Mark DePristo, Katherine Chou, Claire Cui, Greg Corrado, Sebastian Thrun, Jeff Dean
Nature Medicine, 2019


Whole-genome haplotyping using long reads and statistical methods.
Volodymyr Kuleshov, Dan Xie, Rui Chen, Dmitry Pushkarev, et al.
Nature Biotechnology, 2014



Machine Learning

CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images.
Aaron Gokaslan, A Feder Cooper, Jasmine Collins, Landan Seguin, Austin Jacobson, Mihir Patel, Jonathan Frankle, Cory Stephenson, Volodymyr Kuleshov
Computer Vision and Pattern Recognition, 2024


Caduceus: Bi-directional equivariant long-range DNA sequence modeling.
Yair Schiff, Chia-Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov
International Conference on Machine Learning, 2024


QuIP#: Even better LLM quantization with Hadamard incoherence and lattice codebooks.
Albert Tseng, Jerry Chee, Qingyao Sun, Volodymyr Kuleshov, Christopher De Sa
International Conference on Machine Learning, 2024


DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems.
Yair Schiff, Zhong Yi Wan, Jeffrey B Parker, Stephan Hoyer, Volodymyr Kuleshov, Fei Sha, Leonardo Zepeda-Núñez
International Conference on Machine Learning, 2024


Local Discovery by Partitioning: Polynomial-Time Causal Discovery Around Exposure-Outcome Pairs.
Jacqueline Maasch, Weishen Pan, Shantanu Gupta, Volodymyr Kuleshov, Kyra Gan, Fei Wang
Uncertainty in Artificial Intelligence, 2024


Calibrated and Conformal Propensity Scores for Causal Effect Estimation.
Shachi Deshpande, Charles Marx, Volodymyr Kuleshov
Uncertainty in Artificial Intelligence, 2024


Online Calibrated and Conformal Prediction Improves Bayesian Optimization.
Shachi Deshpande, Charles Marx, Volodymyr Kuleshov
Artificial Intelligence and Statistics, 2024


ModuLoRA: Finetuning 2-bit LLMs on Consumer GPUs by Integrating with Modular Quantizers.
Junjie Yin, Jiahao Dong, Yingheng Wang, Christopher De Sa, Volodymyr Kuleshov.
Transactions on Machine Learning Research, 2023 (Featured Paper)
Selected for presentation at ICLR 2024


Text embeddings reveal (almost) as much as text.
Jack Morris, Volodymyr Kuleshov, Vitaly Shmatikov, Sasha Rush.
Empirical Methods in Natural Language Processing, 2023 (Outstanding Paper)


QuIP: 2-bit quantization of large language models with guarantees.
Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Christopher De Sa.
Neural Information Processing Systems, 2023 (Spotlight)


Diffusion Models With Learned Adaptive Noise.
Subham Sekhar Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov.
NeurIPS Workshop on Diffusion Models, 2023


InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models.
Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov.
International Conference on Machine Learning, 2023


Semi-Autoregressive Energy Flows: Exploring Determinant-Free Training of Normalizing Flows.
Phillip Si, Zeyi Chen, Subham Sahoo, Yair Schiff, Volodymyr Kuleshov.
International Conference on Machine Learning, 2023


Regularized Data Programming with Automated Bayesian Prior Selection.
Jacqueline Maasch, Hao Zhang, Qian Yang, Fei Wang, Volodymyr Kuleshov.
ICML 2023 Workshop on Structured Probabilistic Inference and Generative Modeling, 2023


Semi Parametric Inducing Point Networks and Neural Processes.
Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian Deng, Mert R. Sabuncu, Volodymyr Kuleshov.
International Conference on Learning Representations, 2023


Backpropagation through Combinatorial Algorithms: Identity with Projection Works.
Subham Sahoo, Anselm Paulus, Marin Vlastelica, Vít Musil, Volodymyr Kuleshov, Georg Martius.
International Conference on Learning Representations, 2023


Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies.
Shachi Deshpande, Kaiwen Wang, Dhruv Sreenivas, Zheng Li, Volodymyr Kuleshov.
Neural Information Processing Systems, 2022


Model Criticism for Long-Form Text Generation.
Yuntian Deng, Volodymyr Kuleshov, Sasha Rush.
Empirical Methods in Natural Language Processing, 2022


Calibrated and Sharp Uncertainties in Deep Learning via Density Estimation.
Volodymyr Kuleshov and Shachi Deshpande.
International Conference on Machine Learning, 2022


Autoregressive Quantile Flows for Predictive Uncertainty Estimation.
Phillip Si, Allan Bishop, and Volodymyr Kuleshov.
International Conference on Learning Representations, 2022 (Spotlight)


Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.
Sawyer Birnbaum*, Volodymyr Kuleshov*, Zayd Enam, Pang Wei Koh, Stefano Ermon.
Neural Information Processing Systems, 2019


Calibrated Model-Based Deep Reinforcement Learning.
Ali Malik*, Volodymyr Kuleshov*, Jiaming Song, Danny Nemer, Harlan Seymour, Stefano Ermon.
International Conference on Machine Learning, 2019 (Oral)


Adversarial examples for natural language classification problems.
Volodymyr Kuleshov, Evgenii Nikishin, Shantanu Thakoor, Tingfung Lau, Stefano Ermon.
Manuscript, 2018


Accurate uncertainties for deep learning using calibrated regression.
Volodymyr Kuleshov, Nathan Fenner, Stefano Ermon.
International Conference on Machine Learning, 2018


Adversarial constraint learning for structured prediction.
Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov, Stefano Ermon.
International Joint Conference on Artificial Intelligence, 2018


Learning with weak supervision from physics and data-driven constraints.
Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov, Stefano Ermon.
AI Magazine, 2018


Neural variational inference and learning in undirected graphical models.
Volodymyr Kuleshov and Stefano Ermon.
Neural Information Processing Systems, 2017


Deep hybrid models: bridging discriminative and generative approaches.
Volodymyr Kuleshov and Stefano Ermon.
Uncertainty in Artificial Intelligence, 2017 (Oral)


Audio super-resolution with neural networks.
Volodymyr Kuleshov and Stefano Ermon.
International Conference on Learning Representations (Workshop track), 2017


Estimating uncertainty online against an adversary.
Volodymyr Kuleshov and Stefano Ermon.
Association for the Advancement of Artificial Intelligence, 2017 (Oral)


Calibrated structured prediction.
Volodymyr Kuleshov and Percy Liang.
Neural Information Processing Systems, 2015


Tensor factorization via matrix factorization.
Volodymyr Kuleshov*, Arun Chaganty*, Percy Liang.
Artificial Intelligence and Statistics, 2015 (Oral)


Inverse game theory: learning utilities in succinct games.
Volodymyr Kuleshov and Okke Schrijvers.
Web and Internet Economics, 2015
World Congress of the Game Theory Society (Contributed Talk), 2016


Algorithms for multi-armed bandit problems.
Volodymyr Kuleshov and Doina Precup.
Manuscript, 2014


Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration.
Volodymyr Kuleshov.
International Conference on Machine Learning, 2013


On the efficiency of the simplest market mechanisms.
Volodymyr Kuleshov and Gordon Wilfong.
Web and Internet Economics, 2012


On the efficiency of markets with two-sided proportional allocation mechanisms.
Volodymyr Kuleshov and Adrian Vetta.
Algorithmic Game Theory, 2010



Applications in Science, Health, Sustainability

Harnessing Biomedical Literature to Calibrate Clinicians’ Trust in AI Decision Support Systems.
Qian Yang, Yuexing Hao, Kexin Quan, Stephen Yang, Yiran Zhao, Volodymyr Kuleshov, Fei Wang
Conference on Human Factors in Computing Systems, 2023


A Multi-Modal and Multitask Benchmark in the Clinical Domain.
Edgar Marroquin, Yong Huang, Volodymyr Kuleshov
Manuscript, 2020


A machine-compiled database of genome-wide association studies.
Volodymyr Kuleshov, Jialin Ding, Christopher Vo, Braden Hancock, Alexander Ratner, Yang Li, Christopher Re, Serafim Batzoglou, Michael Snyder
Nature Communications, 2019
Intelligent Systems for Molecular Biology (Bio-Ontologies Track), 2017


A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar, Volodymyr Kuleshov, Mark DePristo, Katherine Chou, Claire Cui, Greg Corrado, Sebastian Thrun, Jeff Dean
Nature Medicine, 2019


Towards a sustainable food supply chain powered by artificial intelligence.
Volodymyr Kuleshov, Harlan Seymour, Danny Nemer, Sean Meador, Nathan Fenner, and Matthew Schwartz.
AI for Climate Change Workshop at ICML, 2019 (Honorable Mention for Best Presentation)


Lightweight metagenomic species deconvolution using locality-sensitive hashing and Bayesian mixture models.
Victoria Popic, Volodymyr Kuleshov, Serafim Batzoglou, Michael Snyder.
Research in Computational Molecular Biology, 2017


Genome assembly from synthetic long read clouds.
Volodymyr Kuleshov, Serafim Batzoglou, Michael Snyder.
Intelligent Systems for Molecular Biology, 2016


High-resolution structure of the human microbiome revealed with synthetic long reads.
Volodymyr Kuleshov, Chao Jiang, Wenyu Zhou, Fereshteh Jahanbani, Serafim Batzoglou, Michael Snyder.
Nature Biotechnology, 2015 (Advance Online Publication)


Probabilistic single-individual haplotyping.
Volodymyr Kuleshov.
European Conference on Computational Biology, 2014


Whole-genome haplotyping using long reads and statistical methods.
Volodymyr Kuleshov, Dan Xie, Rui Chen, Dmitry Pushkarev, et al.
Nature Biotechnology, 2014


Contact


Volodymyr Kuleshov
Bloomberg Center, Room 366
2 West Loop Road
New York, NY 10044
E: [last name]@cornell.edu