Volodymyr Kuleshov

Assistant Professor
Department of Computer Science
Cornell Tech


My research focuses on machine learning and its applications in science, health, and sustainability. Some of my projects/interests include:

  • New deep learning and deep generative models for massive scientific datasets that help accelerate research, particularly in the biological sciences and genomics. Nature Medicine 19
  • Machine reading systems that help make scientific knowledge easily accessible to researchers and clinicians. Nature Comm. 19 GitHub
  • New genome sequencing technologies that combine existing wetlab techniques with new statistical methods, thus making them significantly more affordable and accurate. Nature Biotech. 14 Nature Biotech. 16

These projects motivate core machine learning research in deep learning, probabilistic methods, approximate inference, and decision-making under uncertainty. NeurIPS17 ICML18 ICML19

I am also involved in commercializing my research. I am the co-founder and Chief Technologist at Afresh, a startup that uses AI to automate operations in hundreds of grocery stores across the US, significantly driving down food waste, a major environmental problem. In 2012-2013, I took a year off to work as the first engineer at the Stanford spin-out Moleculo, where I developed machine learning algorithms that now power Illumina's genome phasing service.

I obtained my PhD from Stanford, working with Stefano Ermon, Serafim Batzoglou, Michael Snyder, Christopher Re, and Percy Liang, and I was the recipient of the Arthur Samuel Best Thesis Award.

Teaching


Cornell and Cornell Tech

At Cornell, I teach an introductory graduate-level course on machine learning and a PhD-level course on deep generative models.

CS 5785: Applied Machine Learning
Introductory course that covers both modern and classical algorithms: SVMs, deep learning, boosting, GMMs, etc.


CS 6785: Advanced Topics in Machine Learning: Deep Probabilistic and Generative Models.
Advanced course on variational auto-encoders, generative adversarial networks, probabilistic deep learning, etc.


Open Online Courses

I am also the creator of several online courses based on materials I co-authored at Stanford and Cornell. My videos and materials have received 45,000 YouTube views and over 400,000 website visits.

Applied Machine Learning
Overview of modern and classical methods, with a focus on applications, implementation, and iterative development.
Based on Cornell CS 5785.


Probabilistic Graphical Models
Graphical representations of probabilities (Bayes Nets, Markov Fields), inference algorithms (variational, MCMC), and more.
Lecture notes for Stanford CS 228, with Stefano Ermon.


Deep Generative Models
Coming in early 2022. Stay tuned!

Students and Student Collaborators


PhD Students

  • Edgar Marroquin
  • Shachi Deshpande

Student Collaborators

  • Phillip Si (Undergraduate at Cornell)
  • Ali Kayyal (Undergraduate at the Technion)
  • Zheng Li (CS PhD at Cornell)
  • Richa Rastogi (Rotation CS PhD at Cornell)
  • Jacqueline Maasch (Rotation CS PhD at Cornell Tech)
  • Yuntian Deng (CS PhD at Harvard)

Alumni and Former Advisees

  • Allan Bishop (now AI Engineer at Bloomberg)
  • Yong Huang (now PhD Student at UCI)
  • Evgenii Nikishin (now PhD Student at University of Montreal and MILA)
  • Ali Malik (Stanford Undergrad, now PhD Student at Stanford)
  • Jialin Ding (Stanford Undergrad, now PhD Student at MIT)
  • Hongyu Ren (Tsinghua Undergrad, now PhD Student at Stanford)
  • Tingfung Lau (Tsinghua Undergrad, now Masters Student at CMU)
  • Sawyer Birnbaum (Stanford Undergrad, now ML Product Engineer at Afresh)
  • Shantanu Thakoor (Stanford Masters, now Research Engineer at DeepMind)

Papers


Selected Papers

A machine-compiled database of genome-wide association studies.
Volodymyr Kuleshov, Jialin Ding, Christopher Vo, Braden Hancock, Alexander Ratner, Yang Li, Christopher Re, Serafim Batzoglou, Michael Snyder
Nature Communications, 2019
Intelligent Systems for Molecular Biology (Bio-Ontologies Track), 2017


A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar, Volodymyr Kuleshov, Mark DePristo, Katherine Chou, Claire Cui, Greg Corrado, Sebastian Thrun, Jeff Dean
Nature Medicine, 2019


High-resolution structure of the human microbiome revealed with synthetic long reads.
Volodymyr Kuleshov, Chao Jiang, Wenyu Zhou, Fereshteh Jahanbani, Serafim Batzoglou, Michael Snyder.
Nature Biotechnology, 2016


Whole-genome haplotyping using long reads and statistical methods.
Volodymyr Kuleshov, Dan Xie, Rui Chen, Dmitry Pushkarev, et al.
Nature Biotechnology, 2014


Machine Learning

Calibrated and Sharp Uncertainties in Deep Learning via Simple Density Estimation.
Volodymyr Kuleshov and Shachi Deshpande.
Manuscript, 2021


Calibration Improves Bayesian Optimization.
Shachi Deshpande and Volodymyr Kuleshov.
Manuscript, 2021


Understanding Adversarial Examples in Discrete Input Spaces, With Applications to Computational Biology.
Volodymyr Kuleshov, Evgenii Nikishin, Shantanu Thakoor, Tingfung Lau, Stefano Ermon.
Manuscript, 2021


Autoregressive Quantile Flows for Predictive Uncertainty Estimation.
Phillip Si, Allan Bishop, and Volodymyr Kuleshov.
Manuscript, 2021


Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.
Sawyer Birnbaum*, Volodymyr Kuleshov*, Zayd Enam, Pang Wei Koh, Stefano Ermon.
Neural Information Processing Systems, 2019


Calibrated Model-Based Deep Reinforcement Learning.
Ali Malik*, Volodymyr Kuleshov*, Jiaming Song, Danny Nemer, Harlan Seymour, Stefano Ermon.
International Conference on Machine Learning, 2019


Accurate uncertainties for deep learning using calibrated regression.
Volodymyr Kuleshov, Nathan Fenner, Stefano Ermon.
International Conference on Machine Learning, 2018


Adversarial constraint learning for structured prediction.
Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov, Stefano Ermon.
International Joint Conference on Artificial Intelligence, 2018


Learning with weak supervision from physics and data-driven constraints.
Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov, Stefano Ermon.
AI Magazine, 2018


Neural variational inference and learning in undirected graphical models.
Volodymyr Kuleshov and Stefano Ermon.
Neural Information Processing Systems, 2017


Deep hybrid models: bridging discriminative and generative approaches.
Volodymyr Kuleshov and Stefano Ermon.
Uncertainty in Artificial Intelligence, 2017


Audio super-resolution with neural networks.
Volodymyr Kuleshov and Stefano Ermon.
International Conference on Learning Representations (Workshop track), 2017


Estimating uncertainty online against an adversary.
Volodymyr Kuleshov and Stefano Ermon.
Association for the Advancement of Artificial Intelligence, 2017


Calibrated structured prediction.
Volodymyr Kuleshov and Percy Liang.
Neural Information Processing Systems, 2015


Tensor factorization via matrix factorization.
Volodymyr Kuleshov*, Arun Chaganty*, Percy Liang.
Artificial Intelligence and Statistics, 2015


Inverse game theory: learning utilities in succinct games.
Volodymyr Kuleshov and Okke Schrijvers.
Web and Internet Economics, 2015
World Congress of the Game Theory Society (Contributed Talk), 2016


Algorithms for multi-armed bandit problems.
Volodymyr Kuleshov and Doina Precup.
Manuscript, 2014


Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration.
Volodymyr Kuleshov.
International Conference on Machine Learning, 2013


On the efficiency of the simplest market mechanisms.
Volodymyr Kuleshov and Gordon Wilfong.
Web and Internet Economics, 2012


On the efficiency of markets with two-sided proportional allocation mechanisms.
Volodymyr Kuleshov and Adrian Vetta.
Algorithmic Game Theory, 2010



Applications in Genomics, Health, Sustainability

Clinical Evidence Engine: A Domain-Agnostic Decision Support Infrastructure.
Bojian Hou, Hao Zhang, Gur Ladizhinsky, Stephen Yang, Volodymyr Kuleshov, Fei Wang, Qian Yang
Manuscript, 2021


A Multi-Modal and Multitask Benchmark in the Clinical Domain.
Edgar Marroquin, Yong Huang, Volodymyr Kuleshov
Manuscript, 2020


A machine-compiled database of genome-wide association studies.
Volodymyr Kuleshov, Jialin Ding, Christopher Vo, Braden Hancock, Alexander Ratner, Yang Li, Christopher Re, Serafim Batzoglou, Michael Snyder
Nature Communications, 2019
Intelligent Systems for Molecular Biology (Bio-Ontologies Track), 2017


A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar, Volodymyr Kuleshov, Mark DePristo, Katherine Chou, Claire Cui, Greg Corrado, Sebastian Thrun, Jeff Dean
Nature Medicine, 2019


Towards a sustainable food supply chain powered by artificial intelligence.
Volodymyr Kuleshov, Harlan Seymour, Danny Nemer, Sean Meador, Nathan Fenner, and Matthew Schwartz.
AI for Climate Change Workshop at ICML, 2019, Honorable Mention for Best Presentation.


Lightweight metagenomic species deconvolution using locality-sensitive hashing and Bayesian mixture models.
Victoria Popic, Volodymyr Kuleshov, Serafim Batzoglou, Michael Snyder.
Research in Computational Molecular Biology, 2017


Genome assembly from synthetic long read clouds.
Volodymyr Kuleshov, Serafim Batzoglou, Michael Snyder.
Intelligent Systems for Molecular Biology, 2016


High-resolution structure of the human microbiome revealed with synthetic long reads.
Volodymyr Kuleshov, Chao Jiang, Wenyu Zhou, Fereshteh Jahanbani, Serafim Batzoglou, Michael Snyder.
Nature Biotechnology, 2015 (Advance Online Publication)


Probabilistic single-individual haplotyping.
Volodymyr Kuleshov.
European Conference on Computational Biology, 2014


Whole-genome haplotyping using long reads and statistical methods.
Volodymyr Kuleshov, Dan Xie, Rui Chen, Dmitry Pushkarev, et al.
Nature Biotechnology, 2014


Contact


Volodymyr Kuleshov
Bloomberg Center, Room 366
2 West Loop Road
New York, NY 10044
E: [last name]@cornell.edu