# CS 6241

## Numerical Methods for Data Science

Prof: David Bindel

Email: bindel@cornell.edu

OH: Mon 2:00-4:00, Fri 12:30-1:30 and 2:00-3:00, or by appt

Scheduler link

## News

**2021-03-15:**
Project proposal prompt is posted (due March 22).

**2021-03-01:**
Reaction paper prompt is posted (due March 15).

**2021-03-01:**
Office hours for today (March 1) are canceled.

**2021-02-25:**
No class today and no OH tomorrow; reaction paper guidelines will be posted before the weekend.

**2021-02-18:**
Office hours for tomorrow (Feb 19) are 11:30-1:30.

## Overview

In this class, we treat numerical methods underlying a variety of modern machine learning and data analysis techniques. The course consists of six units of roughly two weeks each:

- **Least squares and regression**: direct and iterative linear and nonlinear least squares solvers; direct randomized approximations and preconditioning; Newton, Gauss-Newton, and IRLS methods for nonlinear problems; regularization; robust regression.
- **Matrix and tensor data decompositions**: direct methods, iterations, and randomized approximations for SVD and related decomposition methods; nonlinear dimensionality reduction; non-negative matrix factorization; tensor decompositions.
- **Low-dimensional structure in function approximation**: active subspace / sloppy model approaches to identifying the most relevant parameters in high-dimensional input spaces and model reduction approaches to identifying low-dimensional structure in high-dimensional output spaces.
- **Kernel interpolation and Gaussian processes**: statistical and deterministic interpretations and error analysis for kernel interpolation; methods for dealing with ill-conditioned kernel systems; and methods for scalable inference and kernel hyper-parameter learning.
- **Numerical methods for graph data**: implication of different graph structures for linear solvers; graph-based coordinate embedding methods; analysis methods based on matrix functions; computation of centrality measures; and spectral methods for graph partitioning and clustering.
- **Learning models of dynamics**: system identification and auto-regressive model fitting; Koopman theory; dynamic mode decomposition.

See the syllabus for more information on course logistics.