Week |
Date |
Notes, Readings, and HW |
1 |
Tue, Aug 22 |
Introduction
|
|
Thu, Aug 24 |
Optimization and linear algebra refresher
ESL, sec 3.1-3.2
ALA, sec 3.2-3.2
|
2 |
Tue, Aug 29 |
Regularized linear least squares
|
|
Thu, Aug 31 |
Sparse least squares and iterations
|
3 |
Tue, Sep 05 |
Stochastic gradients, scaling, and Newton
|
|
Thu, Sep 07 |
Randomized numerical linear algebra
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, Halko, Martinsson, and Tropp, SIREV, 2011.
LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems, Meng, Saunders, Mahoney, SISC 2014
Sec 5, Lectures on Randomized NLA, Drineas and Mahoney
|
4 |
Tue, Sep 12 |
Latent factor models
|
|
Thu, Sep 14 |
SVD and other low-rank decompositions
On the relationships between SVD, KLT, and PCA, Gerbrands, Pattern Recognition, 1981
Trace optimization and eigenproblems in dimension reduction methods, Kokiopoulou, Chen, and Saad, NLAA 2011
On the compression of low rank matrices, Cheng, Gimbutas, Martinsson, and Rokhlin, SISC 2005
CUR matrix decompositions for improved data analysis, Mahoney and Drineas, PNAS 2009
|
5 |
Tue, Sep 19 |
Non-negative matrix factorization
Nonnegative Matrix Factorization (Gillis), Chapter 1
The Whys and Hows of NMF, Gillis
Learning the parts of objects by non-negative matrix factorization, Lee and Seung, Nature, 1999
Computing a nonnegative matrix factorization – provably, Arora, Ge, Kannan, and Moitra, SICOMP, 2016
When Does NMF Give a Correct Decomposition into Parts?, Donoho and Stodden, NeurIPS, 2003
Algorithms for NMF and NTFs: a unified view based on block coordinate descent framework, Kim, He, and Park, J. Glob. Optim., 2014
|
|
Thu, Sep 21 |
Tensor basics, HOSVD, Tucker, and ALS
Tensor Decompositions and Applications, Kolda and Bader, SIREV, 2009
Tensor Computations and Applications in Data Mining, Eldén, slides from SIAM AM 2008
From Matrix to Tensor, Van Loan, slides from Cornell CS colloquium
Tensors for Data Mining and Data Fusion, Papalexakis, Faloutsos, and Sidiropoulos, ACM TIST, 2016
|
6 |
Tue, Sep 26 |
CP decomposition and algorithms, CUR and tensor trains
Tensor Decompositions and Applications, Kolda and Bader, SIREV, 2009
Tensor Decompositions: A Mathematical Tool for Data Analysis, Kolda, slides from JMM 2018
Epsilon-ALS for Orthogonal Low-Rank Tensor Approximation, Yang, SIMAX 2020
Low Multilinear Rank Approximations of Tensors, Che, Wei, and Yan, SIMAX 2020
Low-Rank Approximation in the Frobenius Norm by Column and Row Subset Selection, Cortinovis and Kressner, SIMAX 2020
Stochastic Gradients for Large-Scale Tensor Decomposition, Kolda and Hong, SIMODS 2020
|
|
Thu, Sep 28 |
Nonlinear dimensionality reduction
A global geometric framework for nonlinear dimensionality reduction, Tenenbaum, de Silva, and Langford, Science 2000
Nonlinear dimensionality reduction by locally linear embedding, Roweis and Saul, Science 2000
Visualizing Data using t-SNE, van der Maaten and Hinton, JMLR 2008
Dimensionality Reduction: A Comparative Review, van der Maaten, Postma, and van den Herik, Tech report 2009
Dimension Reduction: A Guided Tour, Burges, FTML 2009
Global versus local methods in nonlinear dimensionality reduction, de Silva and Tenenbaum, NeurIPS 2003
Large-scale SVD and manifold learning, Talwalkar, Kumar, Mohri, and Rowley, JMLR 2013
Accelerating t-SNE using tree-based algorithms, van der Maaten, JMLR 2014
|
7 |
Tue, Oct 03 |
Function approximation fundamentals
Nonlinear Approximation, DeVore, Acta Numerica 1998 - long, but please do read sections 1 and 9 at least
Approximation Theory and Approximation Practice, Trefethen, SIAM 2019 - a beautiful text, focused on polynomial and rational approximation in 1D; useful to skim, but not assigned reading
A Course in Approximation Theory, Cheney and Light, AMS 2009 - again, not considered assigned reading (unless you want to do DNN approximation, in which case please read ch 23-25)
|
|
Thu, Oct 05 |
Low-dim structure in function approximation
Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies, Constantine, SIAM 2015
Active Subspace Methods in Theory and Practice: Applications to Kriging Surfaces, Constantine, Dow, and Wang, SISC 2014
Active Manifolds: a non-linear analogue to Active Subspaces, Bridges, Gruber, Felder, Verma, Hoff, ICML 2019
Constrained global optimization of functions with low effective dimensionality using multiple random embeddings, Cartis, Massart, Otemissov, arXiv 2020
|
8 |
Tue, Oct 10 |
Fall break |
|
Thu, Oct 12 |
Low-dim structure in function approximation
Approximation of high-dimensional parametric PDEs, Cohen, DeVore, Acta Numerica 2015
Model reduction via proper orthogonal decomposition, Pinnau, in Model Order Reduction: Theory, Research Aspects and Applications, Springer 2008
Nonlinear model reduction via discrete empirical interpolation, Chaturantabut, Sorensen, SISC 2010
|
9 |
Tue, Oct 17 |
Many interpretations of kernels
ESL, sec 14.5.4
Kernel techniques: From machine learning to meshless methods, Schaback and Wendland, Acta Numerica 2006
Gaussian Processes for Machine Learning, Rasmussen and Williams, 2006 - read Ch 1
Kernel Methods in ML, Hofmann, Schölkopf, and Smola, Annals of Statistics, 2008
Spline Models for Observational Data, Wahba, SIAM 1990 - read the foreword in particular
|
|
Thu, Oct 19 |
Approaches to kernel selection
|
10 |
Tue, Oct 24 |
Computing with kernels
|
|
Thu, Oct 26 |
Scalable kernel methods
Kernel Interpolation for Scalable Structured GPs, Wilson and Nickisch, ICML 2015
Scalable Log Determinants for GP Kernel Learning, Eriksson et al, NeurIPS 2017
Scaling GP Regression with Derivatives, Dong et al, NeurIPS 2018
Exact GPs on a Million Data Points, Wang et al, NeurIPS 2019
Fast estimation of tr(f(A)) via stochastic Lanczos quadrature, Ubaru, Chen, and Saad, SIMAX 2017
|
11 |
Tue, Oct 31 |
Matrices associated with graphs
|
|
Thu, Nov 02 |
Function approximation on graphs
Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions, Zhu, Ghahramani, and Lafferty, ICML 2003
Learning with Local and Global Consistency, Zhou, NeurIPS 2004
Empirical stationary correlations for semi-supervised learning on graphs, Xu, Dyer, and Owen, Ann Appl Stat, 2010
Using Local Spectral Methods to Robustify Graph-Based Learning Algorithms, Gleich and Mahoney, KDD 2015
|
12 |
Tue, Nov 07 |
Graph clustering and partitioning
A tutorial on spectral clustering, von Luxburg, Statistics and Computing 2007
Communities in networks, Porter, Onnela, and Mucha, Notices of the AMS, 2009
Community detection in networks: A user guide, Fortunato and Hric, Physics Reports, 2016
Trace optimization and eigenproblems in dimension reduction methods, Kokiopoulou, Chen, and Saad, NLAA, 2011
|
|
Thu, Nov 09 |
Centrality measures
|
13 |
Tue, Nov 14 |
Learning linear system dynamics
|
|
Thu, Nov 16 |
Learned dynamics and extrapolation
|
14 |
Tue, Nov 21 |
Koopman theory and lifting
|
|
Thu, Nov 23 |
Thanksgiving break |
15 |
Tue, Nov 28 |
Learning nonlinear dynamics
Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Brunton, Proctor, Kutz, PNAS 2016
A Data-Driven Approximation of the Koopman Operator: Extending Dynamic Mode Decomposition, Williams, Kevrekidis, Rowley, J Nonlinear Science 2015
A Kernel-Based Method for Data-Driven Koopman Spectral Analysis, Williams, Rowley, Kevrekidis, J Comp Dynamics 2015
|
|
Thu, Nov 30 |
Special topics
|