CS 6220: Data-sparse matrix computations


Instructor: Anil Damle

Contact: damle@cornell.edu

Office hours: Tuesday and Thursday (starting August 24) from 10:30 AM - 11:30 AM in 423 Gates Hall, or by appointment. No office hours will be held on November 7 or 9 due to instructor travel

Lectures: Tuesday and Thursday from 1:25 pm till 2:40 pm in Hollister Hall 320

Course overview: Matrices and linear systems can be data-sparse in a wide variety of ways, and we can often leverage such underlying structure to perform matrix computations efficiently. This course will discuss several varieties of structured problems and associated algorithms. Example topics include randomized algorithms for numerical linear algebra, Krylov subspace methods, sparse recovery, and assorted matrix factorizations.

News and important dates


  • November 7 — No lecture
  • November 9 — Special lecture by Chris De Sa
  • December 11 at 4:30 PM — Project report due

Homework/Project


Your grade in this course is comprised of two components: scribing lectures (think of these as individualized homeworks) and a course project. The final grade will be heavily weighted towards the project. While these assignments constitute the workload for the course, I encourage you to look further into the given references for any topics that you find particularly interesting. There is far more to many of these topics than we will have time to cover in this course.

  • Lecture scribing:

    You will be responsible for scribing two lectures given in this course. Scribing a lecture involves not just taking good notes and typesetting them, but also elaborating on points you find interesting, filling in details, example implementations and simple numerical experiments (if appropriate), and adding references as needed. Therefore, I recommend picking a topic you find interesting as this can provide a good opportunity to learn more. It would be reasonable to think of these as two homeworks for the class. An example set of scribed notes (not from this class but on a relevant topic) may be found here.

    The notes will be due one week after the class you are scribing occurs (let me know if you need a short extension) and will be posted online. In lieu of immediate grading of the notes, I will provide feedback and, potentially, ask for a revised version in a timely manner.

    The current lecture scribing assignments are available here.

  • Course project:

    As part of this course you will be required to complete a course project on a topic of your choosing (some examples will be added here in the near future and given in class). This may be done individually or in groups of up to three students (flexible), though the scope should be scaled accordingly. While the project should contain a mix of implementation, application examples, and theoretical discussion. However, depending on the specific topic the relative composition of the project may vary as appropriate. Further details are available here.

    As part of the project you will write a report and present your work in the final week of the course. The specifics of how the presentations work logistically is subject to change pending the final enrollment count for the course. A two page proposal is due on October 26 (I am happy to discuss ideas at any time) and the final report is due on December 11 at 4:30 PM

References


There is no one textbook that covers all of the material for this course — in fact, there is no required text for this course. In lieu of that, a collection of references (textbooks, papers, and notes) will be progressively added that cover various aspects of the course.

  • Matrix Computations by Golub and Van Loan
    This text covers much of the material discussed in the Krylov subspace methods section of the course and is a good reference text for numerical linear algebra in general.
  • Numerical Linear Algebra by Trefethen and Bau
    Another good textbook that broadly covers numerical linear algebra. It is written more colloquially than Golub and Van Loan, and some find it provides a smoother introduction into some of the topics. Conversely, it is not as extensive of a reference text.
  • A short course on fast multipole methods by Rick Beatson and Leslie Greengard
    These lecture notes provide a good introduction to the FMM, cover the algorithm in three dimensions, and includes more detail than presented in this courses lectures. [Online]
  • Fast Algorithms for Boundary Integral Equations by Lexing Ying
    This book chapter provides both an introduction to boundary integral equations and a discussion of various fast algorithms constructed to solve them. Notably, this includes the FMM, some of its variants, rank-structured techniques, and more. [Online]
  • Rank-structured matrix papers
    There are several good references for rank-structured matrix algorithms of the variety we discussed in class. A selection includes: the work by Martinsson and Rokhlin in 2005 [PDF], the multiplicative recursive skeletonization factorization by Ho and Greengard in 2012 [PDF], a more thorough treatment of the low-rank compression scheme by Cheng et. al. in 2005 [PDF], and a recent paper on a strongly admissible recursive skeletonization factorization by Minden et. al. in 2017 [PDF]. The last reference is essentially what we covered in class, contains numerous references to previous work, and section 3.2 describes the use of a so-called proxy surface in some detail.
  • Rank-structured matrix code
    The FLAM package mentioned in class is available online as is the code for the recursive skeletonization based on strong admissibility.
  • Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions by Halko, Martinsson, and Tropp
    This SIAM Review article provides a nice introduction to randomized methods in numerical linear algebra. [SIAM Online and arXiv]
  • CUR matrix decompositions for improved data analysis by Mahoney and Drineas
    This PNAS paper provides a brief introduction to the CUR factorization, some applications, and references to the more detailed works in the area. [PNAS Online]
  • LSRN by Meng, Saunders, and Mahoney
    This paper presents the LSRN method, an iterative solver utilizing randomness that solves least squares problems. [Paper and code]
  • Low-rank approximation and regression in input sparsity time by Clarkson and Woodruff
    This paper presents and analyzes the sparse random matrices we discussed in class. [Journal and arXiv]
  • Atomic decomposition by basis pursuit by Chen, Donoho and Saunders
    This paper presents the basis pursuit denoising algorithms we discussed in class and gives comparisons with other methods. [Journal]
  • Sparse and redundant representations by Elad
    This book is a good resource for the sparse recovery discussions we have had in class and contains many of the theoretical statements with proofs. It should be available online in PDF form via the Cornell library [Springer online]
  • Optimization resources
    There are many algorithms and techniques for solving the problems we discussed, including PDCO, ASP, CVX, TFOCS, and many more.
  • October 31 class demo
    The code from the course demo may be found here. It depends on the ASP optimization package to solve the BPDN problems.
  • Low rank plus sparse and robust PCA
    Two papers relevant to our discussion in class are Candès, Li, Ma, and Wright 2011 [Journal link] and Chandrasekaran, Sanghavi, Parrilo, and Willsky 2009 [Journal link].
  • Fast Laplacian solvers
    The paper containing the algorithms discussed in class is Kyng and Sachdeva [arXiv].

Schedule


A tentative, high-level schedule for the course is available here — it may be subject to minor revisions.

Notes


Scribed notes will be added here throughout the course.

Lecture 1 — introduction and course overview

Lecture 2

Lecture 3

Lecture 4

Lecture 5

Lecture 6

Lecture 7

Lecture 8

Lecture 8, set 2

Lecture 9

Lecture 10

Lecture 11

Lecture 12

Lecture 13

Lecture 14

Lecture 15 — no class due to fall break

Lecture 16

Lecture 17

Lecture 18

Lecture 19

Lecture 20

Lecture 21, set one

Lecture 21, set two

Lecture 22

Lecture 23 — no class due to instructor travel

Lecture 24 — no scribed notes from the guest lecture

Lecture 25

Lecture 25

Lecture 27

Miscellany


The Cornell Code of Academic Integrity applies to this course.

In compliance with the Cornell University policy and equal access laws, I am available to discuss appropriate academic accommodations that may be required for student with disabilities. Requests for academic accommodations are to be made during the first three weeks of the semester, except for unusual circumstances, so arrangements can be made. Students are encouraged to register with Student Disability Services to verify their eligibility for appropriate accommodations.