CS 6220: Data-sparse matrix computations


Instructor: Anil Damle

Contact: damle@cornell.edu

Office hours: Tuesdays from 3 - 4 PM and Wednesdays from 11 AM - 12 PM

Lectures: Tuesdays and Thursdays from 1:25 pm till 2:40 pm in Hollister Hall 206

Course overview: Matrices and linear systems can be data-sparse in a wide variety of ways, and we can often leverage such underlying structure to perform matrix computations efficiently. This course will discuss several varieties of structured problems and associated algorithms. Example topics include randomized algorithms for numerical linear algebra, Krylov subspace methods, sparse recovery, and assorted matrix factorizations.

News and important dates


  • January 20 — This website is under construction and information on it shoudl be considered tentative and is subject to change.
  • January 21 — First day of class
  • May 14 at 4:30 PM — Project report due

Homework/Project


Your grade in this course is comprised of three components: scribing lectures (think of these as individualized homeworks) and participating in peer review, writing a paper review, and a course project. The final grade will be heavily weighted towards the project. While these assignments are constitute the formal workload for the course, I encourage you to look further into the given references for any topics that you find particularly interesting. There is far more to many of these topics than we will have time to cover in this course.

  • Lecture scribing and peer review:

    You will be responsible for scribing two lectures given in this course and reviewing two sets of lecture notes. Scribing a lecture involves not just taking good notes and typesetting them, but also elaborating on points you find interesting, filling in details, example implementations and simple numerical experiments (if appropriate), and adding references as needed. Therefore, I recommend picking a topic you find interesting as this can provide a good opportunity to learn more. It would be reasonable to think of these as two homeworks for the class. An example set of scribed notes (not from this class but on a relevant topic) may be found here.

    Peer review of lectures notes consists of carefully reading through another students scribed notes and providing feedback (typos, suggestions for improvement, etc.) to them so they can iterate on the notes before they are final. The purpose of this assignment is both to ensure the quality of notes (they will be made available on the course website for others in the class) and provide experience providing written feedback on technical material.

    The first draft of the notes should be completed within one week after the class you are scribing occurs and the peer review process including revisions should occur within one week after that. As with most due dates in this class (save for the final report) I am flexible — let me know if you need a short extension. Once this process is completed I should receive the original draft of the notes, the review, and the revised version so I may peruse them myself and add the final version to the course website. I am happy to aid in this process and please let me know if you have any questions along the way. I will not be grading the notes per se and this portion of the class grade is essentially based on completion (if I feel improvements are needed to a set of notes I will ask for them).

    Details on the logistics of this process will be available here sometime after the first day of class.

  • Paper review:

    As part of this course you will have to complete and submit one paper review. In particular, at any point in this class you should select a paper relevant to topics we are discussing (I am happy to provide suggestions and there will be many on the course website) and write a brief review of it. A review consists of a summary discussion of the key points of the paper (theorems, experiments, etc.) and why you think they are interesting. This must be turned in by the last day of class, though I would encourage you to submit it in a timely manner with respect to our discussion of the topic the paper you choose covers. It is perfectly fine to align this assignment with either the lecture scribing or your project. For example, you could review a paper particularly relevant to the lecture(s) you scribe and/or choose one closely related to your project.

  • Course project:

    As part of this course you will be required to complete a course project on a topic of your choosing (some examples will be added here in the near future and given in class). This may be done individually or in groups of up to three students (flexible), though the scope should be scaled accordingly. While the project should contain a mix of implementation, application examples, and theoretical discussion. However, depending on the specific topic the relative composition of the project may vary as appropriate. Further details are available here.

    As part of the project you will write a report and present your work in the final week of the course. The specifics of how the presentations work logistically is subject to change pending the final enrollment count for the course. A two page proposal is due on March 27 at 5 PM (I am happy to discuss ideas at any time) and the final report is due on May 14 at 4:30 PM

References


There is no one textbook that covers all of the material for this course — in fact, there is no required text for this course. In lieu of that, a collection of references (textbooks, papers, and notes) will be progressively added that cover various aspects of the course.

Books

  • Matrix Computations by Golub and Van Loan
    This text is the standard reference text in numerical linear algebra and therefore valuable as a backstop for much of what we will talk about in this courses
  • Numerical Linear Algebra by Trefethen and Bau
    A good textbook that broadly covers numerical linear algebra at the graduate level. It is well written, certainly more colloquial the Golub and Van Loan, and many find it provides a smoother introduction to the topics we will discuss. However, by construction, it is not as comprehensive as Golub and Van Loan. This book is available online through the library. [Cornell library]
  • Applied Numerical Linear Algebra by Demmel
    A good textbook that covers numerical linear algebra at the graduate level. It may prove useful as a secondary source of information for some of the material, or as a vehicle for further study. This book is also available online through through Cornell's SIAM subscription. [SIAM online]

Papers

  • A short course on fast multipole methods by Rick Beatson and Leslie Greengard
    These lecture notes provide a good introduction to the FMM, cover the algorithm in three dimensions, and includes more detail than presented in this courses lectures. [Online]
  • Fast Algorithms for Boundary Integral Equations by Lexing Ying
    This book chapter provides both an introduction to boundary integral equations and a discussion of various fast algorithms constructed to solve them. Notably, this includes the FMM, some of its variants, rank-structured techniques, and more. [Online]
  • Rank-structured matrix papers
    There are several good references for rank-structured matrix algorithms of the variety we discussed in class. A selection includes: the work by Martinsson and Rokhlin in 2005 [PDF], the recursive skeletonization factorization by Ho and Greengard in 2012 [PDF], a more thorough treatment of the low-rank compression scheme by Cheng et. al. in 2005 [PDF], and a recent paper on a strongly admissible recursive skeletonization factorization by Minden et. al. in 2017 [PDF]. The last reference is essentially what we covered in class, contains numerous references to previous work, and section 3.2 describes the use of a so-called proxy surface in some detail.
  • Rank-structured matrix code
    The FLAM package mentioned in class is available online as is the code for the recursive skeletonization based on strong admissibility.
  • Randomized Numerical Linear Algebra: Foundations & Algorithms by Martinsson and Tropp
    This monograph provides a nice, recent introduction to randomized methods in numerical linear algebra and related fields. [arXiv]
  • Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions by Halko, Martinsson, and Tropp
    This SIAM Review article provides a nice introduction to randomized methods in numerical linear algebra. [SIAM Online and arXiv]
  • CUR matrix decompositions for improved data analysis by Mahoney and Drineas
    This PNAS paper provides a brief introduction to the CUR factorization, some applications, and references to the more detailed works in the area. [PNAS Online]
  • LSRN by Meng, Saunders, and Mahoney
    This paper presents the LSRN method, an iterative solver utilizing randomness that solves least squares problems. [Paper and code]
  • Blendenpik by Avron, Maymounkov, and Toledo
    This paper presents the Blendenpik algorithm for overdetermined least squares problems. [Paper]
  • Low-rank approximation and regression in input sparsity time by Clarkson and Woodruff
    This paper presents and analyzes the sparse random matrices we discussed in class. [Journal and arXiv]

Schedule


A tentative schedule follows, and includes the topics we will be covering and relevant reference material (More details on the listed papers are given in the References section). Scribed notes provided here are written by students in the class and, while peer reviewed by

Date Topic References Notes/assignments
1/21 Introduction
1/23 Intro to randomized numerical linear algebra SIAM review paper by Martinsson, Halko, and Tropp and Monograph by Martinsson and Tropp
1/28 Randomized NLA, range finding algorithms SIAM review paper by Martinsson, Halko, and Tropp and Monograph by Martinsson and Tropp
1/30 Randomized NLA, proofs SIAM review paper by Martinsson, Halko, and Tropp and Monograph by Martinsson and Tropp
2/4 Randomized NLA, proofs SIAM review paper by Martinsson, Halko, and Tropp and Monograph by Martinsson and Tropp
2/6 Randomization for least squares problems LSRN paper by Meng, Saunders, and Mahoney
2/11 Randomization for least squares problems Blendenpik paper by Avron, Maymounkov, and Toledo
2/13 Low-dimensional embeddings Monograph by Martinsson and Tropp
2/18 The FFT
2/20 The NUFFT
2/27 The FMM
3/2 The FMM
3/3 Rank-structured matrices
3/5 Rank-structured matrices
3/10 Rank-structured matrices
3/12 Rank-structured matrices
3/13 - 4/6 Classes suspended
4/7 Class resumes, virtually
Topic Actual sparse matrices
Topic Sparse recovery problems: compressed sensing, matrix completion, etc.
Topic Special topics!
3/27 Project proposal due at 5 PM Project proposal due at 5 PM
5/14 Project due at 4:30 PM Project due at 4:30 PM

Notes


Scribed notes will be added here throughout the course; these notes undergo student based peer review, but have not all been completely checked for correctness. Please email the instructor if you believe to have found any errors so they may be corrected.

Lecture 1 — introduction and course overview, no notes

Lecture 2 — Rand NLA [Notes]

Lecture 3 — Rand NLA [Notes]

Lecture 4 — Rand NLA [Notes 1, Notes 2]

Lecture 5 — Rand NLA for Least Squares [Notes]

Lecture 6 — Rand NLA for Least Squares [Notes ]

Lecture 7 — Subspace embeddings [Notes ]

Course policies


Participation

You are encouraged to actively participate in class. This can take the form of asking questions in class, responding to questions to the class, and actively asking/answering questions on the online discussion board.

Collaboration policy

You may discuss the homework and projects freely with other students, though any work you turn in must be your own and provide appropriate citations to existing work, implementations, etc. if they are used.

Academic integrity

The Cornell Code of Academic Integrity applies to this course.

Accommodations

In compliance with the Cornell University policy and equal access laws, I am available to discuss appropriate academic accommodations that may be required for student with disabilities. Requests for academic accommodations are to be made during the first three weeks of the semester, except for unusual circumstances, so arrangements can be made. Students are encouraged to register with Student Disability Services to verify their eligibility for appropriate accommodations.