|
Shared
Lectures (Tuesday/Thursday) |
|
CS4414 Friday |
CS5416 |
Homework: first few weeks are shared homeworks |
||
|
8/26 |
1. Introduction, rules
for using AI code generation in this course |
8/29 |
C++ classes, names and scopes |
|
||
|
8/28 |
2. Multicore concurrency
and how it can conflict with the NUMA memory model (false sharing, lock
contention, unfair mutex) |
|||||
|
9/3 |
3. How did we settle on C++ for this course? Should we all switch to Rust? |
9/6 |
Copy versus
reference, const |
|||
|
9/5 |
4. Building abstractions
that simplify advanced systems designs and development.
The file system is an abstraction.
POSIX file I/O API |
|||||
|
9/9 |
5.
Linux
segments, DLLs, page remap, page protections |
9/12 |
C++ templates |
Extending our binary tree into an approximate nearest neighbor (ANN) tree, a structure used heavily in RAG LLM and LRM systems. Leaderboard:
most performant solution to the ANN task. |
||
|
9/11 |
6. Compile time evaluation: constants, constexpr, templates. |
|||||
|
9/16
|
7.
|
9/19 |
Multithreaded
programming workshop |
|||
|
9/18
|
8.
|
|||||
|
9/23
|
9. Abstractions for safe concurrency and
thread-to-thread coordination: circular buffers, readers+writers |
9/26 |
Designing FarmVille |
FarmVille.
This is a simple old-style graphical application in which
application threads animate small objects in a scenario about
making cupcakes at a bakery that sells to local students and sources
ingredients from a local farm and a regional wholesaler.
The main focus is on designing and implementing the
concurrency control needed to avoid collisions.
Leaderboard: solutions whose overheads scale best
as the number of concurrent threads grows.
|
||
|
9/25 |
10. Deadlocks, livelocks: four conditions, ordered locking |
|||||
|
Prelim 1:
Evening exam, 9/25. OLH155, OLH255.
Designed as a 75-minute exam, but we will have the rooms for at least
twice that long. |
||||||
|
9/30 |
11. Accessing collections, CCL primitives |
10/3 |
Debugging tools: gdb,
Valgrind, gprof |
|||
|
10/2 |
12. Theoretical models for distributed computing, the concept of distributed consistency. |
|||||
|
10/7 |
13. How inconsistency in a fault-tolerant multicast ended a project to redesign the US air traffic control system. |
10/10 |
Compiler+instruction
set+architecture ecosystem |
|||
|
10/9 |
14. Can MLs benefit from fault-tolerance consistency models without being recoded from scratch? |
|||||
|
Fall break: Oct
11-14 |
||||||
|
10/16 |
15. Understanding (and debugging) performance. |
10/17 |
4414:
single process optimization track |
5416:
multiprocess+GPU |
4414:
single process homework on csug lab |
5416:
distributed homework on MEng lab + fractus |
|
10/21 |
16. Client connectivity to the cloud. VPNs and VPCs. |
10/24 |
TBD |
TBD |
|
|
|
10/23 |
17. Facebook CDN and caching |
|||||
|
10/28 |
18. Facebook's social network graph, TAO |
10/31 |
|
|||
|
10/30 |
19. Cloud Microservice Frameworks |
|||||
|
11/4 |
20. Availability zones and data redundancy support |
11/7 |
|
|
||
|
11/6 |
21. Apache technologies - I |
CS4414: You will design a RAG system from scratch, encoding documents, indexing them with FAISS, and retrieving top-K context for queries. Then you’ll use llama.cpp to generate a text reply to each query. The project will conclude with a benchmarking study to identify bottlenecks and assess design choices. Leaderboard: single machine performance. |
Understanding and optimizing performance in PreFMLR: An ML for retrieving documents relevant to text queries at high speeds. This project has been discussed in recitations a few times. Leaderboard: distributed performance. |
|||
|
11/11 |
22. Apache technologies - II |
11/14 |
|
|
||
|
11/13 |
23. Spark RDD concept, compiling to MapReduce for Big Data analytics |
|||||
|
11/18 |
24. Vector databases: Approximate document retrieval for RAG MLs |
11/26 |
|
|
||
|
11/20 |
25. GPU accelerators |
|||||
|
11/25 |
26. Performance lessons for ML systems |
|
|
|
||
|
11/26-11/30
Thanksgiving break |
|
|||||
|
12/2 |
27. More details on RDMA |
Lectures 27 and 28 are not covered on Prelim 2. There will be no Friday recitation this week, but we will hold coding help sessions for HW3. | ||||
| Prelim 2: Evening exam, 12/2. OLH155, OLH255. SDS accommodated students will take this exam at the same time, but at the ATP center operated by SDS. | ||||||
|
12/5 |
28. The Rocky path to RoCE deployment at Microsoft. |
|||||
|
We have no final
exam. Final projects must be
submitted no later than midnight 12/10. If you submit by
12/5, we will post a letter grade by 12/13. |
||||||