|
CS4414: Schedule of Lectures
All lectures will be delivered in person, and attendence is important -- the
slide decks don't really cover all the details we discuss in lecture.
The video links are mostly from the Spring 2023 offering of the course, but a
few of those videos had issues and in those cases, are to the 2021 offering.
Lectures do change over time, and the video quality is not the very best.
This is why it is a problem to skip lecture and learn by playing the videos:
if you miss a single lecture and need to catch up, that will be fine.
But if you try to learn the whole course from old videos, you may find that
you fall behind relative to people who attend in person.
An important note about the books: BO (Bryant and
O'Halloren) is used for
multiple courses (they explain this in the preface), and hence explores topics
not included in CS4414. Moreover, they use C for examples. Despite
this, we picked the book because we like their coverage of the topics we talk
about, and C is a subset of C++, so their examples work in C++ too. If a
section doesn't look like something we talked about in class, just skip that
section. If uncertain, just ask us (for example, on Ed Discussions). BS
(Bjarne Stroustroup) is a fantastic but very focused text specifically on C++.
We don't have a separate Linux book, because all of that material is online.
You will also find the C++ reference website (cppreference.com) useful.
Our broad topic is focused on
programming "the system", as distinct from writing a program to solve
some single problem.
To tackle this topic, we'll need to understand the system that
we are controlling: the system we are programming. So in
particular, although CS 4414 looks closely at Linux, the bash shell and
commands you can launch from it, and at C++ 20, this is
not actually a
class focused on teaching you Linux, bash or C++ -- we will show you
aspects of all three, but we expect you to learn these yourself (with
our help and with various pointers we provide).
Lectures 1-7 look at
big issues we need to be thinking about: various types of resources
(memory, CPU cycles, files...), costs and opportunities they expose, and
the many forms of concurrency we see in a modern system, where the
operating system might be hard at work, the file system doing various
things as well, and where your own code could also be running in
multiple processes or using multiple threads. Running through all
of this are some unifying concepts that we will revisit throughout the
entire semester. |
1. Aug 27 |
Introduction |
pptx
pdf
video (2023 version) |
We will explore the different kinds of costs seen with standard
languages and systems. BS Chapters 1-3.
I'll
show some code examples (including one I wrote that isn't using a very
pure C++); you can see the actual code here.
We recommend coding the way Sagar (Sagar Jha was the first CS4414 TA
from back in 2020) does, which is here.
A wonderful TA wrote versions in Python and Java for us in 2020 (Lucy
Lu). Lucy's code is here and
here. Notice how few lines
Sagar's C++ code needed, or Lucy's code in Python. Ken's code was
faster, but used C instead of pure C++ (this was back before Ken became
as much of a C++ wizard as Sagar), dives much closer to the hardware and
O/S architecture to take advantage of very low-level features, and
needed way more code (plus, he used a style of code that can easily lead
to bugs, although in fact his code wasn't buggy). What can we
conclude about these language choices?
As a side-remark, after Sagar saw that Ken's "ugly" version was fastest,
he modified his more elegant version and gained most of the speed.
In fact the version you are seeing is Sagar's version 2.0 -- his version
1.0 was really very slow, about 10x slower than Ken's. Does
this tell us anything about coding for speed?
There is a famous adage: "Build one to throw away". What do we
learn from version 1, and why do we throw it away instead of just
improving it "in place"?
And just for completeness, there is a third adage to learn too.
Butler Lampson, who won a Turing Award for his insights into how to
build efficient, clean operating systems software, is famous for
complaining about how version 3 (of anything) is always worse than
version 2: Full of useless features, often slower, and often full of
what could be thought of as the code equivalent of a rhetorical flourish
("fancy language that serves no purpose"). What would be some
examples of things a word counter program could do that would be in
Butler's category -- features where Butler would react by saying "Yes,
you could do
that. But why in the world would you want to?" |
2. Aug 29 |
Architecture |
pptx
pdf
video (2023 version) |
Modern computers use a NUMA architecture. What does this mean
and why does it matter? Skim BO Chapters 1-4, but see note above! |
3. Sept 3 |
Concurency |
pptx
pdf
video |
There are many ways to harnass the power of a NUMA machine. In
Linux, people used to refer to this as "multitasking", and we'll look at
that idea (for example, running a few Linux commands at the same time).
But we'll also start to touch on ways that we can use parallelism within
our own programs. This will become a big theme for the class that
runs through the entire semester. |
4. Sept 5 |
Linux |
pptx
pdf
video |
We'll dive a bit deeper to gain an appreciation of what Linux is
doing, and how it can actually be "programmed" through scripts that
treat Linux commands as programmable components. BO Chapter 7 has some
useful context, but in fact much of this lecture is focused on Linux
commands, and the bash and Linux help documents and user manual are the
best sources of details here. You should definitely read about
these, and try them! |
5. Sept 10 |
Memory |
pptx
pdf
video |
Data lives in memory, but in a Linux system,
even the basic idea of a memory segment turns out to be a very
sophisticated concept. We'll have a look. BO Chapter 6,
Chapter 9. Note: The file for the 2023sp video somehow
became corrupted, so this will be the fall 2021 version of that same
lecture. |
6. Sept 12 |
Systems abstractions |
pptx
pdf
video |
Dijkstra was the first to suggest that an operating system
"abstracts" the hardware, and that this concept could be carried forward
to create a hierarchy. We will discuss concepts of modularity at
the level of larger components of a system. BO Chapter 9. |
7. Sept 17 |
Exceptions |
pptx
pdf
video |
Everyone is familiar with "normal" control flow. What happens
when something abnormal occurs? Interrupts, signals and C++
are all examples of what we call exceptions. They
illustrate Dijkstra's idea of taking seemingly disparate concepts and
unifying them through an abstraction that spans their different
functionalities. BO Chapter 8. |
A core concept in systems
programming is the idea of taking control of all elements of the
system.
In a modern computing platform like Linux, all sorts of elements are
programmable or controllable, but you as the programmer often need to
understand the computing features you are trying to control and you
often express that control in somewhat "indirect" ways, for example by
coding in a particular style (lecture 8, 9), by "reprogramming" the
compiler to generate code for your classes that reflects your own logic
and intent (lecture 10), or even by introducing extra layers that
redefine seemingly straightforward features (lecture 12).
In the following lectures we work our way up to this realization, but by
lecture 11, on Performance analysis, we see the full picture: if you as
a developer can properly visualize your full system, you can use Linux
tools to confirm or refute your intuition about where time is being
spent, and ultimately use this subtle control features of Linux, the
hardware, the compiler, the linker, etc. In fact even the hardware
is often programmable! |
8. Sept 19 |
Hardware Parallelism |
pptx
pdf
video |
In modern systems, the key to speed is parallelism. We
obtain concurrency with threads, but the hardware is also capable of
true instruction-level parallelism: so-called SIMD computing. How
can we help the compiler find opportunities for SIMD parallelism in a standard program?
Again, the key turns out to center on viewing the idea of hardware
parallelism as an abstraction -- a conceptual building block. BO Chapter 5. |
9. Sept 24 |
Constant Expressions |
pptx
pdf
video |
This lecture starts a new unit that dives deeper on sophisticated
C++ concepts. We'll begin with a deeper dive into the C++ concept
of constants, which have come up in our previous lectures and especially
in lecture 9. Constants and constant expressions are evaluated at
compile time, and this ide has been taken exceptionally far by modern
C++ compilers, to the point that the language has a whole compile-time
sublanguage!
BS Section 1.6.
We will also discuss a peculiar behavior seen with modern compilers when
they encounter code that includes some form of undefined behavior, like
an explicit floating point divide by zero. C++ and many
other languages might delete the undefined code block as part
of its constant-expression compilation step! This can be a real
surprise, and is important to understand. Learn more
here. |
Sept 26 |
Prelim 1, in person.
7:30pm, 75m exam but we allow up to 2 1/2 hours |
OLH155, OLH165, OLH255 |
The exam will be closed book, no use of
Internet resources. But we will allow you to bring one page of
personally created notes. A page means 8.5" x 11", two-sided, and
any font is fine with us (handwritten is ok, too). |
10. Sept 26 |
Templates |
pptx
pdf
video |
C++ abstract types are a compile-time concept, implemented by
templates. Understanding them is a key step in gaining real
proficiency in C++. BS chapter 4, 6-8. |
11. Oct 1 |
Performance: Big Picture |
pptx
pdf
video |
We've learned a lot about C++, the file system, NUMA hardware and
memories, and hardware parallelism. Performance can feel like a
juggling act! In this lecture we'll discuss the big picture for
achieving high speed in real programs. |
12. Oct 3 |
Performance |
pptx
pdf
video
|
Our emphasis has been on achieving the utmost
in speed. Can we measure performance? How would a developer
zero in on the performance-dominating portion of a program? BO
Chapter 5. BS Chapter 5.
I should probably comment that even though the slides for lecture 11
seem to focus on gprof, the topic really is a higher-level question.
Tools like gprof (and Linux has many of them) are awesome, but they
can't even come close to what you need to learn to do at a
"visualization" level: trying to visualize how something is executing,
and using the insights from doing that to either confirm (or refute)
your theories by running one of the tools and seeing if the output
matches your expectations. So the theme of stepping back and
controlling the entire system (from lecture 10) becomes a theme of
visualizing the whole system and using performance tools to focus in on
parts that are surprisingly slow -- that deviate from what you expected. |
13. Oct 8 |
Linking |
pptx
pdf
video |
When an application is compiled, the compiler transforms the code
into an executable. How are libraries handled, and what happens at
runtime when a program that uses DLLs is actually launched? This
topic straddles the boundaries in an interesting way: It has aspects
that relate to the application in C or C++, aspects that relate to
Linux, and aspects that introduce a new kind of abstraction, which even
creates new "super powers"! BO Chapter 7. |
Our next
large module looks at threads and thread-level synchronization.
More broadly, our focus is on multiprocessing: situations in which more
than one element of a system cooperate to carry out some task, using a
mixture of methods: multiple threads, multiple processes, perhaps
multiple computers, perhaps even attached hardware accelerators that are
themselves programmable. |
14. Oct 10 |
Threads |
pptx
pdf video |
We've seen threads in a shallow way, but this
lecture starts a much deeper unit on thread-level concurrency.
Creating threads, distinction between lightweight threads and threads
with a dedicated CPU. Lambda notation in C++. Taskset command. BO Chapter 12, BS
Chapter 15 |
Fall break Oct 12-16 |
15. Oct 17 |
Synchronization |
pptx
pdf
video |
Race conditions, critical sections, mutual exclusion, locking.
BO Chapter 12, BS Chapter 15 |
16. Oct 22 |
Monitors |
pptx
pdf video |
The monitor pattern is a powerful tool for structured control of
thread-level concurrency. BO Chapter 12, BS Chapter 15 |
20. Oct 24 |
Deadlocks |
pptx
pdf
video |
Deadlocks, livelocks, how to prevent them. How systems that
are at risk of deadlock deal with this (abort/rollback/retry).
Priority inversion. BO Chapter 12, BS Chapter 15 |
18. Oct 29 |
Coordination |
pptx
pdf
video |
Software design patterns and how this idea can be carried over to
coordination patterns in modular applications: Barriers, leader-workers,
ordered multicast, all-reduce. BO Chapter 12, BS Chapter 15 |
19. Oct 31 |
Multi-Process Systems |
pptx
pdf
video |
Many modern systems are big, created by teams and split into
multiple processes that sometimes even run on different computers.
How do the forms of sharing and synchronization we have learned about
apply in such settings? What approaches have emerged as the
winners for these big systems? |
20. Nov 5 |
Sockets and TCP |
pptx
pdf
video |
In lecture 18, we heard about networking. This lecture will
look at how Linux exposes networking via the sockets API.
TCP and Google GRPC.
How a web browser uses these ideas. These topics aren't covered in
the textbook, but you will find a lot of information
here and
here. |
21. Nov 7 |
Two modern networked file systems |
pptx
pdf
video |
In lecture 18, we heard about the idea of accessing a file system
remotely over a network. Today, we'll discuss two modern networked file
systems. Ceph is an "object oriented" file system. Zookeeper is a file
system used as a coordination tool.
At the end of the lecture we heard very briefly about the complexity of
the distributed security infrastructure used to negotiate the needed
encryption keys for all of this.
This story from The Register (a UK newspaper) illustrates the snarl
of issues and how they sometimes lead to really peculiar bugs or
performance problems. |
22. Nov 12 |
Prefetching and Caching |
pptx
pdf
video |
Why aren't remote file systems slow? This may feel like a dumb
question, but when you consider that companies like Facebook are serving
content globally with millisecond response times (and what is Facebook
beyond a social network graph stored in a fancy file system?) you'll
realize this is WAY more tricky than it seemed at first glance.
Hint: It isn't just about the network. |
23. Nov 14 |
Key-Value Stores |
pptx
pdf
video |
We talked about networked access to a file
system in November. In today's lecture we'll talk about the MemCached concept and
how modern systems manage really big data sets. |
24. Nov 19 |
Transactions |
pptx
pdf
video |
Classic distributed computing problems. We
will talk about the transactional model and the 2-phase commit
problem that arises. Then we will combine this with a different concept
called 2-phase read-write locking, and will see that we can implement
the transactional model using these building blocks. This is a powerful
and popular way to build strongly consistent distributed systems (not
the only way, and if you take CS5412 in the spring, you can learn about
others). |
The final module of the course
focuses on security. Cornell has entire courses on this topic, so
we limit ourselves to aspects of security directly tied to the systems
programming concepts we've seen in lectures 1-23. |
25. Nov 21 |
FaRM: RDMA-accelerated
transactions on a key-value store. |
pptx pdf
video |
This lecture puts the ideas from lectures 22 and
23 together and also offers a glimpse of the cutting edge: we will see
how Microsoft used RDMA network accelerators to achieve blazing performance of a transactional key-value service
called FaRM. FaRM is used as a component of Microsoft Bing, their
main cloud search solution. Read more about FaRM
here. |
26. Nov 26 |
Security risks in C++ and Linux |
pptx
pdf
video (2021 version) |
We will start by discussing classic ways of attacking applications on
Linux systems, often via networking APIs that don't adequately check
length limits on incoming objects. BO Chapter 3, especially 3.10
Note: This lecture will probably be via Zoom |
Thanksgiving break: Nov 27-Dec
2 |
27. Dec 3 |
Linux Protection Mechanisms |
pptx
pdf
video (2021 version) |
Having seen a form of hacking attack, we'll continue our pivot and
look at other security and protection mechanisms. Access to a
resource must be authorized, and the party performing the access must be
authenticated. How Linux handles this. |
Prelim 2: December 5.
7:30pm, 75m exam but we allow up to 2 1/2 hours. Rooms TBD
|
28. Dec 5 |
Access control abstractions |
pptx
pdf
video (2021 version) |
Access control abstractions and the Linux options we use to actually
limit access and control the flow of information within a machine, or
between machines. |
|