CS5412: Topics in Cloud Computing
Gates Hall room G01, 1:00pm-2:15pm on Monday and Wednesday. Recitation on Wednesday at 7:30pm-8:45pm, same room
We recommend that you attend all
lectures in person, but will also post recorded versions,
with closed captions, on the syllabus page. Students
who don't attend in person generally do poorly in this
course because the exams focus on material covered in class,
and it can be hard to learn from a video.
Prof. Ken Birman, 435 Gates Hall, x5-9199.
|Ken in Person
|After class M,W
|Gates Hall, Room 435
TA Office hours:
|After recitation, hence Wednesday 8:45 - 9:45 PM
|Gates G01 or a table outside in the hallway
|(Skills demos) Tuesday 12-1pm
Ed Discussions: Find the 5412 discussion board here.
Assignments and graded materials: Find them in CMS.
What is this course about? Cloud Computing is an overarching term that covers modern computing infrastructures to support the web: browsers and web servers, as well as ways of building mobile clients, scalable web services, and very fast infrastructures for serving up content in geographically distributed systems that might include dozens of data centers and millions of computers. Everything we do is cloud-based or uses cloud solutions these days. CS5412 teaches you to use one of the main clouds (Azure), while also giving you transferable insights into the fundamentals of how these systems work. That deeper perspective will be equally useful when working with other major cloud platforms.
The cloud is a huge space within which there are many trends. Two that interest us in CS5412 involve real-time data streams that enter the cloud from sensors or other devices, or from mobile platforms that interact with 5G computing and connectivity hubs. We will look closely at the "data path" by which one connects a sensor to a cloud computing account, uploads data as new events occur, or continuously, then processes the data using various kinds of tools. Many data science courses teach you to upload existing data sets and then work with them; our course is more oriented towards event-by-event scenarios, where data needs to be processed as it is generated and may have to trigger some form of immediate ("real-time") reaction. This makes the course rather hands-on, with a fair amount of programming -- mostly in the form of small functions written in Python or other languages using cloud APIs, but sometimes involving larger coding tasks that you would carry out in your favorite programming language.
When we combine cloud intelligence with IoT and 5G applications, we end up with smart things: smart power grids, smart farms, smart homes, smart cities... In CS5412 projects, you'll pick a topic from one of these areas (or something related), and will create apps for those kinds of settings. Every student will do a project (either one of their own design or one we suggest) that involves prototyping some form of smart something.
Let me give an example. Suppose that you decide you want to do a project on a topic like building a cloud-based AI service to intelligently firewall some sort of organization like a hospital. You've found trace data on the web and decided to use this as a way to simulate a real hospital, and you plan to build smart policies aimed at avoiding accidental insecure transfer of private patient information.
This would be an awesome project topic, by the way, but on the hard side (and I don't know where you would find the trace file!). Assuming you do find a trace, it would be very suitable for someone who is also looking to get MEng project credit.
Anyhow, you have these log files with your trace data, copied from somewhere, and they contain 10M records (representing 10M messages seen by some actual firewall). Obviously, to "simulate" the real application, you'll need to replay this data, or at least some of it. So one part of your project would be to build a piece of software to read one message at a time from the trace, then send it to the Azure cloud. You happen to be a Python wizard, and our TAs tell you about something called FLASK, and you end up finding a sample program for uploading data from a Python program in FLASK into the cloud, but it demos a line-by-line scenario. You change it to send messages in whatever format the trace was using. The cloud responds by saying "block it" or "let it through" for each message.
You can see that this involves lots of programming: coding this Python program and reading data from the trace and figuring out how a FLASK container works, building one from the sample, modifying it, rebuilding it, testing it... and now you know how to simulate the world of your firewall.
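As a rough illustration of the replay step described above, here is a minimal sketch in Python that uses only the standard library. It is not the sample program the TAs would point you to. The endpoint URL, the JSON field names ("message" and "verdict"), and all function names are hypothetical, and a real Azure service would define its own URL and message format.

```python
import json
import urllib.request
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

# Hypothetical endpoint: replace with the URL of your own cloud service.
ENDPOINT = "http://localhost:5000/classify"


def post_message(record: str, url: str = ENDPOINT) -> str:
    """Send one trace record as JSON and return the service's verdict.

    Assumes (hypothetically) that the service replies with JSON such as
    {"verdict": "block"} or {"verdict": "allow"}.
    """
    body = json.dumps({"message": record}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))["verdict"]


def read_trace(lines: Iterable[str]) -> Iterator[str]:
    """Yield one non-empty record per line of the trace."""
    for line in lines:
        record = line.strip()
        if record:
            yield record


def replay(
    lines: Iterable[str],
    send: Callable[[str], str] = post_message,
    limit: Optional[int] = None,
) -> List[Tuple[str, str]]:
    """Replay the trace one message at a time and collect the verdicts.

    `limit` lets you replay only the first few records of a large trace.
    """
    verdicts = []
    for index, record in enumerate(read_trace(lines)):
        if limit is not None and index >= limit:
            break
        verdicts.append((record, send(record)))
    return verdicts
```

To replay the first thousand records of a trace file (the file name is also just an example), you could open the file and call `replay(trace_file, limit=1000)`. Passing the sender as a parameter means you can test the replay logic with a stand-in function before any cloud service exists.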
Now, was it important that you did this in Python/FLASK?
Not at all! It could be in C with gRPC,
fine. But the thing is... our CS5412 TAs won't be
teaching you any of those things. We do teach you how
it has to work, but we need you to implement this code.
What about those demos of "how to do such-and-such on Azure"?
Microsoft has a huge number of them, in open source, with
excellent documentation. If you have the basic
skills, they bridge you to the specific ways of using them in the cloud.
And with this you can be insanely productive in the modern cloud!
You write a few lines of code and voila! An ML model
does message classification for you, one you trained using a few cloud
commands (and a trace -- with no data to train on,
this particular project wouldn't be very feasible).
So with those basic skills, you will do great in this class.
Wait List. In fall 2022, most students will initially need to go onto a wait list. Then, during the add/drop period, we will send you an enrollment PIN you can use to finalize your enrollment in the class. There is no capacity limit in fall 2022, but we do take the background aspect seriously, and some people may be asked to drop the course if they enroll but in fact lack the required background.
Attending class. Many studies show that watching a class on videos from home is not effective. Please attend in person, then use the videos as a catch-up aid. Don't assume that you can skip class and do just as well working from home.
Videos of lectures. Ken will post video recordings of all lectures.
Exams. Grading is 50% exams, 50% project. The current plan is that exams will be in person, one prelim and one final. Dates and location to be announced, but we might use recitation slots as exam slots if the University assigns us really weird dates and rooms. The exams focus on topics we covered in class and we will provide old exams that you can use as study and practice materials. Exams will switch to being at-home if the COVID situation makes an in-person test unwise.
Projects. Small homeworks and your project add up to 50% of your course grade. Read the Project Options web page to learn more.
MEng Projects. Some students expand their CS5412 project into an MEng project. See the separate web page about CS5999 and the FAQ. It is important to understand that an expanded project requires much more effort than a regular project -- you should schedule your semester to include six or eight hours per week of additional effort, for which you will get CS5999 credits (your grade in the main course and in CS5999 will be identical). To do this, you will need to submit a CS5412 MEng project plan that details the extra effort you plan to invest, and we will be meeting with you from time to time and tracking progress on the project (including those extra aspects that justified it being counted as an MEng project) throughout the semester. In your final written report, which is required, you will need to document both what you accomplished and your personal investment of effort, by telling us precisely which parts of the solution you personally created.
|Syllabus for Spring 2022
|Prelim study guide
|Cloud computing accounts
|TextBooks (not required)
|EdStem Discussion Site