Topics in Cloud Computing
Prof. Ken Birman, 435 Gates Hall, x5-9199.
Ken's office hours: After each lecture, we will keep the Zoom link open and you are welcome to hang around and ask questions as long as needed. But because of the online mode of instruction, students keep very irregular hours, so posting questions on our discussion board is often easiest. Ken is also happy to do online office hours by appointment. Just email and we can find a time! Hopefully, we'll be back to normal in-office office hours before the end of the semester.
TA Office hours (online):
Yifan: Monday and Friday, 5PM - 6PM
Ed Discussions: Instead of Piazza we are using Ed Discussions (much better!). Log in to access it at CS5412 online discussions.
Assignments and graded materials: Find them in CMS.
What is this course about? Cloud Computing is an overarching term that covers modern computing infrastructures to support the web: browsers and web servers, as well as ways of building mobile clients, scalable web services, and very fast infrastructures for serving up content in geographically distributed systems that might include dozens of data centers and millions of computers. Everything we do is cloud-based or uses cloud solutions these days. CS5412 teaches you to use one of the main clouds (Azure), while also learning transferable insights about the fundamentals of how these systems work. That deeper perspective will be equally useful when working with other major cloud platforms.
The cloud is a huge space within which there are many trends. Two that interest us in CS5412 involve real-time data streams that enter the cloud from sensors or other devices, or from mobile platforms that interact with 5G computing and connectivity hubs. We will look closely at the "data path" by which one connects a sensor to a cloud computing account, uploads data as new events occur, or continuously, then processes the data using various kinds of tools. Many data science courses teach you to upload existing data sets and then work with them; our course is more oriented towards event-by-event scenarios, where data needs to be processed as it is generated and may have to trigger some form of immediate ("real-time") reaction. This makes the course rather hands-on, with a fair amount of programming -- mostly in the form of small functions written in Python or other languages using cloud APIs, but sometimes involving larger coding tasks that you would carry out in your favorite programming language.
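To make the event-by-event idea concrete, here is a minimal sketch in plain Python of the kind of small function described above. It is an illustration only: the SensorEvent type, the process_stream function, the barn sensor names and the 35-degree threshold are all hypothetical, and a real project would receive events through a cloud API rather than from an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class SensorEvent:
    """One reading from a (hypothetical) temperature sensor."""
    sensor_id: str
    timestamp: float       # seconds since the start of the run
    temperature_c: float


def process_stream(
    events: Iterable[SensorEvent],
    threshold_c: float,
    on_alert: Callable[[SensorEvent], None],
) -> int:
    """Handle events one at a time, reacting as soon as a reading is too hot.

    Returns the number of events handled.
    """
    handled = 0
    for event in events:
        handled += 1
        # The "real-time" reaction: act on this event immediately,
        # without waiting for the rest of the data to arrive.
        if event.temperature_c > threshold_c:
            on_alert(event)
    return handled


# Stand-in for a live stream; in practice events would arrive over time.
readings = [
    SensorEvent("barn-1", 0.0, 21.5),
    SensorEvent("barn-1", 1.0, 39.2),
    SensorEvent("barn-2", 1.5, 22.0),
]

alerts: List[SensorEvent] = []
handled = process_stream(iter(readings), threshold_c=35.0, on_alert=alerts.append)
print(f"handled {handled} events, raised {len(alerts)} alert(s)")
```

The contrast with a batch-oriented workflow is that the function never needs the whole data set: it consumes any iterable, so the same logic could in principle be driven by a live feed.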
When we combine cloud intelligence with IoT and 5G applications, we end up with smart things: smart power grids, smart farms, smart homes, smart cities... In CS5412 projects, you'll pick a topic from one of these areas (or something related), and will create apps for those kinds of settings. Every student will do a project (either on their own or one we suggest) involving prototyping some form of smart something.
Background. To take CS5412 you must show us that you have the needed background from other courses, or from equivalent real programming experience. CS4414 (Systems Programming) is an ideal source of background for CS5412. Other useful courses can include Operating Systems, Databases, Networking, Distributed Systems and Big Data Analytics. You should be able to work with your favorite programming language (Python is probably the most popular, but anything will work). We will be working with the Azure cloud, because Microsoft is offering us all sorts of help getting accounts, learning to use their cloud, and even sponsoring a "hackathon" where you can acquire cloud skills and win big prizes (seriously!) over a weekend with their experts guiding you on the technology. This is a unique way to learn by doing, and we strongly encourage our course members to participate.
Wait List. In Spring 2021, most students will initially need to go onto a wait list. As part of this you will fill out a small survey (a "Google Form") explaining how you got your background. Then, during the add/drop period, we will review your form and send you an enrollment PIN. With this, you can join the class. There is no capacity limit in Spring 2021, but we do take the background aspect seriously and some people may be refused due to a lack of preparation.
Exams. We will have take-home quizzes a few times during the semester. They add up to 25% of your course grade. These are released from CMS, and then must be completed within a fixed amount of time (typically two hours) after you download them. Quizzes are open book and open notes, but you must not get any form of help, and you are not permitted to post materials (e.g. on Chegg or Course Hero) that could help anyone else.
Extra Credit: Each semester we offer a few ways to get some extra credit that can boost your grade a little. In Spring 2021 these include participating in the CIDA Hackathon and participating in the BOOM project fair. In both cases you have to be there to the very end to get credit.
Projects. Small homeworks and your project add up to 75% of your course grade.
The unifying theme is that all projects must have some form of automated intake of live data, all must be designed to leverage the highly scalable platform architecture of the cloud (mostly, the Azure cloud, although we don't mind AWS projects if you want to use AWS and deal with getting the needed academic accounts on your own), and all should have a mix of data management and computation with prebuilt AI/ML tools or big-data tools. The projects that get the top grades should continuously capture streams of IoT data when you demo them to us. Weaker projects that upload a file or data from a smartphone on command will get lower grades, and projects that don't upload data in an automated, software-controlled way will get low grades -- this is not an AI or big-data class, and we need to see that your project is using the concepts we learned in class.
Every project will need to include evaluation of performance, scalability and fault-tolerance. At a minimum this would mean testing and measuring the end-user experience as the number of users increases or when things are disrupted by a fault. The most exciting projects often introduce extra services into Azure, such as smart code running on Cornell's Cascade server. If you do this, you'll also test that your smart service is fault-tolerant and scalable.
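As a rough illustration of what measuring the end-user experience as the number of users increases could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: fake_request is a hypothetical stand-in that just sleeps for 5 ms, and the user counts (1, 4, 16) are arbitrary. In a real evaluation you would replace fake_request with a call to your own service.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def fake_request() -> None:
    """Hypothetical stand-in for one call to the service under test."""
    time.sleep(0.005)


def timed_call(request: Callable[[], None]) -> float:
    """Return the wall-clock latency of a single request, in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def measure(
    request: Callable[[], None], users: int, requests_per_user: int = 5
) -> Tuple[float, float]:
    """Run `users` concurrent clients; return (median, worst) latency in seconds."""

    def client(_: int) -> List[float]:
        # Each simulated user issues its requests back to back.
        return [timed_call(request) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_client = list(pool.map(client, range(users)))

    latencies = [t for batch in per_client for t in batch]
    return statistics.median(latencies), max(latencies)


# Repeat the measurement at increasing load levels.
results: Dict[int, Tuple[float, float]] = {
    users: measure(fake_request, users) for users in (1, 4, 16)
}
for users, (median, worst) in results.items():
    print(f"{users:3d} users: median {median * 1000:.1f} ms, worst {worst * 1000:.1f} ms")
```

Because the stand-in only sleeps, latency here will stay roughly flat as users are added; against a real service, the interesting result is where the median and worst-case latency start to climb. A fault-tolerance test could follow the same pattern, repeating the measurement while a component is deliberately stopped or restarted.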
Some students will work on projects recommended by us. A few are digital agriculture projects coming from Cornell's "smart dairy" classes, and others involve working with some Cornell technology that we are adding to Azure and would like some help on. The dairy projects have associated data sets (they mix streaming data from a farm with AI models trained on training data sets). Azure itself has extensive resources that can combine with these, such as maps, weather predictions, etc.
Other students prefer to have a project of a more self-defined kind. For example, entrepreneurial students sometimes use CS5412 as a setting in which to develop a cloud-hosted prototype for their future startup. As long as the application involves some form of live data that would flow in at scale from lots of users, we are open to seeing ideas like this.
Most students work in groups of two or three CS5412 people, plus perhaps collaborators from outside CS5412, especially in the case of dairy (smart farming) projects. A few work by themselves. We do not approve groups of size larger than three at the start of the semester.
It is important to understand that an expanded project requires much more effort than a regular project -- you should schedule your semester to include six or eight hours per week of additional effort, for which you will get CS5999 credits (your grade in the main course and in CS5999 will be identical). To do this, you will need to submit a CS5412 MEng project plan that details the extra effort you plan to invest, and we will be meeting with you from time to time and tracking progress on the project (including those extra aspects that justified it being counted as an MEng project) throughout the semester. In your final written report, which is required, you will need to document both what you accomplished and also your personal investment of effort, by telling us precisely which parts of the solution you personally created.
Attending lectures. You are required to attend every lecture and recitation (you can attend in real-time, or can stream the recording asynchronously). Quiz questions are often based on material covered in lecture. Also, when we grade your project, you will get a grade that depends in part on how well you turned what we teach in class into elements of your project solution. These aspects quickly add up; in the past, people who didn't attend lectures often got grades in the C's, whereas students who attended lecture generally got grades in the A's and B's.