CS5412: Foundations of Cloud Computing (Spring 2016)

Final Exam: Thursday May 19, Hollister Hall B14, 7-9pm.  Open book, open notes including course web pages.

Prof. Ken Birman, 435 Gates Hall, x5-9199


Syllabus, Slides, other Course materials: click here.   Project info: here

What is this course about?  Cloud Computing is an overarching term that covers modern computing infrastructures to support the web: browsers and web servers, but also ways of building mobile clients, scalable web services, and very fast infrastructures for serving up content in geographically distributed systems that might include dozens of data centers and millions of computers.

CS5412 tackles this very large topic in a way that strikes a balance, looking at the large-scale architecture of modern cloud computing systems while simultaneously exploring theoretical underpinnings, limitations, and ways of guaranteeing the quality of cloud-hosted applications.   The projects expose students to some examples of practical cloud computing tools and solutions.  

Who should take this class?  CS5412 is aimed at MEng students who are curious about just how far cloud computing can be taken, who want deeper insight into the cloud, and who like material that balances theory and practice.   The course is for students who are already strong programmers with plenty of experience writing systems software, and who want to take the next step and start to use cloud computing technologies.  This is not a suitable course for programmers who are just getting started and who lack that kind of experience.

What background is required?  CS5412 assumes that you have a solid background in systems programming, networking and operating systems, and have taken CS architecture and theory classes.  You don't need to have taken them at Cornell, but you should have this background.

CS5412 includes a project that has a single basic cloud computing core that everyone will need to implement, and then a customizable "front end" that can be specialized in lots of ways, and where we will expect you to decide what you want to do.  We recommend that you do the project in a group of 3 or 4 people.  We find that small groups get more work done, they generally seem to experience less stress and have a better time because team members can help each other out, and the front-ends are often more interesting.  A few times we have seen student groups stick together after the semester and go on to launch successful companies!

People who know absolutely nobody at Cornell can link up using Piazza, which has a team-member search feature.  People who hate to work in groups can talk to us and in special cases like that, we might allow individual projects.

It is important to understand that the class won't teach background skills, and we won't even be teaching you the detailed API for using the cloud programming tools that arise in our projects.  Instead, we expect you to be a self-learner who can teach yourself what you need to know, filling in gaps in your skill set.  For example, perhaps you already know C++ well, but are unfamiliar with the C++/CLI system that Microsoft uses on .NET.  Rather than us teaching this to you, we would expect you to teach yourself about the "hat" operator in C++/CLI and thus be able to use C++/CLI with .NET.  There is a great deal of online documentation and even instructional videos for each of the technologies you would need to use in your projects.  Our TAs and instructor are always available to give pointers, but you will need to do a fair amount of reading and out-of-class work to complete the projects and even to really learn material covered in class in a deep way.

Newsgroup for posting questions: We'll be using Piazza.

More complete synopsis: Not many people realize it, but the area known as cloud computing was actually invented here at Cornell!  Or at least partly invented here.  We were early leaders in building larger and larger clusters and management tools (for example, Cornell software plays key roles in Microsoft's "Longhorn" cluster management tools, and in Oracle's network management tools for their database products, and was the basis for the event notification layer that was used in IBM WebSphere for ten years, to name just a few examples).  Then when Werner Vogels, who was a researcher here, joined Amazon as its new CTO in 2006, exactly the period when that company invented modern cloud computing, the area went crazy!  If you want to learn about cloud computing in 2016, this is the place to study that topic.

Cloud computing has become a major commercial area for Internet product development and activity, so it can be surprising to realize that the term itself has many uses.  We use the cloud whenever we search the web, post a photo to Facebook, or use the mobile version of Google Maps for driving directions.  Cloud computing enables a new kind of computation in which staggering amounts of data can be culled from sensors world-wide and then employed as the basis for problem solving in new styles that need to also be massively parallel, since the data ends up spread over large numbers of machines, with no single machine having more than a "shard" of the big picture.   Many users think of the cloud as the ultimate "rent-a-machine" computing solution: as many virtual computing nodes as you might care to pay for, on demand.  Finally, cloud computing evokes a new kind of social phenomenon, namely the penetration of computing systems into society at every level, and a diversity of privacy, security and even legal issues tied to those developments.

CS5412 will focus on the technology of the cloud.  This is a narrow focus: entire courses could be constructed around such topics as the social and legal impact of cloud computing.  Indeed, we'll be even narrower: because CS5320 explores database and information management aspects of the cloud, we avoid overlap with that class.  Accordingly, in CS5412 we'll concentrate primarily on the architecture of today's cloud computing client systems, the evolution of the Internet to support the cloud, the architecture of modern cloud data centers, and the technologies used within them.

Of particular interest in spring 2016 will be the first and second tier cloud components that focus on elasticity, amazing scalability, instant responsiveness and the form of security needed to support web transactions such as credit-card purchases.  By the end of the course, participants will have not just a deep picture of how all this works, but also a sense of the underlying theory, the current set of research and engineering challenges, and even some cloud computing technology options that work in the lab but have yet to enter wide practice in the field.

Prerequisites:  As mentioned above, CS5412 really does require a sound background in operating systems and computer architecture: CS3410 and CS4410.  It is not necessary to have taken the Cornell versions of those courses, but it is important to have the required background.  We do not require any direct hands-on exposure to cloud computing, but all students should be proficient in a programming language such as C#, C++, Java or Python as used on an operating system like Windows or Linux.  You'll be expected to write a fair amount of code in the homework assignments, and the last one (as of today) can't be done in Java, although we may be able to support Java by the time that assignment begins.  Thus you may actually need to teach yourself another language, if Java is the only one you know.  (If so, we recommend C#, which is very similar to Java in almost all ways.)

Academic Integrity: We also expect all of our students to be personally familiar with Cornell's academic integrity policies. We take academic integrity very seriously at Cornell, especially in this class.  So come to class, do your share of the work in your group, and don't cheat! 

Level of the Course:  CS5412 is aimed at students in the CS or ECE MEng program, CS undergraduates who have taken the required 3xxx and 4xxx courses and done well in them, and PhD students.  We should caution that because the course wasn't really developed as a research course for PhD students, the coverage of material is much more practical in emphasis than in a typical Cornell PhD course, where we would dive deeper into the theory and read current research publications.  On the other hand, Professor Birman is an active researcher in this area, and he couldn't teach a shallow cloud computing "engineering" course if he tried.  So even our non-PhD attendees may get a bigger exposure to research topics in the area than they would normally expect, and PhD students should feel welcome!

Grading:  Grading will be based on two basic kinds of information: in-class exams (we have a prelim and a final), and the project grade.  The prelim, the final, and the project each count for 1/3 of the points that go towards your final grade.  We compute a score, rank the students ("curve" the grades), and then assign letter grades to match the standard distribution of letter grades in the past ten years or so of the course.

Generally, a bit more than half the class receives grades in the A+/A/A- range, and most of the rest of the class will get grades in the B+/B/B- range.  The only way to obtain a top letter grade is to show good performance on both projects and exams.  Students with very poor exam grades, or very poor project grades, are at risk of B- or even C+ grades even if they did well on the other kind of material.  We rarely go below C+ but if a student does very badly on everything, it can happen.

The project grades are assigned to the entire team.  Everyone on the team gets the identical grade.  However, we do allow a team to reorganize if someone isn't doing their part of the work.  This is pretty rare, but it does happen.  More often, teams evolve and different people just play different roles that fit their skill set, like in a company.  We see this as a good thing, and we don't think that every person should do the identical things in every project team.  You should not automatically throw someone out of your team just because they can't do some particular very hard task. Instead, try and ask how each person can be as useful as possible, given their talents and interests.

Tools: We'll be providing a variety of computing tools.  In the final assignment students will use the Isis2 system.  Isis2 can be employed from C#, C++, Visual Basic, IronPython and many other languages, although documentation is still under development for some of the less common options.  You can work on Windows or Linux, and will have access to real cloud resources as appropriate to your project.

Academic Integrity:  When we feel that it is appropriate to do so, we use a fancy form of web search to compare your solutions against the others from the class, from previous years, and from the web.  Please make sure your work is your own.

MEng project option:  Cornell MEng students need 3 credits of CS5999.  There is a way to count your project to satisfy this requirement, but we expect a really great project in this case, since you are signing up for extra credits and those correspond to working extra hard.  Your CS5999 grade will be the same project grade we assign for CS5412 (the prelim and final exam won't count towards the CS5999 grade).  But you can't do the project this way unless you are enrolled for a grade in CS5412 too.  See the projects page for details.

Textbook:  We are using Ken's textbook, in the Springer Verlag "Guide to..." series:  Reliable Distributed Systems: Building High-Assurance Applications and Cloud-Hosted Services.  You can purchase this online, or in the campus store, or at the bookstore in CollegeTown.  It is also on reserve in the library as an eBook, accessible from your laptop or reading device.   Ownership of the book is not required.  We may ask you to do some readings from the book, though, so if you don't plan to own it, do make sure you can access a copy. 

Exam questions track what we cover in class, but people commonly find that without reading the corresponding sections of the book they didn't understand an idea in depth, and then have trouble answering questions.  So we do recommend that when studying for exams, you consult the book to be sure you have a deep insight into each topic.  There are questions you can use as self-quiz study help right in the book itself.


TA Support:  We have four people on our TA staff this spring.  None of them are full time.  The first two people took the course in Spring 2015 and are the best "go to" resources for project help.  The second two TAs are in charge of the Wednesday recitation and, since that takes more time, have fewer office hours.

Office hours
Ken Birman: Tues 1:00-2:00, Gates 435; Thu 1:00-2:00, Gates 435
Nathan Spallone: Tues 12:00-1:00, Rhodes 572; Thu 10:00-11:00, Rhodes 572
Mayur Patel: Mon 11:00-12:00, Gates G13; Fri 10:00-11:00, Gates G15
Theo Gkountouvas (1/3 TA): by appointment
Zhiyuan Teo (1/3 TA): by appointment
