CS5412: Foundations of Cloud Computing (Spring 2014)
Lectures Tuesday/Thursday 2:55pm - 4:10pm, Hollister Hall B14
Recitation/Discussion Section (starts January 28): W 7:30pm - 8:20pm, Bill and Melinda Gates Hall G01
Prof. Ken Birman, 435 Gates Hall, x5-9199
What is this course about? Cloud Computing is an overarching term that covers modern computing infrastructures to support the web: browsers and web servers, but also ways of building mobile clients, scalable web services, and very fast infrastructures for serving up content in geographically distributed systems that might include dozens of data centers and millions of computers.
CS5412 tackles this very large topic in a way that strikes a balance, looking at the large-scale architecture of modern cloud computing systems while simultaneously exploring theoretical underpinnings, limitations, and ways of guaranteeing the quality of cloud-hosted applications. The projects expose students to some examples of practical cloud computing tools and solutions.
Who should take this class? CS5412 is aimed at MEng students who are curious about just how far cloud computing can be taken, who want deeper insight into the cloud, and who like material that balances theory and practice. The course is for students who are already strong programmers with plenty of experience writing systems software, and who want to take the next step and start to use cloud computing technologies. This is not a suitable course for programmers who are just getting started and who lack that kind of experience.
What background is required? CS5412 assumes that you have a solid background doing systems programming, networking and operating systems, and have taken CS architecture and theory classes. You don't need to have taken them at Cornell, but you should have this background.
CS5412 includes several projects that require prior knowledge of the C# programming language, Python, C++, F# or Ruby. Some projects can only be done using versions of these languages that are supported on Microsoft's .NET framework, either on a Windows machine using Visual Studio, or on a Linux machine using Mono.
It is important to understand that the class won't teach these background skills, or the detailed API for using the cloud programming tools that arise in our projects. Instead, we expect you to be a self-learner who can teach yourself what you need to know, filling in gaps in your skill set. For example, perhaps you already know C++ well, but are unfamiliar with the C++/CLI system that Microsoft uses on .NET. Rather than us teaching this to you, we would expect you to teach yourself about the "hat" operator in C++/CLI and thus be able to use C++/CLI with .NET. There is a great deal of online documentation and even instructional videos for each of the technologies you would need to use in your projects. Our TAs and instructor are always available to give pointers, but you will need to do a fair amount of reading and out-of-class work to complete the projects and even to really learn material covered in class in a deep way.
Newsgroup for posting questions: We'll be using Piazza.
More complete synopsis: Not many people realize it, but the area known as cloud computing was actually invented here at Cornell! We were early leaders in building larger and larger clusters and management tools (for example, Cornell software plays key roles in Microsoft's "Longhorn" cluster management tools, and in Oracle's network management tools for their database products, and was the basis for the event notification layer that was used in IBM WebSphere for ten years, to name just a few examples). Then when Werner Vogels, who was a researcher here, joined Amazon as its new CTO in 2006, he invented modern cloud computing and the area went crazy! If you want to learn about cloud computing in 2015, this is the place to study that topic.
Cloud computing has become a major commercial area for Internet product development and activity, so it can be surprising to realize that the term itself has many uses. We use the cloud whenever we search the web, post a photo to Facebook, or use the mobile version of Google Maps for driving directions. Cloud computing enables a new kind of computation in which staggering amounts of data can be culled from sensors world-wide and then employed as the basis for problem solving in new styles that also need to be massively parallel, since the data ends up spread over large numbers of machines, with no single machine having more than a "shard" of the big picture. Many users think of the cloud as the ultimate "rent-a-machine" computing solution: as many virtual computing nodes as you might care to pay for, on demand. Finally, cloud computing evokes a new kind of social phenomenon, namely the penetration of computing systems into society at every level, and a diversity of privacy, security and even legal issues tied to those developments.
CS5412 will focus on the technology of the cloud. This is a narrow focus: entire courses could be constructed around such topics as the social and legal impact of cloud computing. Indeed, we'll be even more narrow: because CS5320 explores database and information management aspects of the cloud, we avoid overlap with that class. Accordingly, in CS5412 we'll concentrate primarily on the architecture of today's cloud computing client systems, the evolution of the Internet to support the cloud, the architecture of modern cloud data centers, and the technologies used within them.
Of particular interest in spring 2014 will be the first and second tier cloud components that focus on elasticity, amazing scalability, instant responsiveness and the form of security needed to support web transactions such as credit-card purchases. By the end of the course, participants will have not just a deep picture of how all this works, but also a sense of the underlying theory, the current set of research and engineering challenges, and even some cloud computing technology options that work in the lab but have yet to enter wide practice in the field.
Prerequisites: As mentioned above, CS5412 really does require a sound background in operating systems and computer architecture: CS3410 and CS4410. It is not necessary to have taken the Cornell versions of those courses, but it is important to have the required background. We do not require any direct hands-on exposure to cloud computing, but all students should be proficient in a programming language such as C#, C++, Java or Python as used on an operating system like Windows or Linux. You'll be expected to write a fair amount of code in the homework assignments, and the last one (as of today) can't be done in Java, although we may be able to support Java by the time that assignment begins. Thus you may actually need to teach yourself another language if Java is the only one you know. (If so, we recommend C#, which is very similar to Java in almost all ways.)
We also expect all of our students to have reviewed Cornell's academic integrity policies. We take academic integrity very seriously at Cornell, especially in this class. So come to class, do your own work, and don't cheat!
Level of the Course: CS5412 is aimed at students in the CS or ECE MEng program, CS undergraduates who have taken the required 3xxx and 4xxx courses and done well in them, and PhD students. We should caution that because the course wasn't really developed as a research course for PhD students, the coverage of material is much more practical in emphasis than in a typical Cornell PhD course, where we would dive deeper into the theory and read current research publications. On the other hand, Professor Birman is an active researcher in this area, and he couldn't teach a shallow cloud computing "engineering" course if he tried. So even our non-PhD attendees may get a bigger exposure to research topics in the area than they would normally expect, and PhD students should feel welcome!
Grading: Grading will be based on a number of elements, with the relative weighting decided by Professor Birman and his TAs at the end of the semester. Components include homework assignments, exams and an integrative project. Usually the prelim and final are about 50% of the grade, and the projects are the other 50%. The exams tend to be based mostly on material covered in class, and we do provide slide sets and other study materials to help you prepare.
Generally, a bit more than half the class receives grades in the A+/A/A- range, and most of the rest of the class will get grades in the B+/B/B- range. However, strong grades require good performance on both projects and exams. Students with very poor exam grades, or very poor project grades, are at risk of C+ or lower letter grades even if they did well on the other kind of material.
Tools: We'll be providing a variety of computing tools. In the final assignment students will use the Isis2 system. Isis2 can be employed from C#, C++, Visual Basic, IronPython and many other languages, although documentation is still under development for some of the less common options. You can work on Windows or Linux, and will have access to real cloud resources as appropriate to your project.
Academic Integrity: We use a fancy form of web search to compare your solutions against the others from the class, from previous years, and from the web. Please make sure your work is your own.
MEng project option: Cornell MEng students need 3 credits of CS5999. There is a way to do a project in addition to the homeworks to satisfy this requirement. See the projects page for details.
Textbook: We are using Ken's textbook, in the Springer Verlag "Guide to..." series: Reliable Distributed Systems: Building High-Assurance Applications and Cloud-Hosted Services. You can purchase this online, or in the campus store, or at the bookstore in CollegeTown. However, the book is not required and we've put copies on reserve in the library for people who prefer not to own their own copy. We may ask you to do some readings from the book, though, so if you don't plan to own it, do make sure you can access a copy. We may also ask you to read some published papers from the cloud computing literature. If we do end up having in-class quizzes, they would often be based on those assigned readings or on things covered in class on Ken's slides, copies of which are online in case you need to revisit them when trying to understand things better.
TA Support: We have one full-time TA and three other (very) part-time TAs.