CS6465: Emerging Cloud Technologies and Systems Challenges
Room TBA, Tuesday/Thursday 1:25-2:40
CS6465 is a PhD-level class in systems that tracks emerging cloud computing technology, opportunities and challenges. It is unusual among CS graduate classes: the course is aimed at a small group of students, uses a discussion-oriented style, and the main "topic" is actually an unsolved problem in computer systems. The intent is to think about how one might reduce that open problem to subproblems, learn about prior work on those, and extract exciting research questions. The PhD focus centers on that last agenda element.
CS6465 is open to anyone in the CS PhD program; all other students need the instructor's explicit permission. To get that permission you need to convince Ken that
Now, in fact, this PhD thing is interpreted in a very flexible way. Historically we have often had CS undergraduates in the class: students who were already solidly on a track leading to PhD research in graduate school. But in contrast, we have almost never had MEng students in this class: CS5412 in the spring is a much better choice for an MEng student, because MEng is just not a research track.
We welcome PhD students from other programs, such as ECE or CAM, as long as they have a background similar to that of the CS PhDs. But you really do need the background if you want to attend a course like this, so be realistic about the importance of having a solid grounding in computer systems before attending this particular class. You won't hate the experience, but you just won't be able to contribute in a meaningful way without it.
How does the class work?
CS6465 was introduced in 2018. In this third offering, the focus will be on issues raised by moving machine learning to the edge of the cloud. In this phrasing, edge computing still occurs within the data center, but for reasons of rapid response, involves smart functionality close to the client, under time pressure. So you would think of an AI or ML algorithm written in as standard a way as possible (perhaps TensorFlow, or Spark/Databricks using Hadoop, etc.). But whereas normally that sort of code runs deep in the cloud, many minutes or hours after the data is acquired, the goal now is to keep the code unchanged (or minimally changed) and be able to run it on the stream of data as it flows into the system, milliseconds after it was acquired. We might also push aspects of machine-learned behavior right out to the sensors.
This idea is a big new thing in cloud settings -- they call it "edge" computing or "intelligent" real-time behavior. But today edge computing often requires totally different programming styles than back-end computing. Our angle in CS6465 is really to try to understand why this is so: could we more or less "migrate" code from the back-end to the edge? What edge functionality would this require? Or is there some inherent reason that the techniques used in the back-end platforms simply can't be used in the edge, even with some sort of smart tool trying to help?
The goal of this focus on an intelligent edge is, of course, to motivate research on the topic. Ken is a systems person, and his group is thinking about how to build new infrastructure tools for the intelligent edge. Those tools could be the basis of great research papers and might have real impact. But others work in this area too, and we'll want to read papers they have written.
Gaps can arise at other layers too. For example, TensorFlow is hugely popular at Google in the AI/ML areas, and Spark/Databricks plus Hadoop (plus Kafka, Hive, HBase, ZooKeeper, not to mention MATLAB, SciPy, GraphLab, Pregel, and a gazillion other tools) are insanely widely used. So if we assume that someone is a wizard at solving AI/ML problems using this standard infrastructure, but now wants parts of their code to work on an intelligent edge, what exactly would be needed to make that possible? Perhaps we would need some new knowledge representation, or at least some new way of storing knowledge, indexing it, and searching for it. This would then point to opportunities for research at the AI/ML level as well as opportunities in databases or systems to support those new models of computing.
CS6465 runs as a mix of discussions and short mini-lectures (mostly by Ken), with some small take-home topics that might require a little bit of out-of-class research, thinking and writing. There won't be a required project or any exams, and the amount of written material required will be small, perhaps a few pages to hand in per week. Grading will mostly be based on in-class participation.
CS6465 can satisfy the same CS graduate requirements (in the systems area) as any other CS6xxx course we offer. Pick the course closest to your interests, no matter what you may have heard. CS6410 has no special status.
Schedule and Readings/Slides (Fall 2018 version; we will be updating the specific list of papers for Fall 2019, but it won't change dramatically)