How the Projects Work

We want you to work on your project through the entire semester (some people even continue beyond the end of the semester, for extra credit), so starting on time is important.  All our recommended projects are based on cloud support for intelligent agriculture (smart farming) ideas.  We would consider allowing a few teams to work on topics not in our list, but we expect that almost all the teams will actually do projects from the list.

Projects are team based.  Although there are always one or two individuals who end up doing projects on their own, CS5412 really encourages teamwork.  Most teams will include 2 or 3 students from CS5412, and many teams will have one additional collaborator from the Cornell College of Agriculture and Life Sciences -- someone specialized in a farming topic, who can help the team do something real and valuable and "valid" in the sense of making choices that are realistic.  The farming expert might be a student in CALS, or a researcher in that unit, or even a high-tech farmer with real needs on a real farm, or a real dairy, or a real winery.  We will introduce you to these potential collaborators.

It isn't clear that we can find enough collaborators from CALS for every single team.  Accordingly, we have a backup plan: Projects that don't end up with a collaborator from the farming side will still be able to access real data from a genuine research topic of importance in CALS, but might not be able to innovate quite as much.

Everyone should select a project during the first two weeks of the semester, and we will ask you to upload a plan at the end of that period, including (1) team members, if you work with others, (2) the specific project you will do, (3) your timetable for starting to show some hands-on experience with the key elements, (4) the split of tasks within your team.

Plan to spend 4-6 hours per person, per week, on these projects. This is in lieu of other homework, and there is no final, so the workload is actually pretty much average for a Cornell class, provided that you start on time.

MEng (CS5999) Option

Some CS5412 students have historically taken an additional semester of CS5999 credits with Professor Birman, towards their required MEng project credit.  The CS5999 rule is that for each credit you take, you must put 2 hours per week in on the effort, and this adds to what you would have done for CS5412 in the first place, so it is a substantial extra effort and we expect to see a substantial "delta" to justify this.

Details are discussed in the recitations (and the first lecture in the main class), but the idea in a nutshell is that you select a project for CS5412 that is more ambitious than what you might normally have done.  If you do a project with other students, they would all be doing CS5999 for the same number of credits as you: everyone puts in equal effort, which means that a 2 or 3 person effort should be a pretty significant system.  Some people have prototyped startups this way!  A typical big example can be found by hunting for the news releases for "Remember Me", an idea for a startup tied to CS5412 a few years ago.

We generally expect that CS5999 projects would be unique and original ideas, not just something we suggest, although there are sometimes hard open tasks in Ken's research project that a CS5999 project can tackle.  Even if we did provide an idea, the breakdown of how to tackle it is your job, and we expect it done well!

At each step of the project effort we will be checking that your vision really justifies the extra credits, and that you really are doing several hours per week more than if you weren't signed up for CS5999 too.  Very often, the demos of these projects occur after the normal demo day, to give a bit more time.  But they need to be completed by the day Professor Birman hands in his grades (there is a deadline for him), or you would end up with an INC, which we prefer to avoid.

Project grading is just the same as normal grading for the class and reflects the prelim grade, not just the project demo.  This is because a project in CS5412 needs to be in part a proof that you mastered the ideas of cloud computing, and understand issues like cloud fault-tolerance, consistency, scalability, availability, existing frameworks, etc.  We test your knowledge of that in part on the prelim.

Slide Set On Projects

We recommend that you look at this Powerpoint (pptx) or PDF file for a reminder of how projects work.

Project Goals

Azure IoT

In Spring 2019, our main goal is for every student to gain familiarity with a real cloud-based IoT platform, which would be the Azure IoT platform in most cases.  Azure IoT can be programmed in any language you like (they support 40 main languages and a few additional experimental ones), and has precreated "recipes" that many students would want to consider using as a kind of template for building a secure, elastic, complete solution to the farming scenarios we'll be focused on.  Most people take existing Azure IoT micro-services, glue them together, and customize them by coding event handlers in a language like C#, C++, Scala, etc.  These handlers can be as short as just a few lines of code, and you usually create them by finding an example that seems close to what you need and then modifying it into the version you want for your project.

You'll end up coding mostly in what the Azure IoT people refer to as the "elastic Function Server" event-triggered layer.  This does involve coding, but the amount of coding isn't going to be huge: lots of little event handlers, a bit like building a GUI where the user right-clicks on the image of a rock, this causes an "event" that gets handed to your "rock" object, and then the handler displays an animation of the rock splitting and Gimli the Troll leaping out.  You probably wrote code like that in CS2110.  So now you'll use that same way of thinking to accept photos from a camera in a cow barn, for example, or wind-speed updates from a sensor in a field where a drone is flying.
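The event-handler style described above can be sketched in a few lines of Python.  This is only an illustration of the programming model, not the Azure IoT API: the EventBus class, the event names, the payload fields, and the 10 m/s wind threshold are all hypothetical, chosen to mirror the barn-camera and wind-sensor examples.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical in-process stand-in for an event-triggered function layer."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], str]]] = defaultdict(list)

    def on(self, event_type: str):
        """Decorator that registers a handler for one event type."""
        def register(handler: Callable[[dict], str]):
            self._handlers[event_type].append(handler)
            return handler
        return register

    def emit(self, event_type: str, payload: dict) -> List[str]:
        """Run every handler registered for event_type and collect the results."""
        return [handler(payload) for handler in self._handlers[event_type]]


bus = EventBus()


@bus.on("barn_camera.photo")  # hypothetical event name
def handle_photo(event: dict) -> str:
    # A real handler would hand the photo to a storage or classification service.
    return f"store photo {event['photo_id']} from camera {event['camera']}"


@bus.on("field_sensor.wind")  # hypothetical event name
def handle_wind(event: dict) -> str:
    # The 10 m/s cutoff is an assumed value, used only for illustration.
    action = "hold drone" if event["speed_mps"] > 10.0 else "ok to fly"
    return f"wind {event['speed_mps']} m/s: {action}"
```

Each handler is short and does one thing, which is the point: in a real deployment the platform, not your code, decides when and where each handler runs.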

Projects focused on this side of Azure IoT would probably have an emphasis on new hardware (they might include an ECE sensor student creating new sensors for tracking underground water and nutrient movement, for example), or on routing data into Azure's existing micro-services.  You would have to learn how those existing solutions work to use them, so you would become an expert Azure "meta-programmer" who puts together existing tools in new ways.

Derecho micro-services (new services that can run on Azure and be used from Azure IoT)

For people who want something a bit more hard-core, consider a project that would create a new micro-service, which could then be used in the Azure IoT ecosystem side by side with the ones from Microsoft.  For this you would work in C++ (or some language like Java or Python that can import a library in C++ and call its methods).  Cornell has actually been developing some software tools that we are contributing to Azure, namely our Derecho C++ library for building new IoT micro-services, so we will also have one group of projects focused on using Derecho in Azure IoT in this way, but that particular path would only make sense for serious builders with solid C++ experience.

Derecho is really best used from C++, but if you want to work from Java or Python or some other language, you'll take an Ubuntu container with Derecho pre-installed and write your code to "import" the corresponding library ("dll").  Once you do this, you can make remote calls to any Derecho methods that don't have what C++ calls "templated APIs", meaning APIs where you need to tell Derecho the types you are using.  Because Java and C++ don't have the same concept of types, calling a templated API directly from Java wouldn't work.  But you can "wire down" the Derecho handlers by creating statically typed versions that take (size_t, char*) arguments, meaning "pointers to buffers", and from Java you can pass in that sort of pointer (it is a special kind of reference called a "strong reference", and Java won't try to move or garbage collect the data while you are using it this way).  Derecho's object store would be the ideal subset of the system for this sort of thing, and you can use it in this specific way from Java.  Similar advice would apply for any language you like.  You'll end up doing a tiny bit of C++ coding to create these call-through APIs for the specific methods you need to use, but those methods will be just a few lines long, and mostly you would work in your favorite coding environment.

For example, suppose that your team includes a person who is an expert on face recognition and you want to support "face recognition for farm animals".  You could build a service that accepts requests from the Azure IoT function services.  In would come an event from a camera: "got a new photo".  The function servers don't run significant logic, but could pull the photo over and store it into your Derecho-based photo classification micro-service.  Next your group member who does photo classification gets to show off: she has a deep neural network trained to classify the type of animal (pig, cat, cow, goat, etc.), then in a second step to segment the photo (find the faces of the cows), and for each face to tag it (this is "Sunny", a very good milk producer, and her calf "Bingy", who often gets into trouble and comes in covered in mud...), and then return a meta-data object containing this information.
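The classify, segment, tag, and return steps can be sketched as a small pipeline.  Everything here is a stand-in: the three model functions are stubs rather than trained networks, the 4-byte "face crops" are a toy encoding, and the class and field names are hypothetical.  The sketch only shows the shape of the meta-data object that such a service might return.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FaceTag:
    name: str
    note: str


@dataclass
class PhotoMetadata:
    photo_id: str
    species: str
    faces: List[FaceTag] = field(default_factory=list)


def classify_species(photo: bytes) -> str:
    # Stub for the first model: a real classifier would inspect the pixels.
    return "cow"


def segment_faces(photo: bytes) -> List[bytes]:
    # Stub for segmentation: pretend each 4-byte chunk is one face crop.
    return [photo[i:i + 4] for i in range(0, len(photo), 4)]


# Toy lookup table standing in for a learned face-recognition model.
KNOWN_FACES = {
    b"SUNY": FaceTag("Sunny", "very good milk producer"),
    b"BNGY": FaceTag("Bingy", "often comes in covered in mud"),
}


def tag_face(crop: bytes) -> FaceTag:
    return KNOWN_FACES.get(crop, FaceTag("unknown", "no match"))


def process_photo(photo_id: str, photo: bytes) -> PhotoMetadata:
    """Classify the animal, find the faces, tag each one, return the meta-data."""
    species = classify_species(photo)
    faces = [tag_face(crop) for crop in segment_faces(photo)]
    return PhotoMetadata(photo_id, species, faces)
```

For instance, process_photo("p1", b"SUNYBNGY") yields a PhotoMetadata object whose faces are tagged "Sunny" and "Bingy".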

A professional product of this kind would go further and "route" Sunny to her milking stall, Bingy over to the showers and then into a stall for his vet to have a look at that scratch on his leg, and it might track Sunny's milk production and try to correlate that with other factors such as which feed she is on, whether she spent the day in the fields or in the barn, how active she was, how long she spent ruminating and what temperature it was, and so forth.  This would probably require more than one micro-service: think of each micro-service as a specialist focused on some subset of tasks: one to do face recognition and tagging, one to do a quick check for injuries that could need attention, one to check medical records to see if Bingy needs any vaccinations while the vet is dealing with that scratch, one to try and model milk production as a function of various variables we can track, etc.  A smart farm might also have specialized services to deal with extracting fertilizer and fresh water from runoff and perhaps generating gas or bio-oil by heating waste to very high pressures and temperatures for brief periods.  Farmers are hoping these kinds of ideas could lead to healthier herds with less use of antibiotics, much less pollution, and even generating revenue from spinoff products like the bio-oil (which is a very good basis for making diesel fuel).

Other kinds of farm tasks could involve monitoring fields using drones (a cool topic: optimizing them to fly during days with light winds by "sailing" on the wind, like a sailboat, to reduce battery energy consumed), classifying any problems (drought, damage from fungus or virus, insect damage), planning a remediation (localized irrigation, or spraying, or fertilizer), planning long term actions (maybe planting different seeds that are more drought resistant right at this one spot), etc.  So one could imagine a lot of these micro-services, running side by side, specialized in very different roles.

You need the micro-service model for such tasks because function handlers are stateless and normally only run a few lines of code in total: they don't create big files and lack a place to keep machine-learned information like "facebook for Old MacDonald's dairy".  So the split of functionality gives us function handlers that mostly route the tasks, and micro-services that do these sorts of tasks.
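The split between stateless handlers and stateful micro-services can be illustrated with a short sketch.  The service class, the event fields, and the milk-yield example are hypothetical, and a real micro-service would run as a separate process rather than as an in-memory object.

```python
from typing import Dict, List


class HerdRecordService:
    """Hypothetical stateful micro-service: keeps per-animal history."""

    def __init__(self) -> None:
        self._milk_liters: Dict[str, List[float]] = {}

    def record_milking(self, animal: str, liters: float) -> None:
        self._milk_liters.setdefault(animal, []).append(liters)

    def average_yield(self, animal: str) -> float:
        history = self._milk_liters.get(animal, [])
        return sum(history) / len(history) if history else 0.0


def on_milking_event(event: dict, service: HerdRecordService) -> None:
    # Stateless handler: it keeps nothing between calls and only routes
    # the event to the service that owns the state.
    service.record_milking(event["animal"], event["liters"])
```

Because the handler holds no state, any number of copies can run in parallel, while the history that matters lives in one place.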

Customers

An important goal in 2019 is for as many teams as possible to gain the experience of working with a "customer" from CALS.  As mentioned, one group of CALS users are actual farmers.  A second are students taking classes in CALS where they are running into interesting "big data" problems and need some help using the cloud (they would be the specialists on the problem domain, and you would be the cloud specialists).  And a third are CALS researchers: PhDs and professors who would like some help setting up experiments for their research projects.  For those who don't work directly with a CALS person, you would still be able to work on a problem from CALS that we've documented carefully, with presupplied data sets you can work from.  So even groups with no actual CALS person to work with would still be able to do a relevant project.

Project Examples

(Coming soon!  Some will be dairy examples, and others will focus more on greenhouses and fields and use of drones to monitor crops.  Beyond this, as mentioned, if you have the right teammates you could create new sensors and use them to create a visualization of subsurface hydration and nutrient flow, for example, or new ways to fly drones to diagnose sick trees and dispatch teams to treat them (or to cut them down).  Anyone keen to invent the world's first "subsurface macroscope" to visualize everything happening under the surface of a winery on the side of a mountain in Napa?  This is your moment to shine!).

Help extend Derecho!  This project would be supervised by the Derecho developers.  For example, one possible project would take the Derecho object store and extend it into a permissioned blockchain for use in IoT settings.  To get permission to do this, you would need to demonstrate early in the class that you have no difficulty using the object store and understand the way it works.  You'll need good systems development skills, prior knowledge of C++, and background of the kind that might come from an undergraduate O/S course, and perhaps a basic networking course.

Help extend Azure IoT!  We are open to the possibility that someone might see a gap in the Azure IoT infrastructure and do a project to show how to fix that gap.  But you'll need to convince us that you know what you are talking about and doing, which isn't easy...

Project Demos

All projects will have two demo events.  First, we would like to see projects at the Cornell BOOM! project fair.  You will have at a minimum a poster and members of the team to explain it, and you will need to apply for a space to stand and talk about the work.  Senior Microsoft people will be there asking questions!  If you can do a demo or at least include images from the actual IoT data, even better.

Then we will have a grading and demo event at the end of the semester, after the last class.  You will sign up for a slot, your whole team will be there, and we will come see the work and ask questions and watch you run a demo.

Hackathon

Cornell is organizing a spring hackathon with prizes.  This year's topic will be digital farming.  If your CS5412 project team participates and completes a solution to the hackathon challenge problem, you can receive 5 points of extra credit.  Professor Hakim Weatherspoon is organizing the hackathon.