Project Information: Spring 2013

A few of the 40 animals in the CS 2110 bestiary. Credit for the pictures goes to our talented artistic consultant Ursula Hilsdorf.
Swamp Slime Shy Frecklepuss Fuzzy Trible Gray Floop Ballard's Protoduck

This page gives an overview of the entire 5-part assignment for the semester. Below, you'll find links that will take you to the detailed versions of each of the five assignments. You'll use CMS to hand in your solutions.

For section mini-assignments, see CMS.

Deciphering an Evolutionary Tree...

During the spring of 2013, CS2110 students will be working on a semester-long project that has a series of intermediate deliverables. The first and last assignments must be done individually, but we allow teams of 2 (no more) for assignments A2, A3 and A4. Even if you are part of a larger study group of friends who are all taking CS2110 together, teams can't collaborate with other teams! You are welcome to have non-specific dialogue about the project with friends or a study group, but every line of code you type needs to be written by you personally, or by you and your teammate on A2/A3/A4, and you can't show other people your solutions before the deadline for handing them in.

The goal of the project is to use genome data to figure out the relationship between a set of animals. We'll provide you with biological data files, and as part of that data we'll give you the animal's DNA. Using code we develop during the semester, we'll tackle problems such as extracting genes from the DNA, comparing to see how similar two genes are, and finally building a graph (a tree) representing the evolutionary history of the creatures and displaying the data in web pages. The problem is a simplification of one of the most important problems seen in modern biology.
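To make "comparing to see how similar two genes are" concrete, here is a minimal sketch that assumes genes are plain strings over the letters A, C, G and T and that similarity is measured by edit distance (the fewest single-letter insertions, deletions and substitutions needed to turn one string into the other). Both assumptions are ours for illustration only: the class name GeneDistanceSketch and the method editDistance are hypothetical, and the actual assignments may define gene comparison quite differently.

```java
// Illustrative sketch only: not the comparison method used in the assignments.
final class GeneDistanceSketch {

    // Classic dynamic-programming edit distance, keeping only two rows of the table.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j letters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i letters of a into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int substitution = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int viaInsert = curr[j - 1] + 1;
                int viaDelete = prev[j] + 1;
                int viaSubstitute = prev[j - 1] + substitution;
                curr[j] = Math.min(Math.min(viaInsert, viaDelete), viaSubstitute);
            }
            // The row just filled becomes the previous row for the next letter of a.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // The two strings differ in exactly one position, so this prints 1.
        System.out.println(editDistance("ACGTAC", "ACTTAC"));
    }
}
```

A smaller distance means the two genes are more alike, which is the kind of pairwise measurement a tree-building step could start from.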

Genomic computations in "reality" can be fairly messy, so for CS2110 we'll be using a somewhat simplified genetic model. The simpler version isn't completely realistic, but it does parallel the actual computational biology problem, while limiting itself to technical ideas that are at the right level for students taking a second semester Java course. We're hoping that you'll discover that without learning a huge amount of computer science, you can already do some really amazing things. We also want to see you get experience writing high-quality code – this is a skill you can only learn by doing.

Our semester-long project will be broken into five separate sub-projects, each linked to material we'll be learning in lecture. Each subproject will involve implementing some set of Java classes that supports a specific interface that we'll define carefully.  You'll need to implement that interface (without changing it in any way) because we have a semi-automated testing procedure that can only work if you use the identical interface that we provided.  You'll have a way to check that you've complied with this aspect of the assignment before turning your solution in.
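As a rough illustration of why the interface has to stay untouched, the sketch below shows a check that knows only an interface and nothing about the class behind it. Everything in it is hypothetical: GeneCounter, SimpleGeneCounter, the "ATG" counting rule and the passes method are stand-ins we made up, not the interfaces or tests used in the course.

```java
// Hypothetical stand-in for an interface supplied with an assignment.
interface GeneCounter {
    int countGenes(String dna);
}

// A student-style implementation: same method name, parameters and return type as declared.
final class SimpleGeneCounter implements GeneCounter {
    @Override
    public int countGenes(String dna) {
        // Made-up rule for illustration: count non-overlapping "ATG" markers.
        int count = 0;
        int from = dna.indexOf("ATG");
        while (from >= 0) {
            count++;
            from = dna.indexOf("ATG", from + 3);
        }
        return count;
    }
}

final class InterfaceCheckSketch {
    // Stand-in for an automated test: it is written against the interface only,
    // so it works with any class that implements GeneCounter unchanged.
    static boolean passes(GeneCounter counter) {
        return counter.countGenes("ATGCCATGA") == 2
                && counter.countGenes("") == 0;
    }

    public static void main(String[] args) {
        System.out.println(passes(new SimpleGeneCounter()));
    }
}
```

If the method were renamed or given an extra parameter, the class would no longer implement the interface and a check like passes could not even be compiled against it, which is why the declared signatures must be kept exactly as given.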

After each of the first three subprojects is handed in, we'll release a standard solution for that same problem. You won't have to use it, but this lets you move on and tackle the next subproject even if your version for an earlier stage had serious bugs. Just the same, we really would encourage you to work with your own code.

To get full credit for each subproject, your assignment must not only work, but should adhere to our CS2110 code style guidelines.

... and in the process, learning independent problem solving and programming skills

People who work with computers find that computational thinking and problem solving is a skill that most people need to learn (some are naturals, but relatively few), and that improves with practice.  This means that no matter how much effort we go through to teach things in class, and in recitation, you won't really have learned the material until you reach a point of being able to sit down and use computers to solve hard problems on your own.

This key perspective challenges all of us, because the very first time you try to solve a problem it can seem maddeningly hard and obscure.  Getting past that mental block and realizing that yes, I can do this is a matter of patience, thinking hard, and often of trying things out that just don't work, sometimes for deep reasons and sometimes for silly ones. Programs are notoriously unforgiving of even the most minor mistakes.  And that's a process you simply need to go through as you become a skilled practitioner of the dark art of computer programming!

Now all of this comes together in our CS2110 project, which is where we want you to learn this talent and to develop it. But the complicating factor is that there is so much existing code "out there", and reuse of code is more than just common: in some situations it is the only way one can work!

To what extent is it acceptable to google for something, then copy a fragment of code you find on the web? In CS2110, we are OK with you copying examples from Java.Oracle.com. If you find an example on that site that is useful for doing something you need to do, you are welcome to copy that code and use it. Please indicate that you did so in your comments so that we'll understand why the style changed, because these cut-and-paste code fragments often look like they were written by someone else. But if it came from the company that built Java, you are permitted to copy it. And this is valuable because for many kinds of things (code that accesses files, or does graphics) you have no real choice. There is often only one way to do those kinds of operations and you need examples to see the approach that works. Once you've seen them, there is no real reason to create your own version of the same thing.

But this is limited to code from Java.Oracle.com.  You are NOT permitted to copy code from other places.

In particular, the Spring 2013 version of the project is based on the project we used in Fall 2010. Naturally, we've made a lot of changes that should prevent people from making use of old Fall 2010 solutions. Yet for some people it may be tempting to copy Fall 2010 solutions and to try to modify them into Spring 2013 solutions. Please don't do this.

In fact, we believe that we've changed things enough that it would be easier to develop new solutions than to reuse the Fall 2010 versions. Moreover, because of our automated "cheating detection" (see below), if you were to laboriously transform a Fall 2010 solution into a Spring 2013 one, you would run a big risk that we would flag your program as a possible academic integrity violation. All of this explanation is to make it clear that you really do need to develop fresh new solutions, and that even though the assignment definitely is based on one we've used before, that doesn't mean that you should just try to find old solutions, touch them up a bit, and then pass them off as new ones. That wouldn't work and would run a real chance of causing you to fail the class, or worse!

Two more small remarks: The assignments are designed to be solvable by a person working alone. We allow teams on A2-A4 but aren't really encouraging or assuming that you'll work in teams. In fact it is better for you to work on your own, just as a runner who trains by jumping in the team car for half of each workout probably won't end up being as strong as a runner who runs the full distance on her own!

The other thing to notice is that it is impossible to complete CS2110 without doing the first assignment (A1), the final assignment (A5) and the exams on your own. So it is going to be very important to develop independent problem solving and programming skills outside of a team context. Some people have a tendency to rely a bit too much on partners in a team and can find that this backfires when they face problems on their own in an exam setting, or on A5. This is why we recommend that you team up with someone only if you enjoy working with them and both of you feel that you have similar skill sets and abilities. We strongly recommend that you not enter into a team arrangement with someone much stronger or weaker than you because it will backfire later.

The Five Assignments

(coming later in spring 2013; we're revising an older version of this project and haven't finished our changes yet)

Automated Cheating Detection

We've noticed that some Cornell classes do a so-so job of enforcing the academic integrity policy. In CS2110, we have a solution: we're using an automated system that uses sophisticated artificial intelligence techniques combined with some pretty fancy program analysis tools to notice unusual similarities between programs turned in by different people. It is important to realize that these tools really work and that they are quite hard to fool. So while it might seem tempting to borrow a solution from a buddy, change the variable names and comments, or reorder the statements, our tools would be very likely to figure out what you did. We take cheating seriously, and cheating with an attempt to cover it up is grounds for failing the course outright. Realistically, it is much easier to just do the assigned problems than to get away with handing in code someone else wrote, because short of rewriting that code completely from scratch, we’ll catch it.

So you’ve been warned: It is just not possible to get away with cheating in CS2110. Please do your own work, and come talk to Professor Birman, or the TAs, as often as needed if you get stuck and need help. We’ll get you back on track. In contrast, borrowing a solution from a pal will just get both of you into very serious trouble.