About Recursion in PhylogenyTree

There has been a lot of confusion about PhylogenyTree, so this page will attempt to answer a few questions. Note that this page is fairly long. If you do not have time to read it all in one sitting, you can skip to the summary at the bottom of the page, but you should still come back to read the rest of the page if you still have questions after reading the summary.

Recursion in PhylogenyTree structure

First of all, yes, the tree represented by PhylogenyTree is a recursively defined structure. A single PhylogenyTree object is a single node in the tree (and each node will contain one animal), but at the same time, its getParent method returns a PhylogenyTree, and its getChildren method returns a list of PhylogenyTree. Think about binary trees, where we say a single node is a tree, and it can have two children, which are themselves trees too (so the children can have children, which can also have children... basically, it's trees all the way down!). The same idea applies for PhylogenyTree.

So with the following code:

Set<Species> animals = ???; // All 40 animals from the .dat files Species biscuit = ???; // The species named Biscuit PhylogenyTree tree = new PhylogenyTree(animals, biscuit);

When the constructor is finished doing its job, tree holds a PhylogenyTree whose animal is Biscuit, and somewhere under it (as its descendants) are 39 other animals. So remember that this constructor is responsible for the entire tree-building process.

Recursion in PhylogenyTree constructor?

Now, since we have all this recursion going on in the structure of the PhylogenyTree, the logical question to ask next is whether you should use recursion in your PhylogenyTree constructor.

The short answer is: No, you should really stick to the pseudocode in the assignment specification, which just does things with a while loop.

The long answer is: Okay, you can do it, but there are a few considerations that you need to be aware of, which are explained below.

The first pitfall to watch out for: Let's say we want to add the Nocturnal Plexum as a child of Biscuit. We use the following code in our constructor:

Species newChild = ???; // Some Species, in this case Nocturnal Plexum) PhylogenyTree child = new PhylogenyTree(animals, newChild);

Bad things will happen. Remember that this constructor is responsible for the tree-building process. So you just asked to create a new tree with Nocturnal Plexum as the root, and 39 other animals under it. That constructor will also ask to create more trees, and each of those trees will have all 40 animals... you're getting yourself stuck in an infinite loop!

So this leads to the second consideration to think about. Let's say you solve the first problem by using this code:

Species newChild = ???; // Some Species, in this case Nocturnal Plexum) // copy old set, because we must not modify it Set<Species> animals2 = new HashSet<Species>(animals); animals2.remove(newChild); PhylogenyTree child = new PhylogenyTree(animals2, newChild);

So you are now recursively calling the PhylogenyTree constructor for the Nocturnal Plexum, which is the code that will get executed next. The PhylogenyTree constructor for Biscuit is now waiting for the Nocturnal Plexum constructor to complete. Let's say that in the Nocturnal Plexum's constructor, you choose to add a new Species, Armored Snapper. But the catch is that Armored Snapper should be added as a child of Biscuit, not of Nocturnal Plexum. How do you "go back" to Biscuit and notify it that it should add Armored Snapper as its child? Conversely, if Armored Snapper should be added as a child of Nocturnal Plexum, how do you let Biscuit know that it should not attempt to add Armored Snapper, since it's already been added? Finally, Nocturnal Plexum's constructor must be able to know what the distance between Armored Snapper and Biscuit was for it to be able to decide who "gets" to have Armored Snapper as a child. And when you have 10 animals in the tree, again, you need to decide which node gets to add a child: it might not be the constructor that is currently executing!

In summary

Remember that the PhylogenyTree(Set<Species> animals, Species root) constructor is responsible for the entire tree-building process. In the process of building the tree, you will need to create individual nodes for each animal, and you will probably want to define and use a different constructor for that purpose.

If you want to use recursion in PhylogenyTree's constructor, you must be able to answer the following questions:

  1. How do you ensure the recursive calls do not try to re-add animals that are already inside the tree? (That would cause an infinite loop!)
  2. How do the recursive calls "coordinate" and decide which node in the tree gets to add a new child?
  3. If a recursive call needs to signal to a call earlier in the recursive call chain that it should add a new child, how exactly does it pass this signal along?

It is possible to answer these questions and use recursion in the PhylogenyTree constructor (and you are welcome to do so). However, we do believe it is easiest to use a loop like in the specification's pseudocode.