The Disk-Covering Method for Phylogenetic Tree Reconstruction

Tandy Warnow

Department of Computer Science

The University of Texas at Austin

**Abstract
**Phylogenetic trees, also known as evolutionary trees, model the evolution of
biological species or genes

from a common ancestor. Most computational problems associated with phylogenetic tree reconstruction are

very hard (specifically, they are NP-hard, and are practically hard, as real datasets can take years

of analysis, without provably optimal solutions being found). Finding ways of speeding up the

solutions to these problems is of major importance to systematic biologists. Other approaches take only

polynomial time and have provable performance guarantees under Markov models of evolution; however, our recent work shows

that the sequence lengths that suffice for these methods to be accurate with high probability grows exponentially

in the diameter of the underlying tree.

In this talk, we will describe new dataset decomposition techniques, called the Disk-Covering Methods, for phylogenetic tree reconstruction. This basic algorithmic technique uses interesting graph theory, and can be used to reduce the sequence length requirement of polynomial time methods, so that polynomial length sequences suffice for accuracy with high probability (instead of exponential). We also use this technique to speed up the solution of NP-hard optimization problems, such as maximum likelihood and maximum parsimony.