"Phylogenies and Complexity: A New Approach"
Andrés Varón
Division of Invertebrate Zoology, American Museum of Natural History
Computer Science Department, The City University of New York
Department of Biology Seminar
April 30, 2007
Marsh Life Science 105
In modern systematics, the two most important character-based optimality criteria are Maximum Parsimony (MP) and Maximum Likelihood (ML). As larger and more complex molecular sequences are produced for phylogenetic analyses, the phylogenetically informative events to be considered now include not only substitutions, insertions, and deletions, but also events such as translocations, inversions, and horizontal gene transfers, among others. A general phylogenetic analysis framework capable of taking these kinds of events into consideration is known in the phylogenetic literature as Dynamic Homologies (DH). In this framework, the raw molecular sequences are analyzed in the context of each hypothetical tree, for which the ML or MP score is calculated based on a number of "legal" edit operations and their associated scores (e.g., the Tree Alignment Problem).
Despite the freedom that DH allows, in practice it is very difficult to establish a reasonable set of parameters under either optimality criterion, because of the lack of formalism in the cost assignments under MP and the improbability of finding a "real" set of parameters under ML. I will propose the use of the Minimum Description Length principle, which has its roots in Kolmogorov complexity theory, as a more general optimality criterion to address these problems. I will first present a basic approach to the problem, and then show results from analyses of real-world datasets using the techniques outlined here, as prototyped in POY version 4 beta.