PAG-XIII  Plant & Animal Genomes XIII Conference

January 15-19, 2005
Town & Country Convention Center
San Diego, CA



P846 : Software


"A Comparison Of Multiple Sequence Alignment Programs And The Effect Of Alignment On Phylogenetic Quality"

Kevin G. Beckmann

  PSU, 348 Blue Course Drive, Bldg 13, Apt 110C, State College, PA 16803

This analysis compares the quality of multiple sequence alignments generated by ClustalW, DB Clustal, Dialign, Muscle, and Poa, and examines how the choice of alignment affects phylogenetic tree reconstruction. We used reference alignments from Balibase that varied in sequence length, number of taxa, and percent identity. Alignment quality was evaluated using Balibase's built-in scoring method, Baliscore, which calculated Sum-of-pairs (SP) and Total-column (TC) scores for each experimental alignment against the reference alignment. Across Balibase datasets one through five, Muscle scored consistently higher than the other multiple alignment programs. For phylogenetic analysis, maximum likelihood trees were produced for each alignment using Paup, and the trees were scored both by their distance to a reference tree (produced from the reference alignments) and by their average node support from bootstrap analysis. Both measures show a positive correlation between alignment quality and phylogenetic quality. This study demonstrates that the Muscle algorithm provides the best multiple alignment for most of the data sets available in the Balibase benchmark. It also shows for the first time that higher alignment quality scores correlate directly with an improved phylogeny.
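
To illustrate the SP and TC scores mentioned above, here is a minimal Python sketch of simplified versions of both measures. It is not Baliscore itself. It assumes that both alignments list the same sequences in the same order, that "-" is the gap character, and that every reference column with at least two residues is scored (any core-block annotation is ignored). The function names and the toy alignments are hypothetical.

```python
from itertools import combinations

GAP = "-"


def residue_columns(row):
    """Return, for each residue of a gapped row, the column it occupies."""
    return [col for col, ch in enumerate(row) if ch != GAP]


def sp_tc_scores(test, reference):
    """Simplified SP and TC scores of `test` measured against `reference`.

    SP: fraction of residue pairs aligned in the reference that are also
        aligned in the test alignment.
    TC: fraction of scored reference columns whose residues all fall in a
        single column of the test alignment.
    """
    if [r.replace(GAP, "") for r in test] != [r.replace(GAP, "") for r in reference]:
        raise ValueError("alignments must contain the same sequences in the same order")

    # test_col[s][r] = column of residue r of sequence s in the test alignment
    test_col = [residue_columns(row) for row in test]
    # ref_col[s][r] = the same mapping for the reference alignment
    ref_col = [residue_columns(row) for row in reference]

    # Group reference residues by column as (sequence index, residue index).
    columns = {}
    for s, cols in enumerate(ref_col):
        for r, col in enumerate(cols):
            columns.setdefault(col, []).append((s, r))

    pairs_total = pairs_matched = 0
    cols_total = cols_matched = 0
    for residues in columns.values():
        if len(residues) < 2:
            continue  # a lone residue defines no aligned pair
        cols_total += 1
        if len({test_col[s][r] for s, r in residues}) == 1:
            cols_matched += 1
        for (s1, r1), (s2, r2) in combinations(residues, 2):
            pairs_total += 1
            if test_col[s1][r1] == test_col[s2][r2]:
                pairs_matched += 1

    sp = pairs_matched / pairs_total if pairs_total else 0.0
    tc = cols_matched / cols_total if cols_total else 0.0
    return sp, tc


# Toy data (hypothetical): the test alignment misplaces one residue of sequence 1.
reference = ["AC-GT", "ACAGT", "A--GT"]
test = ["ACG-T", "ACAGT", "A--GT"]
sp, tc = sp_tc_scores(test, reference)
print(f"SP = {sp:.2f}, TC = {tc:.2f}")  # SP = 0.80, TC = 0.75
```

In the toy case, one shifted residue breaks two of the ten reference pairs and one of the four scored columns. TC is therefore the stricter measure, because a single misplaced residue invalidates its whole column.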