January 10-14, 2004
Town & Country Convention Center
San Diego, CA
Workshop: Bioinformatics
The great success and productivity of the sequencing projects present a major challenge to investigators working on the functional classification of the proteins encoded by these genomes. In this talk, we will present new methods under development by ourselves and others on "phylogenomic" approaches to protein functional classification.
W38: PHYLOGENOMIC ANALYSIS FOR FUNCTIONAL CLASSIFICATION OF PROTEINS: IS IT WORTH THE TROUBLE?
Kimmen Sjolander
Phylogenomic analysis includes homolog detection and clustering, multiple sequence alignment, phylogenetic tree construction, differentiation of orthologs and paralogs, and overlaying phylogenetic tree topologies with experimental data. While obviously far more demanding than a simple BLAST search for sequence homologs, this approach has a demonstrably lower expected error rate in functional annotation, and can also be helpful for detecting and correcting existing database errors and errors in gene structure.
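As an illustration of one step listed above, the differentiation of orthologs and paralogs, here is a minimal sketch using the common species-overlap heuristic: an internal node of a gene tree is treated as a duplication if its two subtrees share a species, and as a speciation otherwise. This is not necessarily the method used by the author; the toy tree, the gene names, and the `species_gene` naming convention are all assumptions made for the example, and the tree is assumed to be rooted and binary.

```python
from itertools import product

# Hypothetical rooted, binary gene tree as nested tuples.
# Leaves are named "<species>_<gene>" (an assumed convention).
TREE = ((("human_A1", "mouse_A1"), ("human_A2", "mouse_A2")), "fly_A")


def species(leaf):
    """Extract the species prefix from a leaf name."""
    return leaf.split("_")[0]


def leaves(node):
    """Return all leaf names under a node."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def classify_pairs(node, out=None):
    """Label each pair of leaves by the event at their last common ancestor.

    Species-overlap rule: if the two child subtrees share any species,
    the node is called a duplication (pairs are paralogs); otherwise it
    is called a speciation (pairs are orthologs).
    """
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    left, right = node
    l_leaves, r_leaves = leaves(left), leaves(right)
    overlap = {species(x) for x in l_leaves} & {species(x) for x in r_leaves}
    label = "paralog" if overlap else "ortholog"
    for a, b in product(l_leaves, r_leaves):
        out[frozenset((a, b))] = label
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


PAIRS = classify_pairs(TREE)

if __name__ == "__main__":
    for pair, label in sorted(PAIRS.items(), key=lambda kv: sorted(kv[0])):
        print(" / ".join(sorted(pair)), "->", label)
```

In this toy tree, human_A1 and mouse_A1 are labeled orthologs, while human_A1 and mouse_A2 are labeled paralogs because their last common ancestor is a duplication. A BLAST-only annotation could easily transfer function between such a pair, which is the kind of error the tree-based approach aims to avoid.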
This process was employed for the functional annotation of the human genome at Celera Genomics, through the development of a library of protein families, hidden Markov models, and phylogenetic trees for all animal proteins, as described in the journal Science. A similar system is now under development by the UC Berkeley Phylogenomics Group (http://phylogenomics.berkeley.edu) to support the scientific community's efforts in both animal and plant genomics. To take advantage of information from protein structure (including the new structures expected from the Structural Genomics Initiative), we also integrate protein structure prediction and structure-function analysis, and enable automatic homology model construction and prediction of key functional residues. We will present the computational methods used in our database construction, their performance on benchmark datasets, and how they compare, both practically and theoretically, to other available tools. Finally, because the task of functional annotation and curation of genomes is far too big for any one group, this server is designed to facilitate community-wide collaboration through an integrated graphical user interface, allowing investigators who are physically separated to work jointly in a virtual collaborative environment.