January 14-18, 2006
Town & Country Convention Center
San Diego, CA
Nandini Krishnamurthy, Daniel Kirshner, Duncan Brown, Carolina Dallett, Shalini Lobo, Kimmen Sjolander
Roughly 30% of the genes in eukaryotic genomes have no known (or predicted) function. Of those with functional annotations, roughly 3% have experimental evidence supporting their annotation; the remainder have been assigned a molecular function by homology, with questionable accuracy. Homology-based function prediction is known to be prone to systematic errors due to gene duplication, domain shuffling, and existing database annotation errors.
Phylogenomic inference of protein function has been proposed to address the problems associated with transfer of annotation by homology. This process requires clustering globally alignable homologs, constructing a multiple sequence alignment, masking uncertain regions of the alignment prior to phylogenetic analysis, constructing a phylogenetic tree, and identifying conserved clades (or functional subfamilies). Experimental data is then overlaid on the tree topology, enabling the biologist to infer function for a particular sequence in an evolutionary framework.
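As an illustration of the masking step only, the following minimal Python sketch drops alignment columns that are mostly gaps before tree construction. The abstract does not state how uncertain regions are identified, so the gap-fraction criterion, the function name `mask_columns`, the threshold `max_gap_fraction`, and the toy sequences are all assumptions made for this example, not the method used by the resources described here.

```python
def mask_columns(alignment, max_gap_fraction=0.5):
    """Return a copy of the alignment without gap-rich columns.

    alignment: dict mapping sequence name -> aligned sequence string.
    A column is dropped when its fraction of gap characters ('-' or '.')
    is greater than max_gap_fraction (a hypothetical criterion).
    """
    if not alignment:
        return {}
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    seqs = list(alignment.values())
    n_seqs = len(seqs)
    # Indices of the columns that pass the gap-fraction filter.
    keep = [
        i for i in range(length)
        if sum(s[i] in "-." for s in seqs) / n_seqs <= max_gap_fraction
    ]
    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in alignment.items()
    }


# Toy alignment (invented sequences, for illustration only).
toy = {
    "seqA": "MK-LV--A",
    "seqB": "MKTLV--A",
    "seqC": "MK-LI-GA",
}
masked = mask_columns(toy, max_gap_fraction=0.5)
for name, seq in masked.items():
    print(name, seq)
```

With a threshold of 0.5, the three columns in which at least two of the three toy sequences carry a gap are removed, and the five remaining columns would be passed on to phylogenetic analysis.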
We present two web resources enabling genome-scale phylogenomic analysis of protein families in plant and animal genomes. The Animal Proteome Explorer includes specialized libraries for ion channels and G protein-coupled receptors, while the Plant Proteome Explorer focuses on protein families involved in disease resistance. Each will eventually include all proteins encoded in animal and plant genomes. These two web servers provide pre-calculated phylogenetic relationships for thousands of proteins, with predictions of functional subfamilies, 3D structure, cellular localization, and biological process. Hidden Markov models (HMMs) are available for each family and subfamily in the resource. Biologists can submit sequences for functional or structural classification or simply browse the books in our Phylogenomic Libraries.