PAG-XIII  Plant & Animal Genomes XIII Conference

January 15-19, 2005
Town & Country Convention Center
San Diego, CA



W044 : Bioinformatics


Unsaturated Transcriptome Sampling – A Gateway To Comparative Genomics?

Stephen AG Rudd

  The Centre for Biotechnology, Tykistökatu 6, FIN-20521, Turku, FINLAND

Complete genome sequences are available for several plants and animals. Their genome scaffolds have been enriched, classified and annotated using state-of-the-art genomics and bioinformatics techniques, and several databases now represent the “blueprints” from which organisms are built. Investigation of the gene content within the available plant genomes reveals an exciting ancestry of repeated rounds of genome duplication, runaway retrotransposon replication and sequence loss. Comparing the different model genomes has furthermore revealed that plant genomes are not as homogeneous as might have been expected. Within comparative genomics there is an urgent need to discover and characterise the diverse populations of protein families found within the more “exotic” plant species.

EST sequencing has come to the forefront of plant genomics as a simple technology that can yield at least a glimpse of the underlying gene populations. When we consider the continuum of morphological and ecological characteristics of the tens of plant species represented within the largest EST collections, the potential utility of EST-based comparative genomics becomes clear.

Starting from the general assumption that ESTs may not be suitable for broader comparative analyses, we have exhaustively analysed and annotated the 60 largest plant EST collections. We have assessed the quality, redundancy and applicability of the apparently representative EST collections that are associated with genomic sequences, and have surveyed the resulting patterns of tissue-, development- or stress-related expression.

On the basis of these data we conclude that ESTs are indeed a suitable substrate for preliminary comparative genomics. We have constructed taxonomic maps that highlight the routes taken by evolution to yield contemporary genes and gene families. Over four million ESTs from 60 different plant species have been assembled into openSputnik, a taxonomically oriented sequence database that presents networks of evolution, orthology, paralogy and domain architecture in addition to primary EST sequence. The openSputnik database is freely available to all at http://sputnik.btk.fi.