Plant & Animal Genome V Conference
Town & Country Hotel, San Diego, CA, January 12-16, 1997.
PAG-V: W25 - SPLICE SITES PREDICTION AND GENE LOCALIZATION IN PLANT GENOMIC SEQUENCES
ROUZÉ, PIERRE
Laboratoire Associé de l'Institut National de la Recherche Agronomique (France), VIB, Universiteit Gent, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium
Homology searches find at best half of the genes in plant genomic sequences, and this is unlikely to improve dramatically in the near future. Programs have therefore been developed to predict genes from intrinsic properties of the sequence, most of them for animal genomes. Whatever their performance on those genomes, they have mostly failed on sequences from Arabidopsis thaliana (At), because variations in genome 'style' between organisms were underestimated: the prediction of gene features has to be tailored to each organism. Several packages or programs now offer predictions for plant genomes. Using matrices developed from Arabidopsis, GeneMark Markov statistics predict At exons efficiently, but exon borders are fuzzy and small exons are missed. With S. Brunak's team, we developed NetPlantGene for At splice site prediction (Nucleic Acids Res, 24, 3439-3452). The high performance of this neural network program comes from its strategy (global information tunes the local prediction of donor or acceptor sites) and from the careful optimization of the networks using a cleaned At learning set (Nucleic Acids Res, 24, 316-320). NetPlantGene can also be used to anticipate splicing in other plant genomes: performance remains high for various dicots but decreases with monocots. To go a step further, we are working on the prediction of CDS borders, which would not only indicate more confidently where the first and last exons lie but also improve the quality of GeneMark and NetPlantGene predictions. These programs are clearly complementary and will be combined to optimize gene modelling and to provide safer annotation of Arabidopsis contigs.
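
The strategy described above, in which global information tunes the local prediction of a donor site, can be illustrated with a small sketch. This is not the NetPlantGene implementation: the function name `tuned_donor_sites`, the parameters `base_cutoff`, `relief` and `window`, and the linear threshold rule are all assumptions chosen for illustration. The per-position scores are assumed to come from some local splice site predictor and some global coding-potential predictor.

```python
from typing import List, Sequence


def _mean(values: Sequence[float]) -> float:
    """Arithmetic mean; an empty window counts as 0.0."""
    return sum(values) / len(values) if values else 0.0


def tuned_donor_sites(
    local_scores: Sequence[float],
    coding_probs: Sequence[float],
    base_cutoff: float = 0.9,   # hypothetical cutoff with no global support
    relief: float = 0.5,        # hypothetical maximum lowering of the cutoff
    window: int = 3,            # hypothetical half-window for the global signal
) -> List[int]:
    """Return positions accepted as donor sites.

    Position i is taken as the first intron base. The global signal is the
    drop in mean coding probability from the `window` bases upstream of i to
    the `window` bases starting at i (exon -> intron transition). A stronger
    drop lowers the cutoff that the local score must reach.
    """
    if len(local_scores) != len(coding_probs):
        raise ValueError("score tracks must have the same length")

    accepted = []
    for i, local in enumerate(local_scores):
        upstream = _mean(coding_probs[max(0, i - window):i])
        downstream = _mean(coding_probs[i:i + window])
        transition = max(0.0, upstream - downstream)
        cutoff = base_cutoff - relief * transition
        if local >= cutoff:
            accepted.append(i)
    return accepted


# Two candidates with the same moderate local score (0.6): only the one
# sitting on a coding -> non-coding transition passes the tuned cutoff.
coding = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
local = [0, 0, 0, 0, 0.6, 0, 0, 0, 0.6, 0]
sites = tuned_donor_sites(local, coding)
```

In this toy input the candidate at position 4 is kept because the coding signal drops sharply there, while the candidate at position 8, with an identical local score but no supporting transition, is rejected. An acceptor site version would use the opposite transition (non-coding to coding).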