PAG-X  Plant, Animal & Microbe Genomes X Conference

January 12-16, 2002
Town & Country Convention Center
San Diego, CA


Bioinformatics: Software
             


DEVELOPMENT OF A SEMI-AUTOMATED PIPELINE FOR THE ANNOTATION OF RICE SEQUENCE

Lance E. Palmer1, Harshawardhan Bal1, Neilay N. Dedhia1, W. Richard McCombie1

1 Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724 USA

We have developed a semi-automated pipeline for the annotation of BAC sequences from rice (Oryza sativa). The pipeline automatically runs blastp searches against various protein databases, using as queries the proteins predicted by a number of gene prediction programs. In addition, several BLAST searches are performed against different EST databases using the whole BAC as a query. After the BLAST searches, an automated "pre-annotation" step uses the blastp results to assign an initial annotation to each predicted protein. Once the automated portion is complete, the results of the gene prediction programs and the BLAST searches are loaded into a graphical viewer that allows an annotator to view and alter annotations. In addition to gene models deduced by gene prediction programs, user-defined gene models can be constructed and annotated. The viewer also contains a number of analysis tools that can determine the sensitivity and specificity of gene prediction programs, as well as exon/intron sizes and ratios. After annotation, the results can be written to a Sequin-readable file for submission to GenBank. Although originally written for the annotation of rice, our set of programs was designed to accommodate the annotation of sequence from any species and to allow for various types of BLAST search strategies.
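As an illustration only: the abstract does not say how the viewer computes its statistics, so the following Python sketch assumes the nucleotide-level convention common in gene prediction evaluation, where sensitivity is the fraction of reference coding bases that were predicted and "specificity" is the fraction of predicted coding bases that are correct. The function names, the 1-based inclusive coordinates, and the example exons are all hypothetical and are not taken from the pipeline itself.

```python
from typing import List, Set, Tuple

# Assumed convention: an exon is a (start, end) pair, 1-based and inclusive.
Interval = Tuple[int, int]


def covered_positions(exons: List[Interval]) -> Set[int]:
    """Return the set of nucleotide positions covered by the given exons."""
    positions: Set[int] = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_sn_sp(predicted: List[Interval],
                     reference: List[Interval]) -> Tuple[float, float]:
    """Nucleotide-level sensitivity and specificity of a predicted gene model.

    sensitivity = TP / (TP + FN): share of reference bases that were predicted
    specificity = TP / (TP + FP): share of predicted bases that are correct
    """
    pred = covered_positions(predicted)
    ref = covered_positions(reference)
    true_positives = len(pred & ref)
    sensitivity = true_positives / len(ref) if ref else 0.0
    specificity = true_positives / len(pred) if pred else 0.0
    return sensitivity, specificity


def exon_intron_sizes(exons: List[Interval]) -> Tuple[List[int], List[int]]:
    """Return exon lengths and the lengths of the introns between them."""
    ordered = sorted(exons)
    exon_sizes = [end - start + 1 for start, end in ordered]
    intron_sizes = [nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])]
    return exon_sizes, intron_sizes


# Hypothetical curated model and a hypothetical program prediction.
reference = [(1, 100), (201, 300)]
predicted = [(1, 100), (201, 250), (401, 450)]

sn, sp = nucleotide_sn_sp(predicted, reference)
exon_sizes, intron_sizes = exon_intron_sizes(reference)
```

With these example coordinates, 150 of the 200 reference bases are predicted and 150 of the 200 predicted bases are correct, so both values come out to 0.75; the reference model has two 100-base exons separated by a 100-base intron. An exon-level measure (counting only exons with exactly matching boundaries) would be an equally plausible reading of the abstract.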

