January 12-16, 2002
Town & Country Convention Center
San Diego, CA
Poster: Other
Computer: Poster and Demo Database and software
The EST Analysis Pipeline (ESTAP) is a collaborative project to provide informatics support to a distributed group of researchers working on a variety of target organisms. Raw data from sequencing platforms are submitted to ESTAP for validation and, once validated, are uploaded to a relational database. A series of automated routine analyses is then performed on the data, including operations that measure base quality; remove vector, adaptor, and primer sequences as well as contaminating sequences; and carry out other common procedures such as homopolymeric sequence trimming. Parameters for all protocols are selectable by the data owner and are recorded in the database. A tool is provided for automatic submission of the high-quality sequences to GenBank. BLAST analysis is performed on the cleaned sequences, again using parameters selected by the data owner. A Web interface provides a variety of views of the data in the database. The pipeline is designed to be extensible. Current plans for additional functionality include assembly of EST sets, annotation with gene index and gene ontology information, and linkage of the EST data to expression information.
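As an illustration only, the cleaning stage described above might be sketched as follows in Python. This is not ESTAP's code: the abstract does not describe its implementation, so every name here (`TrimParams`, `quality_trim`, `trim_homopolymer_tail`, `clean_read`) and every default value is a hypothetical stand-in. The sketch covers only end trimming by base quality and removal of a trailing homopolymer run, under owner-selectable parameters. Vector, adaptor, primer, and contaminant screening are omitted.

```python
from dataclasses import dataclass


@dataclass
class TrimParams:
    """Hypothetical owner-selectable parameters; names and defaults are illustrative."""
    min_quality: int = 20      # lowest per-base quality score kept at read ends
    min_homopolymer: int = 10  # shortest trailing run treated as a homopolymer tail
    min_length: int = 100      # shortest cleaned read still kept


def quality_trim(seq, quals, min_quality):
    """Drop bases below min_quality from both ends; return (seq, quals)."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_quality:
        start += 1
    while end > start and quals[end - 1] < min_quality:
        end -= 1
    return seq[start:end], quals[start:end]


def trim_homopolymer_tail(seq, base="A", min_run=10):
    """Remove a trailing run of `base` if it is at least min_run long."""
    stripped = seq.rstrip(base)
    if len(seq) - len(stripped) >= min_run:
        return stripped
    return seq


def clean_read(seq, quals, params):
    """Apply both trimming steps; return the cleaned read, or None if too short."""
    seq, quals = quality_trim(seq.upper(), quals, params.min_quality)
    seq = trim_homopolymer_tail(seq, "A", params.min_homopolymer)
    return seq if len(seq) >= params.min_length else None
```

In a sketch like this, the `TrimParams` values actually used would be stored alongside each cleaned read, mirroring the abstract's statement that protocol parameters are recorded in the database.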