PAG-XIII  Plant & Animal Genomes XIII Conference

January 15-19, 2005
Town & Country Convention Center
San Diego, CA



P844 : Software


Selected Bioinformatics Tools For Sequencing And Data Analysis Developed At ACGT

Axin Hua, Stephen Kenton, Bruce A. Roe

  Department of Chemistry and Biochemistry, Stephenson Research and Technology Center, University of Oklahoma, 101 David L. Boren Blvd, Norman, Oklahoma, 73019

To support our large-scale sequencing and data analysis projects, we have developed a series of software tools, among them Exgap, Autofish, and Maxmatch. Exgap visualizes the status of sequencing projects and automatically (a) orders contigs based on their forward-reverse mate pairs, (b) lists the subclones covering gaps for selecting gap-closing PCR primers, and (c) locates potential misjoins for further examination. Data from whole-genome shotgun (WGS) projects can be helpful in finishing individual BAC projects for the same organisms. Autofish is a search tool that employs a fast hashing-based algorithm to perform the initial search for WGS reads that overlap a BAC-based sequencing project. These reads are then screened by mate pairs and base quality; only reads whose mate pairs match the given project and that show no high-quality base discrepancies are kept and added to the individual BAC project to facilitate final sequence closure and finishing. Maxmatch performs fast sequence comparison for large projects of up to 10 Mb and gives output similar to that of Dotter, but requires less than 10 minutes on a typical Sun workstation to compare two 2-Mb bacterial projects, whereas Dotter requires several hours. Maxmatch uses a suffix tree data structure to perform the search and keeps only near-exact matches, trading a minor loss of sensitivity for high speed. Example applications of these software tools will be presented.
The above software is available at URL: http://www.genome.ou.edu/informatics
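
As an illustration of the hashing-based initial search described for Autofish, the following is a minimal sketch and not the Autofish implementation. It assumes a fixed k-mer length, exact k-mer matching on the forward strand only, and a simple hit-count threshold; the function names (build_kmer_index, find_candidate_reads) and the parameters k and min_hits are hypothetical. The later screening by mate pairs and base quality is not shown.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Hash every k-mer of the reference (e.g. a BAC consensus) to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def find_candidate_reads(reads, index, k, min_hits=2):
    """Return names of reads that share at least min_hits distinct k-mers with the reference.

    reads maps a read name to its sequence. Only the forward strand is checked
    (an assumption made to keep the sketch short).
    """
    candidates = []
    for name, seq in reads.items():
        # Distinct k-mers of the read; reads shorter than k yield an empty set.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        hits = sum(1 for kmer in kmers if kmer in index)
        if hits >= min_hits:
            candidates.append(name)
    return candidates


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    bac = "ACGTACGGTCAGTTCAGGCTA"
    wgs_reads = {"r1": "GGTCAGTTCA", "r2": "TTTTTTTTTT"}
    idx = build_kmer_index(bac, 6)
    print(find_candidate_reads(wgs_reads, idx, 6))
```

Building the index once makes each read lookup proportional to the read length rather than to the project size, which is the general reason a hash-based first pass is fast; candidates found this way would still need alignment and the mate-pair and quality checks described above.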