PAG-XII  Plant & Animal Genomes XII Conference

January 10-14, 2004
Town & Country Convention Center
San Diego, CA


Workshop: Bioinformatics


W40

A STRATEGY FOR ASSEMBLING THE MAIZE (Zea mays L.) GENOME

Scott J. Emrich1, Srinivas Aluru2, Yan Fu3, Tsui-Jung Wen4, Mahesh Narayanan2, Ling Guo1, Daniel A. Ashlock5, Patrick S. Schnable4

1 Bioinformatics & Computational Biology Graduate Program, Iowa State University, Ames, IA, 50011 USA
2 Department of Electrical and Computer Engineering, Iowa State University, Ames, IA, 50011 USA
3 Interdepartmental Genetics Graduate Program, Iowa State University, Ames, IA, 50011 USA
4 Center for Plant Genomics, Iowa State University, Ames, IA, 50011 USA
5 Department of Mathematics, Iowa State University, Ames, IA, 50011 USA

MOTIVATION: Because the bulk of the maize (Zea mays L.) genome consists of repetitive sequences, sequencing efforts are being targeted to its “gene-rich” fraction. Traditional assembly programs are inadequate for this approach because they are optimized for a uniform sampling of the genome and inherently lack the ability to differentiate highly similar paralogs.

RESULTS: We report the development of bioinformatics tools for the accurate assembly of the maize genome. This software, which is based on innovative parallel algorithms to ensure scalability, can assemble 730,974 genome survey sequence (GSS) fragments in four hours using 64 Pentium III 1.26 GHz processors of a commodity cluster. Algorithmic innovations significantly reduce the number of pairwise alignments without sacrificing quality. Novel approaches were developed that use known sequences to model error rates, improving the differentiation of polymorphisms from sequencing errors. The assembly was also used to evaluate the effectiveness of various filtering strategies, and it thereby provides information that can be used to focus subsequent sequencing efforts.
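
The abstract does not say how the number of pairwise alignments is reduced. As a rough illustration of the general idea only, the sketch below uses a common filtering heuristic: index every k-mer, then pass a pair of fragments on to alignment only if the two share enough exact k-mers. This is not the authors' algorithm. The function name `candidate_pairs`, the parameters `k` and `min_shared`, and the toy fragments are all hypothetical, and the sketch is serial, ignores reverse complements, and does not mask high-frequency (repetitive) k-mers.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(fragments, k=8, min_shared=2):
    """Return index pairs of fragments sharing at least `min_shared` distinct k-mers.

    Hypothetical helper for illustration; not the method described in the abstract.
    """
    # Build an index mapping each k-mer to the set of fragments that contain it.
    index = defaultdict(set)
    for i, seq in enumerate(fragments):
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].add(i)

    # Count, for every pair of fragments, how many distinct k-mers they share.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    # Only pairs above the threshold would be handed to a pairwise aligner.
    return sorted(pair for pair, n in shared.items() if n >= min_shared)


# Toy data (made up): fragments 0 and 1 overlap by 14 bases, fragment 2 is unrelated.
fragments = [
    "ACGTACGTTAGCCGATAGGCTA",
    "TAGCCGATAGGCTAAACCGGTT",
    "GGGGGGCCCCCCAAAAAATTTT",
]
pairs = candidate_pairs(fragments, k=8, min_shared=2)
all_pairs = len(fragments) * (len(fragments) - 1) // 2
print(f"{len(pairs)} of {all_pairs} possible pairs kept for alignment: {pairs}")
```

In this toy case only one of the three possible pairs survives the filter. On a repeat-rich genome, k-mers that occur in many fragments would need to be masked or capped, since each such k-mer contributes a number of pairs that grows quadratically with its frequency.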

