January 10-14, 2004
Town & Country Convention Center
San Diego, CA
Workshop: Bioinformatics
MOTIVATION: Because the bulk of the maize (Zea mays L.) genome consists of repetitive sequences, sequencing efforts are being targeted to its “gene-rich” fraction. Traditional assembly programs are inadequate for this approach because they are optimized for a uniform sampling of the genome and inherently lack the ability to differentiate highly similar paralogs.

RESULTS: We report the development of bioinformatics tools for the accurate assembly of the maize genome. This software, which is based on innovative parallel algorithms to ensure scalability, can assemble 730,974 GSS fragments in four hours using 64 Pentium III 1.26 GHz processors of a commodity cluster. Algorithmic innovations are used to significantly reduce the number of pairwise alignments without sacrificing quality. Novel approaches were developed that use known sequences to model error rates for improved differentiation of polymorphisms versus sequencing errors. The assembly was also used to evaluate the effectiveness of various filtering strategies and thereby provides information that can be used to focus subsequent sequencing efforts.
W40: A STRATEGY FOR ASSEMBLING THE MAIZE (Zea mays L.) GENOME
Scott J. Emrich (1), Srinivas Aluru (2), Yan Fu (3), Tsui-Jung Wen (4), Mahesh Narayanan (2), Ling Guo (1), Daniel A. Ashlock (5), Patrick S. Schnable (4)
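
The abstract mentions using known sequences to model error rates so that polymorphisms can be told apart from sequencing errors, but it does not describe the model. The following is only a minimal sketch of one way such a decision could be made, assuming a single per-base error rate estimated from reads aligned to known sequence and a simple binomial test on each alignment column. The function names, the significance threshold `alpha`, and the binomial model itself are illustrative assumptions, not the authors' method.

```python
from math import comb


def estimate_error_rate(mismatches: int, aligned_bases: int) -> float:
    """Per-base error rate from reads aligned to already-known sequence.

    Hypothetical helper: assumes every mismatch against the known
    sequence is a sequencing error.
    """
    if aligned_bases <= 0:
        raise ValueError("aligned_bases must be positive")
    return mismatches / aligned_bases


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def classify_column(variant_reads: int, depth: int,
                    error_rate: float, alpha: float = 1e-3) -> str:
    """Label one alignment column (assumed model, not from the abstract).

    If seeing `variant_reads` or more disagreeing bases among `depth`
    reads is very unlikely under the error rate alone, call the column
    a polymorphism; otherwise attribute it to sequencing error.
    """
    p_value = binomial_tail(variant_reads, depth, error_rate)
    return "polymorphism" if p_value < alpha else "sequencing error"


if __name__ == "__main__":
    # Illustrative numbers only, not data from the maize assembly.
    rate = estimate_error_rate(mismatches=50, aligned_bases=10_000)
    print(classify_column(variant_reads=1, depth=10, error_rate=rate))
    print(classify_column(variant_reads=4, depth=10, error_rate=rate))
```

A single disagreeing read at moderate depth is easily explained by the estimated error rate, whereas several reads sharing the same difference are not. A real assembler would likely need position- or quality-dependent error rates rather than one global value.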