January 10-14, 2009
Town & Country Convention Center
San Diego, CA
Ramesh Buyyarapu¹, Ramesh Kantety¹, Zhanyou Xu², John Yu², Russel Kohel², Richard Percy², Simone Macmil³, Graham Wiley³, Bruce Roe³, Govind Sharma¹
New and emerging next-generation sequencing technologies promise to reduce sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. The large, highly repetitive genome of G. hirsutum (~2.5 Gb) is poorly suited to the traditional BAC-by-BAC sequencing approach, which is also cost-intensive. Our objective was therefore to sequence large genomic segments of the cotton genome using a novel BAC-pool approach with 454 pyrosequencing technology, and to test our ability to i) assemble the sequences, ii) determine the approximate number of genome equivalents required to thoroughly cover the genome, iii) identify any coding regions in the assembled contigs, and iv) characterize the repetitive regions. Two BAC contigs from homeologous cotton chromosomes, #465 (chromosome 26) and #3301 (chromosome 12), were selected for sequencing. Three to four BAC clones were pooled and sequenced using the GS20 FLX instrument. Contig 465 was sequenced up to 20.45X coverage from 12 BAC clones, with 4 clones per pool. Similarly, the ~767 kb contig 3301 was sequenced up to 23.11X coverage from 7 BAC clones, in two pools of 4 and 3 clones. Approximately 50 Mb of raw sequence data was assembled into 1.01 Mb, in several contigs across the five pools, using the Newbler and Phrap assembly programs. The sequence data were further analyzed to verify the presence of BAC-end regions, molecular markers, ESTs, transposons, retrotransposons, coding regions, and matches with other related genomes. Pooled BAC sequencing with 454 technology enables rapid and economical de novo sequencing and assembly of large genomic segments.
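The coverage figures quoted above follow the usual fold-coverage relation: total sequenced bases divided by the length of the target region. The sketch below is a minimal illustration of that arithmetic, not the authors' analysis pipeline. The function and constant names are hypothetical, and the contig 3301 length is the approximate ~767 kb value from the abstract, so the result is only a rough estimate.

```python
def fold_coverage(total_read_bases: float, target_length_bp: float) -> float:
    """Return fold coverage: sequenced bases divided by target length."""
    if target_length_bp <= 0:
        raise ValueError("target_length_bp must be positive")
    return total_read_bases / target_length_bp


def read_bases_for_coverage(coverage: float, target_length_bp: float) -> float:
    """Return the number of read bases implied by a given fold coverage."""
    return coverage * target_length_bp


# Values taken from the abstract; the length is approximate (~767 kb).
CONTIG_3301_LENGTH_BP = 767_000
CONTIG_3301_COVERAGE = 23.11

# Read bases implied for contig 3301 at the reported coverage.
implied_bases = read_bases_for_coverage(CONTIG_3301_COVERAGE, CONTIG_3301_LENGTH_BP)
print(f"Implied read bases for contig 3301: {implied_bases / 1e6:.1f} Mb")
```

Under these assumptions, roughly 17.7 Mb of the reported ~50 Mb of raw sequence would correspond to contig 3301. The same functions can be reused to estimate how much raw sequence a target coverage would require for a region of known length.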