January 15-19, 2011
Town & Country Convention Center
San Diego, CA
Srikar Chamala1,2, Brandon Walts1,2,9, Victor Albert3, Claude dePamphilis4, Joshua Der4, James Estill5, Jim Leebens-Mack5, Seunghee Lee6, Hong Ma4,11, Steve Rounsley10, Stephan Schuster8, Doug Soltis1, Pam Soltis1,7, Lynn Tomsho8, Sue Wessler5, Rod Wing6,10, Yeisoo Yu6, W. Brad Barbazuk1,2
Amborella trichopoda, the sister to all other extant angiosperms, occupies a crucial evolutionary position among seed plants, and its genome sequence is an important reference for comparative genomic studies across the angiosperms. It will help in understanding the evolution of key angiosperm traits and provide a baseline for examining genome organization throughout the angiosperms. We are using a whole-genome shotgun strategy to sequence the ~870-980 Mbp Amborella genome on Roche's Genome Sequencer FLX system. To date, 25.5 GS FLX sequencing runs have been analyzed, providing ~14x coverage. Sequence data are being evaluated for quality, coverage, and contamination. Sequence quality is good, with a median Phred score >36 for the first 50 bases of all reads. Sequence coverage is being evaluated by aligning the reads against available Amborella BAC contigs, BAC end sequences, and unigene sequences: 98.5% of BAC contig bases are covered at least 1x, with a mean coverage of 18.2x. Contamination from bacteria, insects, fungi, humans, and Amborella mitochondrial and chloroplast DNA is being checked using BLAST and Mosaik. Consistently, ~11% and ~1.9% of the sequences hit the Amborella mitochondrial and chloroplast genomes, respectively, and less than 1% match non-Amborella sequence. We are starting an initial de novo sequence assembly using Newbler and will refine the assembly using physical map data from a library of 36,864 end-sequenced BAC clones. As assembled contigs become available, we will annotate them using DAWGPAWS and TWINSCAN. We are developing a GBrowse-based website to share these results with the community.
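
The two coverage figures reported above (the fraction of BAC contig bases covered at least 1x, and the mean coverage) can be illustrated with a minimal Python sketch. This is not the pipeline used in this work. It assumes read alignments have already been reduced to half-open (start, end) intervals in 0-based coordinates on a single contig, and the function name coverage_stats and the toy input are hypothetical.

```python
from typing import Iterable, Tuple


def coverage_stats(contig_length: int,
                   alignments: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (fraction of bases covered >= 1x, mean depth) for one contig.

    Assumption: each alignment is a half-open interval (start, end) in
    0-based contig coordinates, as might be extracted from aligner output.
    """
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")

    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (contig_length + 1)
    for start, end in alignments:
        start = max(start, 0)             # clip to the contig boundaries
        end = min(end, contig_length)
        if start >= end:
            continue                      # skip empty or out-of-range intervals
        delta[start] += 1
        delta[end] -= 1

    depth = 0          # running per-base depth
    covered = 0        # number of bases with depth >= 1
    total_depth = 0    # sum of per-base depths
    for position in range(contig_length):
        depth += delta[position]
        if depth > 0:
            covered += 1
        total_depth += depth

    return covered / contig_length, total_depth / contig_length


if __name__ == "__main__":
    # Toy example: a 10-base contig with two overlapping reads.
    breadth, mean_depth = coverage_stats(10, [(0, 4), (2, 8)])
    print(f"covered >= 1x: {breadth:.1%}, mean depth: {mean_depth:.1f}x")
```

With real data, the per-contig counts of covered bases and summed depth would be added up across all BAC contigs before dividing by their combined length. That is one plausible way to arrive at single summary values like those quoted above.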