PAG-XIX  Plant & Animal Genomes XIX Conference

January 15-19, 2011
Town & Country Convention Center
San Diego, CA



P038: Genome Sequencing & ESTs


The Amborella Genome Project: Generating A Reference Sequence For Angiosperm Evolutionary Analysis

Srikar Chamala1,2, Brandon Walts1,2,9, Victor Albert3, Claude dePamphilis4, Joshua Der4, James Estill5, Jim Leebens-Mack5, Seunghee Lee6, Hong Ma4,11, Steve Rounsley10, Stephan Schuster8, Doug Soltis1, Pam Soltis1,7, Lynn Tomsho8, Sue Wessler5, Rod Wing6,10, Yeisoo Yu6, W. Brad Barbazuk1,2

1  Department of Biology, University of Florida, Gainesville, FL 32611, USA
2  Genetics Institute, University of Florida, Gainesville, FL 32610, USA
3  Department of Biological Sciences, University at Buffalo (SUNY), Buffalo, NY 14260, USA
4  Intercollege Graduate Degree Program in Plant Biology and Institute of Molecular Evolutionary Genetics, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
5  Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
6  Arizona Genomics Institute, University of Arizona, Tucson, AZ 85721, USA
7  Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
8  Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
9  Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32610, USA
10  School of Plant Sciences and BIO5 Institute for Collaborative Research, University of Arizona, Tucson, AZ 85721, USA
11  State Key Laboratory of Genetic Engineering and School of Life Sciences, Institute of Plant Biology, Center for Evolutionary Biology, Fudan University, Shanghai 200433, China

Amborella trichopoda, the sister to all other extant angiosperms, occupies a crucial evolutionary position among seed plants, and its genome sequence is an important reference for comparative genomic studies across the angiosperms. It will help in understanding the evolution of key angiosperm traits and provide a baseline for examining genome organization throughout the angiosperms. We are using a whole-genome shotgun strategy to sequence the ~870-980 Mbp Amborella genome on Roche’s Genome Sequencer FLX system. To date, 25.5 GS FLX sequencing runs have been analyzed, providing ~14x coverage. Sequence data are being evaluated for quality, coverage, and contamination. Sequence quality is good, with a median Phred score >36 for the first 50 bases of all reads. Sequence coverage is being evaluated by aligning the reads against available Amborella BAC contigs, BAC end sequences, and unigene sequences; 98.5% of BAC contig bases are covered at least 1x, with a mean coverage of 18.2x. Reads are being screened with BLAST and Mosaik for contamination from bacterial, insect, fungal, and human DNA, and for Amborella mitochondrial and chloroplast DNA. Consistently, ~11% and ~1.9% of the sequences match the Amborella mitochondrial and chloroplast genomes, respectively, and less than 1% match non-Amborella sequence. We are starting an initial de novo sequence assembly using Newbler and will refine the assembly using physical map data from a library of 36,864 end-sequenced BAC clones. As assembled contigs become available, we will annotate them using DAWGPAWS and TWINSCAN. We are developing a GBrowse-based website to share these results with the community.
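
The two coverage figures reported for the BAC contigs (fraction of bases covered at least 1x, and mean coverage) can be illustrated with a minimal sketch. This is not the project's pipeline: the function name coverage_stats and the input format (0-based, half-open alignment intervals on a single reference sequence) are assumptions made here for illustration, and a real analysis would read these intervals from the aligner's output.

```python
from typing import Iterable, Tuple


def coverage_stats(ref_len: int,
                   alignments: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (fraction of bases covered >= 1x, mean depth) for one reference.

    Hypothetical helper: `alignments` holds 0-based, half-open (start, end)
    intervals giving where each read aligns on the reference.
    """
    if ref_len <= 0:
        raise ValueError("ref_len must be positive")

    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (ref_len + 1)
    for start, end in alignments:
        start = max(start, 0)       # clip intervals to the reference
        end = min(end, ref_len)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    depth = 0          # running per-base depth
    covered = 0        # bases with depth >= 1
    total_depth = 0    # sum of per-base depths
    for i in range(ref_len):
        depth += diff[i]
        if depth > 0:
            covered += 1
        total_depth += depth

    return covered / ref_len, total_depth / ref_len


# Toy example: a 10-base reference with two overlapping 5-base reads.
breadth, mean_depth = coverage_stats(10, [(0, 5), (3, 8)])
print(f"covered >= 1x: {breadth:.1%}, mean depth: {mean_depth:.1f}x")
```

In the toy example, 8 of the 10 bases are covered and the 10 aligned bases average out to 1.0x. The same two quantities, computed over all BAC contig bases, correspond to the 98.5% and 18.2x figures quoted in the abstract.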