January 12-16, 2008
Town & Country Convention Center
San Diego, CA
Nicole L. Quinn¹, Natasha Levenkova², Pascal Bouffard², William Chow¹, Tom Jarvie², Krzysztof P. Lubieniecki¹, Tim Harkins³, Brian Desany², Ben F. Koop⁴, William S. Davidson¹
The salmonid fishes are of great economic and social importance because they support commercial and sport fisheries and a large aquaculture industry. In addition, the salmonids are sentinel species for aquatic ecosystems. The current genomic resources for Atlantic salmon include BAC libraries, a BAC-based physical map, a genetic map with >1,000 markers, >200,000 BAC-end sequences covering ~3.5% of the genome, and >436,000 ESTs. These provide a solid foundation for sequencing and annotating the salmon genome. Sanger technology has set the gold standard for sequence quality; however, its limitations in cost, labor and speed have fueled the demand for new sequencing approaches. Sequencing projects using new technologies rely heavily on a “guide sequence” to facilitate assembly. The genome of Atlantic salmon is both large (~3 × 10⁹ bp) and complex (>40% repetitive). This, combined with the lack of a closely related fish reference genome sequence, makes sequencing the Atlantic salmon genome challenging. We examined the feasibility of using 454 pyrosequencing to obtain a full sequence of a salmonid genome by sequencing eight pooled BACs belonging to a well-defined minimum tiling path covering ~1 Mb. The 454-generated data (average read length 248.5 bp) provided ~35× coverage and allowed gene identification. Adding 126 BAC-end sequences improved the assembly, increasing the N50 contig size from 11,497 to 13,455 bp. We are currently integrating 454-generated long-read paired-end sequences into the data, with the expectation that this will further reduce the number of sequence contigs while increasing their average size.
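For readers unfamiliar with the two assembly metrics quoted above, the following is a minimal Python sketch of how fold coverage and N50 contig size are conventionally computed. The function names (`fold_coverage`, `reads_for_coverage`, `n50`) and the toy contig lengths are illustrative assumptions, not part of the authors' pipeline, and the read count derived in the usage note is a back-of-envelope estimate from the reported figures, not a reported result.

```python
def fold_coverage(n_reads, mean_read_len, target_len):
    """Average depth: total sequenced bases divided by target length."""
    return n_reads * mean_read_len / target_len


def reads_for_coverage(coverage, mean_read_len, target_len):
    """Number of reads needed to reach a given average depth."""
    return coverage * target_len / mean_read_len


def n50(contig_lengths):
    """N50: length of the contig at which the running total, taken from
    the longest contig downward, first reaches half of all assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


if __name__ == "__main__":
    # Reported values: ~35x coverage, 248.5 bp mean read, ~1 Mb target.
    # The implied read count is an estimate only (assumption, not from the abstract).
    print(round(reads_for_coverage(35, 248.5, 1_000_000)))

    # Hypothetical contig lengths, purely to illustrate the N50 definition.
    print(n50([10, 20, 30, 40]))
```

Because N50 depends on the length distribution rather than the contig count alone, joining contigs (for example with BAC-end or paired-end links) can raise it even when the total assembled length is unchanged.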