January 14-18, 2006
Town & Country Convention Center
San Diego, CA
Karen F Barlow, Sarah Sims, Christine Nicholson, Sean Humphray
The Wellcome Trust Sanger Institute has committed to sequencing and finishing the gene-rich region on chromosome 4 of the tomato. The gene-rich regions are estimated to comprise 19 Mb of sequence, and we expect this to be represented in approximately 193 BACs. The BAC clones will be processed through our sequencing pipeline, which will be described.
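As a rough check on these figures, the average amount of unique sequence each BAC must contribute can be worked out directly. This is a minimal sketch using only the two numbers quoted above; the function name is illustrative, and the result ignores clone overlaps, so actual insert sizes would be somewhat larger.

```python
# Figures quoted in the abstract.
GENE_RICH_BP = 19_000_000   # 19 Mb of gene-rich sequence
EXPECTED_BACS = 193         # approximate number of BACs in the tile path


def mean_unique_span(total_bp: int, n_clones: int) -> float:
    """Average non-overlapping sequence contributed per clone, in bp."""
    return total_bp / n_clones


span = mean_unique_span(GENE_RICH_BP, EXPECTED_BACS)
print(f"{span / 1000:.1f} kb of unique sequence per BAC")
```

The result, a little under 100 kb per clone, is the average unique contribution implied by the two estimates rather than a measured insert size.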
A tiling path of minimally overlapping clones will be selected from the fingerprint map of the genome (see C. Nicholson, this meeting). DNA is prepared from each clone after streaking to a single colony and, once the fingerprint digest of the DNA has been checked for clone identity and integrity, a shotgun library of 2-4 kb fragments is produced in pUC19 from each BAC clone. The plasmid DNA is then prepared using a semi-automated alkaline lysis procedure and sequenced with Applied Biosystems Big Dye Terminator chemistry. The sequencing reactions are analysed on AB 3730 automated sequencing instruments. After 6- to 8-fold sequence coverage has been generated, the sequence is assembled and the clones undergo an automated round of primer walking to extend and join contigs, before being re-assembled with PHRAP (P. Green) and passing into a directed finishing stage.
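The selection of a minimally overlapping tiling path can be illustrated as a greedy interval cover. This is a sketch only, not the procedure used at the Sanger Institute: it assumes each clone has already been placed on the map with a start and end coordinate, and the Clone type, the min_overlap parameter and the example coordinates are all hypothetical.

```python
from typing import List, NamedTuple


class Clone(NamedTuple):
    name: str
    start: int   # hypothetical map coordinate of the clone's left end
    end: int     # hypothetical map coordinate of the clone's right end


def tiling_path(clones: List[Clone], min_overlap: int = 0) -> List[Clone]:
    """Greedy interval cover: fewest clones spanning one contig."""
    if not clones:
        return []
    # Leftmost clone first; among equal starts, the longest one.
    ordered = sorted(clones, key=lambda c: (c.start, -c.end))
    path = [ordered[0]]
    target = max(c.end for c in clones)
    while path[-1].end < target:
        current = path[-1]
        # Clones that overlap the current one by at least min_overlap
        # and extend the covered region further to the right.
        candidates = [
            c for c in ordered
            if c.start <= current.end - min_overlap and c.end > current.end
        ]
        if not candidates:
            raise ValueError(f"gap in map after clone {current.name}")
        # Take the candidate that reaches furthest.
        path.append(max(candidates, key=lambda c: c.end))
    return path


# Made-up example contig, for illustration only.
example = [
    Clone("A", 0, 100),
    Clone("B", 20, 130),
    Clone("C", 90, 200),
    Clone("D", 120, 210),
    Clone("E", 190, 300),
]
path = tiling_path(example, min_overlap=5)
print([c.name for c in path])
```

Picking the furthest-reaching overlapping clone at each step keeps the number of clones to a minimum, while min_overlap guarantees enough shared sequence between neighbours to join them after finishing.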
Finishers use GAP4 (R. Staden) and a suite of customised software tools to view and edit the data in order to re-assemble the clone sequence correctly. This software is also used to check the quality of the assembled regions prior to final submission. Clones are tracked throughout the sequencing pipeline in an Oracle database. Assembled sequences are submitted to the HTGS division of EMBL/GenBank/DDBJ, and the finished sequence is deposited in EMBL.