PAG-XIV  Plant & Animal Genomes XIV Conference

January 14-18, 2006
Town & Country Convention Center
San Diego, CA



Poster: Genome Sequencing & ESTs


P13

Analysis Of The Complete Chloroplast Genome Of Gossypium hirsutum

C Kaittanis1, S-B Lee1, J Hostetler2, C D Town2, J Tallon2, R K Jansen3, H Daniell1

1  University of Central Florida, Dept. of Molecular Biology & Microbiology, Biomolecular Science, Building #20, Orlando, FL 32816-2364
2  The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850
3  Section of Integrative Biology and Institute of Cellular and Molecular Biology, Patterson Laboratories 141, University of Texas, Austin TX 78712

Cotton is a major agronomic crop with an annual retail value of about $120 billion. The complete cotton chloroplast genome is 160,301 bp long, with a pair of 25,608 bp inverted repeats separated by a small and a large single-copy region of 20,269 bp and 88,816 bp, respectively. There are 30 direct repeats and 24 palindromes that are 30 bp or longer and have a sequence identity of at least 90%. Most of the direct repeats lie within intergenic spacer regions, intron sequences and ycf2. Interestingly, a 72 bp direct repeat is found in the psaA and psaB exons, and a shorter 32 bp direct repeat is found in trnS-GCU and trnS-UGA. The cotton chloroplast genome contains 112 unique genes and 19 genes duplicated within the inverted repeat regions, giving a total of 131 genes. There are 4 ribosomal RNA genes and 30 distinct tRNA genes; 7 of the tRNA genes are duplicated in the inverted repeat region. There are 17 intron-containing genes: 15 contain one intron and the remaining two contain two introns. Gene order in cotton is identical to that of tobacco, but the cotton genome lacks rpl22 and the pseudogene infA. Comparison of protein-coding sequences with expressed sequence tags (ESTs) showed that rps16, rpl2, rpoC2, rps4 and ycf1 have 100% sequence identity with their respective ESTs. In contrast, ndhC, rpl23, rpl20, rps3 and clpP contain nucleotide substitutions relative to the ESTs that result in amino acid changes, although their sequence identity remains above 98%. Finally, phylogenetic analysis of the protein-coding genes is being performed to assess the relationship of cotton to the 30 other sequenced angiosperm chloroplast genomes.
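
The region lengths and gene counts quoted above can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the authors' analysis; the constant and function names are illustrative assumptions, and only the numbers are taken from the abstract. It assumes the usual quadripartite layout, in which the genome consists of the large single-copy region, the small single-copy region and two copies of the inverted repeat.

```python
# Figures quoted in the abstract (all lengths in bp).
GENOME_LENGTH = 160_301
INVERTED_REPEAT = 25_608      # length of ONE copy; the genome carries two
SMALL_SINGLE_COPY = 20_269
LARGE_SINGLE_COPY = 88_816

UNIQUE_GENES = 112
DUPLICATED_GENES = 19         # extra copies located in the inverted repeats
TOTAL_GENES = 131


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome made of LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def region_fractions(lsc: int, ssc: int, ir: int) -> dict:
    """Fraction of the genome occupied by each region (both IR copies pooled)."""
    total = quadripartite_length(lsc, ssc, ir)
    return {
        "LSC": lsc / total,
        "SSC": ssc / total,
        "IR (both copies)": 2 * ir / total,
    }


if __name__ == "__main__":
    total = quadripartite_length(
        LARGE_SINGLE_COPY, SMALL_SINGLE_COPY, INVERTED_REPEAT
    )
    print("Summed region length:", total, "bp")
    print("Matches reported genome length:", total == GENOME_LENGTH)
    print(
        "Gene counts consistent:",
        UNIQUE_GENES + DUPLICATED_GENES == TOTAL_GENES,
    )
    fractions = region_fractions(
        LARGE_SINGLE_COPY, SMALL_SINGLE_COPY, INVERTED_REPEAT
    )
    for region, fraction in fractions.items():
        print(f"{region}: {fraction:.1%}")
```

Both checks hold for the reported figures: the four regions sum to the stated genome length, and the unique plus duplicated genes sum to the stated total.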