PAG-XI  Plant & Animal Genomes XI Conference

January 11-15, 2003
Town & Country Convention Center
San Diego, CA


Bioinformatics: Databases
           Computer: Poster and Demo


P886

LETTUCE/SUNFLOWER EST CGPDB PROJECT. DATA ANALYSIS, ASSEMBLY VISUALIZATION AND VALIDATION.

Alexander Kozik, Brian Chan, Richard Michelmore

Department of Vegetable Crops, University of California at Davis, CA 95616.

Over 60,000 lettuce and 40,000 sunflower ESTs from multiple libraries have been assembled using the CAP3 program and organized into the Compositae Genome Project database (http://cgpdb.ucdavis.edu/). This assembly represents about 19,000 lettuce and 12,000 sunflower unigenes. MySQL (http://www.mysql.com/) was chosen as an efficient tool to manage the data. Custom PHP and Python programs were developed, alongside the publicly available phpMyAdmin software, to manipulate the data and visualize the assemblies. Because the ESTs were generated from different genotypes representing the mapping parents of lettuce and sunflower, we implemented a new approach to discover possible polymorphisms. About 250 insertions/deletions (INDELs) and 2,500 substitutions (SNPs) have been discovered in the lettuce and sunflower assemblies using custom Python scripts. Wet-lab experiments confirmed the predicted polymorphisms in ~90% of cases. A new clustering algorithm was used to find putative COS (conserved ortholog set) markers. About 1,200 lettuce and 500 sunflower putative COS markers have been identified based on clustering analysis with the complete Arabidopsis genome. EST assemblies have been analyzed for multidomain proteins, possible chimeric clones and misassembled contigs using graph theory and the custom Graph9 program. Clusters of multigene families have been visualized using the PhyloGrapher program (http://cgpdb.ucdavis.edu/PhyloGrapher/). The decisions underlying our strategy of combining existing bioinformatics software with newly developed programs to analyze the lettuce/sunflower EST collection will be discussed in detail.
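
The abstract does not describe how the polymorphism-discovery scripts work, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that the reads of one contig are already aligned to equal length (with "-" marking gaps) and that each read is tagged with its source genotype. A column is reported as a candidate when every genotype is internally consistent but the genotypes disagree with each other. The function name, the input layout and the min_reads_per_genotype threshold are all hypothetical.

```python
from collections import defaultdict


def find_candidate_polymorphisms(aligned_reads, min_reads_per_genotype=2):
    """Scan one contig alignment for columns that differ between genotypes.

    aligned_reads: list of (genotype, gapped_sequence) tuples; all sequences
    are assumed to have the same length, with '-' for alignment gaps and
    'N' or ' ' for positions that are ambiguous or not covered by the read.
    Returns a list of (column, kind, {genotype: allele}) tuples.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0][1])
    candidates = []
    for col in range(length):
        alleles = defaultdict(set)   # genotype -> alleles seen in this column
        counts = defaultdict(int)    # genotype -> number of informative reads
        for genotype, seq in aligned_reads:
            base = seq[col].upper()
            if base in ("N", " "):
                continue  # ambiguous or uncovered position: ignore this read
            alleles[genotype].add(base)
            counts[genotype] += 1
        if len(alleles) < 2:
            continue  # need at least two genotypes to compare
        if any(len(seen) != 1 for seen in alleles.values()):
            continue  # disagreement within a genotype: likely sequencing error
        if any(counts[g] < min_reads_per_genotype for g in alleles):
            continue  # too little read support (hypothetical threshold)
        per_genotype = {g: next(iter(seen)) for g, seen in alleles.items()}
        distinct = set(per_genotype.values())
        if len(distinct) < 2:
            continue  # all genotypes agree: no polymorphism
        kind = "INDEL" if "-" in distinct else "SNP"
        candidates.append((col, kind, per_genotype))
    return candidates
```

Requiring agreement within each genotype and a minimum number of supporting reads is one simple way to separate true allelic differences from single-read sequencing errors; the thresholds actually used in the project are not stated in the abstract.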

