PAG-X  Plant, Animal & Microbe Genomes X Conference

January 12-16, 2002
Town & Country Convention Center
San Diego, CA


Bioinformatics: Databases
            


BUILDING AN INEXPENSIVE BEOWULF COMPUTER CLUSTER AND PRELIMINARY DATA ANALYSIS FROM THE COMPOSITAE GENOME PROJECT

Dean O. Lavelle1, Alexander Kozik1, Richard W. Michelmore1

1 University of California, Davis, Dept. of Vegetable Crops, Davis, CA 95616, USA

Gene discovery and candidate-gene mapping are two goals of the Compositae Genome Project (CGP). The CGP focuses on two agriculturally important species: Lactuca spp. (lettuce) and Helianthus annuus (sunflower). Detailed maps of lettuce and sunflower will be generated using sequence-based Single Nucleotide Polymorphism (SNP) genetic markers. Syntenic comparisons among lettuce, sunflower, and Arabidopsis hold great potential for transferring both genotypic and phenotypic information from Arabidopsis to these understudied crops. For this purpose, 100,000 cDNAs will be sequenced from the 5’ end to generate ESTs, 25,000 from each of the four parents of two mapping populations: wild Lactuca serriola 92G489 x cultivated Lactuca sativa cv. Salinas, and Helianthus annuus CMS HA89 x wild Helianthus annuus ANN-1238. Annotation and SNP discovery will require a considerable amount of computation. To meet this need for computing power, we have built a small, low-cost (less than $10,000), 10-processor cluster of computers running the Scyld Beowulf cluster operating system (http://www.scyld.com). We will describe how we built this inexpensive cluster and how we are using the FASTA program to analyze our sequences. We will present a preliminary analysis of the first 6,000 EST sequences from this project to demonstrate the effectiveness of such a cluster. A web page will be established to assist others who wish to build a similar Beowulf cluster.
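
The abstract does not say how sequences are divided among the processors. As an illustration only, one simple way to spread a set of EST sequences over a 10-processor cluster is to parse the FASTA-format input and deal the records out in round-robin order, so that each processor receives a batch of nearly equal size. The sketch below is a minimal Python example under that assumption; the function names parse_fasta and split_round_robin are hypothetical and are not taken from the CGP pipeline, the FASTA program, or Scyld Beowulf.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_round_robin(records: Iterable[Record], n_batches: int) -> List[List[Record]]:
    """Deal records into n_batches lists; batch sizes differ by at most one."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    batches: List[List[Record]] = [[] for _ in range(n_batches)]
    for i, record in enumerate(records):
        batches[i % n_batches].append(record)
    return batches


if __name__ == "__main__":
    # Hypothetical input: 25 short made-up sequences, 10 processors assumed.
    demo = "".join(f">est{i}\nACGT\nTTGA\n" for i in range(25))
    batches = split_round_robin(parse_fasta(demo), 10)
    print([len(b) for b in batches])
```

Under this scheme, 6,000 ESTs on 10 processors would give 600 sequences per batch, and each batch could then be written to its own file and searched independently. Round-robin assignment is chosen here only because it needs no knowledge of sequence lengths; a real workload might balance by total residues instead.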

