PAG-XIII  Plant & Animal Genomes XIII Conference

January 15-19, 2005
Town & Country Convention Center
San Diego, CA



P821 : Databases


Technical Challenges In The Integration Of Microarray Data Into A Biological Information Resource

Mary Montoya1, Neil Miller1, Danforth Weems1, Margarita Garcia-Hernandez2, Nick Moseyko2, Seung Y Rhee2, Eva Huala2

1  National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM 87505 USA
2  Carnegie Institution of Washington, Department of Plant Biology, 260 Panama Street, Stanford, CA 94305 USA

As the quantity of available gene expression data grows, biological information resources will increasingly face the task of integrating microarray data into their existing data models and tools. Over the last two years, The Arabidopsis Information Resource (TAIR, http://arabidopsis.org) has devoted significant effort to storing Arabidopsis gene expression data and integrating it with the other genomic data in the TAIR database. Storing microarray data presents a number of distinct technical challenges. Some of these stem from the complexity of the data, which creates conceptual difficulties both in database design and in establishing robust formats for data exchange. Perhaps more significant, though, are the size and scale of microarray results data, which affect the hardware and developer resources needed to handle the data, the methods that can be used to load and manipulate it, the kinds of searches that can be supported, the interfaces required to display it, and the normalization methods that are feasible. We will present these challenges and some of the solutions the TAIR team has employed in confronting them.