January 12-16, 2002
Town & Country Convention Center
San Diego, CA
Bioinformatics: Databases
Short tandem nucleotide repeat (STNR) sequences are found throughout the genomes of higher organisms. Among these, protein-binding STNRs in promoters play an important role in the regulation of gene expression. For example, of 22 PHO-regulated genes revealed by DNA microarray analysis of the whole yeast genome, 21 have promoter regions containing at least one of the two core Pho4p binding sites, CACGTG and CACGTT. Computational tools would offer a rapid, economical, and exhaustive means of surveying whole plant genomes for genes containing these STNRs. However, existing public pattern-matching tools do not allow researchers to robustly locate the genes whose promoter regions contain specific STNRs. To provide the community with a user-friendly tool for this purpose, we have been developing a search engine called ACMES (Advanced Content Matching Engine for Sequences) that provides query methods for finding repetitive sequences. A multi-tree indexing mechanism is implemented in ACMES to achieve both accuracy and efficiency in repetitive sequence retrieval. This indexing mechanism can accept query repetitive sequences of virtually any length in base pairs. For each hit, the retrieval results report the nearest ORF and provide a visualization of the retrieved repetitive sequences in those genes. ACMES also has data mining components that suggest to the user potential repetitive sequences that have not yet been discovered or studied. ACMES will serve as a valuable tool for the study of genes containing specific STNRs that are related to gene function.
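The kind of promoter survey described above can be illustrated with a minimal sketch. It assumes the promoter sequences have already been extracted as strings keyed by gene name, and it uses a plain k-mer dictionary rather than the multi-tree index of ACMES, whose details are not given here. The function names and the toy sequences are hypothetical, and the sketch scans the forward strand only.

```python
from collections import defaultdict


def build_kmer_index(promoters, k):
    """Map every k-mer to the (gene, position) pairs where it occurs."""
    index = defaultdict(list)
    for gene, seq in promoters.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((gene, i))
    return index


def genes_with_motifs(promoters, motifs):
    """Return {gene: [(motif, position), ...]} for motifs of equal length."""
    k = len(motifs[0])
    if any(len(m) != k for m in motifs):
        raise ValueError("all motifs must have the same length")
    index = build_kmer_index(promoters, k)
    hits = {}
    for motif in motifs:
        # .get avoids creating empty entries in the defaultdict
        for gene, pos in index.get(motif.upper(), []):
            hits.setdefault(gene, []).append((motif, pos))
    return hits


# Hypothetical toy promoter sequences, not real genomic data.
promoters = {
    "geneA": "TTACACGTGAAT",
    "geneB": "GGGCACGTTCCC",
    "geneC": "ATATATATATAT",
}

# The two core Pho4p binding sites named in the abstract.
hits = genes_with_motifs(promoters, ["CACGTG", "CACGTT"])
for gene, found in sorted(hits.items()):
    print(gene, found)
```

Building the index once makes repeated queries cheap, which is the general motivation for indexing a genome before searching it. A production tool would also need to handle the reverse strand and queries of varying length.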