Poster: Bioinformatics
The SWISS-PROT group at the European Bioinformatics Institute provides a suite of key resources for the analysis of completed proteomes.

Sequences: Our trio of protein sequence databases - SWISS-PROT, TrEMBL and TrEMBLnew - covers all publicly known protein sequences, grouped by their annotation status. All incoming data enters TrEMBLnew. It is then made non-redundant, automatically annotated and stored in TrEMBL, and finally manually annotated, triple-checked and released into SWISS-PROT as finished entries.

Families and domains: In collaboration with PROSITE, PRINTS, Pfam and ProDom, we provide InterPro, an annotated resource of protein families, domains and functional sites. The recognition patterns it contains are increasingly used to analyse newly sequenced genomes.

Clusters: We have launched CluSTr, a database of clusters of SWISS-PROT + TrEMBL proteins based on pairwise sequence similarity. As of October 2000, it covers all plant proteins and the three completed eukaryotic proteomes. Recognising that some protein families are strictly conserved while others are not, we designed CluSTr as a multi-level database in which the user can pick the required stringency.

Proteome analysis pages: As the number of completely sequenced genomes is growing at an ever-increasing rate, we have set up a collection of web pages for each completely sequenced organism. They provide easy access to the complete set of protein sequences, statistics on primary, secondary and tertiary structure data, InterPro-based family and domain information, a list of interesting clusters, a rough functional classification based on the Gene Ontology, and a tool to compare each organism's proteome with every other finished proteome.

All data is available at the EBI website, www.ebi.ac.uk.
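To illustrate the multi-level idea behind the cluster database described above, here is a minimal sketch of clustering from pairwise similarity scores at several stringency levels. The abstract does not state which clustering algorithm CluSTr uses, so single-linkage clustering is used here purely as a stand-in assumption. The protein names, scores and thresholds are hypothetical toy values, and `cluster_at` and `multilevel` are illustrative helper names, not part of any EBI software.

```python
def cluster_at(scores, proteins, threshold):
    """Single-linkage clusters: join any pair scoring >= threshold.

    Assumption: single linkage is a stand-in, not CluSTr's documented method.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), score in scores.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(members) for members in groups.values())


def multilevel(scores, proteins, thresholds):
    """Compute one clustering per stringency level."""
    return {t: cluster_at(scores, proteins, t) for t in sorted(thresholds)}


# Hypothetical toy data: pairwise similarity scores between four proteins.
proteins = ["P1", "P2", "P3", "P4"]
scores = {("P1", "P2"): 900, ("P2", "P3"): 400, ("P3", "P4"): 120}

levels = multilevel(scores, proteins, [100, 300, 800])
for threshold, clusters in levels.items():
    print(threshold, clusters)
```

At a low threshold all four toy proteins fall into one cluster, and raising the threshold splits them into progressively tighter groups. This mirrors the choice of stringency that a user makes when a strictly conserved family and a loosely conserved one need different cut-offs.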