doi:10.1038/npre.2009.3103.1

The MEROPS Database

Neil D. Rawlings (1) and Alan J. Barrett (1)

  1. Wellcome Trust Sanger Institute
Document Type:
Poster
Date:
Received 20 April 2009 13:39 UTC; Posted 20 April 2009
Subjects:
Bioinformatics
Abstract:

Many proteins undergo important post-translational proteolytic processing to remove targeting signals and activation peptides, and most proteins undergo proteolytic inactivation and catabolism. The enzymes that hydrolyse the peptide bonds in proteins and peptides are known as peptidases, proteases or proteolytic enzymes. The MEROPS database (http://merops.sanger.ac.uk) presents the classification and nomenclature of peptidases, their inhibitors and substrates. In 1993 we proposed the scheme for the classification of peptidases that has been internationally accepted, and in 1996 we established the MEROPS database. Protein inhibitors have been included in the database since 2004. About 2% of the genes in a genome encode peptidase homologues, and a further 1% encode protein inhibitors. For example, the human genome has 1037 genes encoding peptidase homologues (of which 643 are known or predicted to be active peptidases) and 433 protein inhibitor genes (of which 144 have been biochemically characterized as inhibitors).
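The proportions implied by the human genome counts above can be worked out directly. The following is a minimal sketch that uses only the four figures quoted in the abstract; the function name `percent` is purely illustrative.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts quoted in the abstract for the human genome.
peptidase_homologues = 1037
active_peptidases = 643        # known or predicted to be active
inhibitor_genes = 433
characterized_inhibitors = 144  # biochemically characterized

active_share = percent(active_peptidases, peptidase_homologues)
characterized_share = percent(characterized_inhibitors, inhibitor_genes)

print(f"Active peptidases among homologues: {active_share}%")
print(f"Characterized inhibitors among inhibitor genes: {characterized_share}%")
```

On these figures, roughly three in five human peptidase homologues are known or predicted to be active, and about one in three protein inhibitor genes has been biochemically characterized.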

The MEROPS classification is hierarchical. Sequences are grouped into a protein species (each of which is given a unique identifier, for example C01.060 for cathepsin B); protein species are grouped into a family (for example C1); and families are grouped into a clan (for example CA). To be included in the same protein species, sequences must be derived from the same node on a dendrogram derived from the family sequence alignment and be known (or predicted) to share similar specificity. To be included in the same family, sequences must be homologous over the sequence domain that contains the active site residues (peptidases) or reactive site (inhibitors). To be included in the same clan, the proteins must share similar tertiary structures (or the same linear arrangement of active site residues if the structure is unknown). Over 117,000 peptidase homologues are classified into 3114 protein species, 205 families and 52 clans, and 12,104 protein inhibitors are classified into 663 protein species, 64 families and 33 clans.
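The three levels of the hierarchy can be illustrated with a small Python sketch that walks from an identifier up to its family and clan. It rests on assumptions that go beyond the abstract: that an identifier has the shape of the C01.060 example (a letter, a two-digit family number, a point and a three-character suffix), and that the family name is the same prefix without zero padding. The names `parse_merops_id` and `FAMILY_TO_CLAN` are hypothetical, and the lookup table holds only the single family-to-clan pair given in the text; a real table would have to come from the database itself.

```python
import re
from typing import NamedTuple, Optional

# Assumed identifier shape, modelled on the C01.060 example in the text:
# one letter, a two-digit family number, a point, a three-character suffix.
_IDENTIFIER = re.compile(r"^([A-Z])(\d{2})\.([0-9A-Z]{3})$")

# Only the example from the abstract (family C1 belongs to clan CA).
FAMILY_TO_CLAN = {"C1": "CA"}


class Classification(NamedTuple):
    identifier: str      # protein species level, e.g. "C01.060"
    family: str          # family level, e.g. "C1"
    clan: Optional[str]  # clan level, or None if not in the lookup table


def parse_merops_id(identifier: str) -> Classification:
    """Derive the family (and clan, if known) from a species identifier."""
    match = _IDENTIFIER.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized identifier: {identifier!r}")
    letter, number, _suffix = match.groups()
    family = f"{letter}{int(number)}"  # "C" + "01" -> "C1"
    return Classification(identifier, family, FAMILY_TO_CLAN.get(family))


cathepsin_b = parse_merops_id("C01.060")
print(cathepsin_b.family, cathepsin_b.clan)
```

Identifiers that do not follow the assumed shape are rejected rather than guessed at, since the abstract does not describe the full identifier syntax.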

The database includes manually curated summaries for each clan, family and protein species. There are also sequence alignments and manually curated bibliographies (with over 41,000 references) at every level. In addition to protein inhibitors, we include 158 manually curated summaries for synthetic and naturally occurring small molecule inhibitors. There is also a summary page for each organism, listing all known homologues, and, for organisms with a completely sequenced genome, an analysis highlighting significant presences, absences or gene family expansions.

The MEROPS database includes known peptidase substrates: naturally occurring peptides and proteins, and synthetic substrates. Currently there are 4091 cleavages of synthetic substrates and 95,413 cleavages of proteins (of which 74,740 are physiological). Cleavages in proteins are mapped to UniProt entries. An alignment of very close homologues of each substrate sequence is shown, with the residues around each cleavage site highlighted to indicate whether or not the peptidase is known to accept the amino acid at that position. Cleavage sites that are conserved are likely to be physiological; cleavage sites that are not conserved may be either pathological for the species in which they occur or coincidental.
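As an illustration of what "residues around each cleavage site" means in practice, the sketch below extracts a window of residues on either side of a cleaved bond in a substrate sequence. It labels positions with the conventional Schechter and Berger notation (P4 to P1 on the N-terminal side of the bond, P1' to P4' on the C-terminal side), which the abstract does not mention explicitly. The function name `cleavage_window`, the window size of four and the toy sequence are all assumptions made for the example, not details of the database.

```python
from typing import Dict


def cleavage_window(sequence: str, p1_position: int, flank: int = 4) -> Dict[str, str]:
    """Return the residues flanking a cleaved bond, keyed by position label.

    p1_position is the 1-based position of the residue immediately
    N-terminal to the cleaved bond (the P1 residue). Positions that fall
    beyond either end of the sequence are reported as "-".
    """
    if not 1 <= p1_position < len(sequence):
        raise ValueError("the cleaved bond must lie inside the sequence")

    def residue_at(index: int) -> str:
        # index is 0-based; guard against Python's negative-index wraparound
        return sequence[index] if 0 <= index < len(sequence) else "-"

    window: Dict[str, str] = {}
    for k in range(flank, 0, -1):      # P4, P3, P2, P1
        window[f"P{k}"] = residue_at(p1_position - k)
    for k in range(1, flank + 1):      # P1', P2', P3', P4'
        window[f"P{k}'"] = residue_at(p1_position + k - 1)
    return window


# Toy sequence, not a real substrate: cleavage between residues 5 and 6.
example = cleavage_window("ACDEFGHIKL", p1_position=5)
print(example)
```

Applying the same function to each homologue in an alignment would give one window per sequence, which is the kind of per-position comparison needed to judge whether a cleavage site is conserved.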

The MEROPS data is freely available to download from our FTP site (http://ftp.sanger.ac.uk/pub/MEROPS) and via our Distributed Annotation System (DAS) server (http://das.sanger.ac.uk/das/merops).

Collection:
3rd International Biocuration Conference
Presented at:
3rd International Biocuration Conference, 16 April 2009

Additional information

License:
This document is licensed to the public under the Creative Commons Attribution 3.0 License
How to cite this document:

Rawlings, Neil and Barrett, Alan. The MEROPS Database. Available from Nature Precedings <http://dx.doi.org/10.1038/npre.2009.3103.1> (2009)
