 |
The CATH database is a novel hierarchical classification of protein
domain structures, which clusters proteins at four major levels:
Class (C), Architecture (A), Topology (T) and Homologous superfamily
(H). Class, derived from secondary structure content, is assigned for
more than 90% of protein structures automatically. Architecture, which
describes the gross orientation of secondary structures, independent
of connectivities, is currently assigned manually. The topology level
clusters structures according to their topological connections and
numbers of secondary structures. The homologous superfamilies cluster
proteins with highly similar structures and functions. The assignments
of structures to topology families and homologous superfamilies are
made by sequence and structure comparisons.
 |
InterPro is a database of protein families, domains and functional
sites in which identifiable features found in known proteins can be
applied to unknown protein sequences.
 |
The Macromolecular Structure Database (MSD) supports the collection,
management and distribution of data about macromolecular structures,
derived in part from the Protein Data Bank (PDB). MSD also provides a
comprehensive mapping between protein sequences in UniProt and protein
structures in the database.
Pfam is a database of multiple sequence alignments and hidden
Markov models. There are two parts to Pfam, termed Pfam-A and
Pfam-B. Pfam-A contains over 7,500 high-quality, manually curated
protein families. Associated with each family is a description of the
family and appropriate links to other databases. The other part,
Pfam-B, is derived from ProDom and represents sequence clusters that
are not covered by Pfam-A regions. Together, Pfam-A and Pfam-B cover
approximately 95% of the UniProt database.
The SCOP (Structural Classification of Proteins) database is developed as
an evolutionary classification, in which the main focus is to place the
proteins in a coherent evolutionary framework, based on their conserved
structural features. The database aims to provide a comprehensive and
detailed description of the relationships between all proteins whose 3D
structures have been determined.
|