Figure 1 - Structure of Cathepsin
The eFamily project is designed to integrate the information contained in five of the major protein databases. The member databases are CATH, SCOP, MSD, InterPro, and Pfam. More details about the individual databases can be found under the partners section. Put simply, CATH and SCOP contain information about domains based on 3D structural data, while InterPro and Pfam contain information about domains based on protein sequences. The MSD database, as well as being the European arm of the PDB, is the primary data warehouse for the integration of structure and sequence information. Figure 1 is an illustration of the structure of the protein Cathepsin. Figure 2 is an example of a Cathepsin protein sequence alignment. Although these different perspectives on proteins are related, it is often difficult for biologists to navigate from protein sequence to protein structure and back again. The aim of this project is to provide the scientific community with a coherent and rich view of protein families that allows users to navigate seamlessly between the worlds of protein structure and protein sequence, through improved data resources and integration. Rather than replicating these five different databases at each site, the databases can be considered to be in a "Grid" architecture.
Figure 2 - Sequence alignment of Cathepsin
So what are Grids? Grids are "super Internets" for high-performance computing: worldwide collections of high-end resources, such as supercomputers, storage, advanced instruments and immersive environments. These resources and their users are often separated by great distances and connected by high-speed networks.
Figure 3 - A typical user of eFamily
As eFamily is a collection of databases, the project is a "Data Grid". As part of the eScience initiative, our aim is to use the emerging Grid technologies to build powerful science applications for our users (see Figure 3). However, we must always bear in mind that our users are, more often than not, laboratory based, with only limited computing experience and facilities. Therefore, lightweight applications are our goal, with the computer science taken care of within the application.



