Book of Abstracts: Albany 2011

Conversation 17
June 14-18 2011
©Adenine Press (2010)

Towards a Spherical Coordinate System Metric for Quantitative Comparison of Protein 3D Structures

Observed protein structures generally represent energetically favorable conformations that may or may not be "functional"; even so, it is generally agreed that protein structure is closely related to protein function. Given a collection of proteins sharing a common global structure, variations in their local structures at specific, critical locations may result in different biological functions. Structural relationships among proteins are important in the study of protein evolution as well as in drug design and development.

Analysis of geometrical 3D protein structure has been shown to be effective for classifying proteins. Prior work has shown that the double-centroid reduced representation (DCRR) model (1) is a useful geometric representation of protein structure for visual modeling: it reduces the quantity of modeled information for each amino acid, yet retains the most important geometrical and chemical features of each, namely the centroid of the backbone and the centroid of the side-chain. DCRR has not yet been applied to the calculation of geometric structural similarity.
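As an illustration of the reduction described above, the following minimal Python sketch computes the two DCRR points for a single residue. It is not the authors' implementation: the function names, the input layout (atom name mapped to Cartesian coordinates), the choice of N, CA, C and O as the backbone atoms, the use of unweighted geometric centroids, and the handling of residues without side-chain atoms are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Vec = Tuple[float, float, float]

# Assumption: backbone atoms follow the usual PDB atom names.
BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def centroid(points: List[Vec]) -> Vec:
    """Unweighted geometric centroid of a non-empty list of 3D points."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def dcrr_residue(atoms: Dict[str, Vec]) -> Tuple[Vec, Optional[Vec]]:
    """Reduce one residue to (backbone centroid, side-chain centroid).

    Hypothetical helper: the side-chain centroid is None when the residue
    has no side-chain atoms in the input (e.g. glycine without hydrogens).
    """
    backbone = [xyz for name, xyz in atoms.items() if name in BACKBONE_ATOMS]
    side = [xyz for name, xyz in atoms.items() if name not in BACKBONE_ATOMS]
    if not backbone:
        raise ValueError("residue has no backbone atoms")
    return centroid(backbone), (centroid(side) if side else None)
```

Under these assumptions, each residue is represented by at most two points regardless of how many atoms it contains.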

Meanwhile, multi-dimensional indexing (MDI) of protein structure combines protein structural analysis with distance metrics to facilitate structural similarity queries and is also used for clustering protein structures into related groups. In this respect, the combination of geometric models with MDI has been shown to be effective.

Prior work, notably Distance and Density-based Protein Indexing (DDPIn) (2), applies MDI to protein models based on the geometry of the Cα backbone. DDPIn's distance metrics are based on radial and density functions that incorporate spherical-based metrics, and the indices are built from metric tree (M-tree; 3) structures.

This work combines DCRR with DDPIn to develop new DCRR centroid-based metrics: spherical binning (4) distance and inter-centroid spherical distance. The use of DCRR models will provide additional significant structural information through the inclusion of side-chain centroids. Additionally, the newly developed distance metric functions, combined with DCRR and M-tree indexing, should improve upon the performance of prior work (DDPIn) on the same data set (5), with respect to both individual k-nearest neighbor search queries and clustering of all proteins in the index.
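The abstract does not define the proposed metrics, so the following Python sketch only illustrates the general idea of comparing structures through spherical coordinates. It converts a set of points (for example, DCRR centroids) to spherical coordinates about a chosen origin, bins them into concentric radial shells, and compares two normalized shell histograms with an L1 distance. The function names, the radial-only binning scheme, the shell parameters and the L1 comparison are all assumptions for the example and should not be read as the authors' spherical binning distance.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def to_spherical(p: Vec, origin: Vec) -> Tuple[float, float, float]:
    """Return (r, theta, phi) of p relative to origin.

    theta is the polar angle from the +z axis, phi the azimuth in the xy-plane.
    """
    x, y, z = (p[i] - origin[i] for i in range(3))
    r = math.sqrt(x * x + y * y + z * z)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r))) if r > 0 else 0.0
    phi = math.atan2(y, x)
    return r, theta, phi


def shell_histogram(points: Sequence[Vec], origin: Vec,
                    shell_width: float, n_shells: int) -> List[float]:
    """Fraction of points in each concentric shell (hypothetical binning).

    Points beyond the outermost shell are counted in the last bin.
    """
    if not points:
        raise ValueError("no points to bin")
    counts = [0] * n_shells
    for p in points:
        r, _, _ = to_spherical(p, origin)
        counts[min(int(r // shell_width), n_shells - 1)] += 1
    return [c / len(points) for c in counts]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two equal-length histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

The L1 distance between fixed-length vectors satisfies the triangle inequality, which is the property a metric index such as an M-tree relies on to prune its search.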


  1. V. M. Reyes and V. N. Sheth, In: Handbook of Research in Computational and Systems Biology: Interdisciplinary Approaches, L. A. Liu, D. Wei and Y. Qing (Eds.), Chap. 26 (2011, in press).
  2. D. Hoksza, Proc 6th Ann IEEE Conf Comp Intel Bioinf Comp Biol CIBCB’09, 263–270 (2009).
  3. P. Ciaccia, M. Patella, and P. Zezula, Proc 23rd Intl Conf Very Large Data Bases VLDB’97, 426–435 (1997).
  4. V. M. Reyes, Interdiscipl Sci: Comp Life Sci (2011, in press).
  5. O. Çamoglu, T. Kahveci, and A. K. Singh, Proc IEEE Comp Soc Conf Bioinf CSB’03, 148–158 (2003).

James DeFelice
Vicente M. Reyes

Biological Sciences Dept.
Sch. of Medical & Biological Sciences
College of Science, Rochester Institute of Technology
Rochester, NY 14623-5603 USA

Ph: (585) 475-4115