Integration of multiple biological features yields high confidence human protein interactome

Karagoz K., Sevimoglu T., ARĞA K. Y.

JOURNAL OF THEORETICAL BIOLOGY, vol.403, pp.85-96, 2016 (Journal Indexed in SCI)

  • Publication Type: Article / Full Article
  • Volume: 403
  • Publication Date: 2016
  • DOI: 10.1016/j.jtbi.2016.05.020
  • Pages: pp.85-96


The biological function of a protein is usually determined by its physical interactions with other proteins. Protein-protein interactions (PPIs) are identified through various experimental methods and are stored in curated databases. The existing PPI data are evidently noisy, and more reliable data need to be generated. Furthermore, many studies may require the selection of a set of PPIs at different confidence levels. Although different methodologies have been introduced to assign confidence scores to binary interactions, a highly reliable, almost complete PPI network of Homo sapiens has not yet been proposed. The quality and coverage of the human protein interactome need to be improved before it can be used in various disciplines, especially in biomedicine. In the present work, we propose an unsupervised statistical approach to assign confidence scores to PPIs of H. sapiens. To achieve this goal, PPI data were collected from six different databases, yielding a total of 295,288 non-redundant interactions between 15,950 proteins. The scoring system incorporated context information, derived from eight biological attributes, that was assigned to the PPIs. A high confidence network of 147,923 binary interactions between 13,213 proteins was obtained with scores greater than the cutoff value of 0.80, at which sensitivity, specificity, and coverage were 94.5%, 80.9%, and 82.8%, respectively. We compared the present scoring method with others for evaluation. Reducing the noise inherent in experimental PPIs via our scoring scheme increased the accuracy significantly. As demonstrated through the assessment of process and cancer subnetworks, this study allows researchers to construct and analyze context-specific networks from valid PPI sets and to easily obtain subnetworks around proteins of interest at a specified confidence level. (C) 2016 Elsevier Ltd. All rights reserved.
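
The cutoff-based selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes confidence scores have already been assigned, and the names `ScoredPPI`, `high_confidence`, `subnetwork`, and `sensitivity_specificity`, the toy scores, and the reference sets are all hypothetical. It only shows how a score cutoff (0.80 in the abstract) might be applied, how a subnetwork around proteins of interest might be extracted, and how sensitivity and specificity would be computed against reference sets of true and false interactions.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set, Tuple


class ScoredPPI(NamedTuple):
    """One undirected binary interaction with an already-assigned score."""
    protein_a: str
    protein_b: str
    score: float  # assumed to lie in [0, 1]


def pair(ppi: ScoredPPI) -> FrozenSet[str]:
    """Order-independent key for an interaction."""
    return frozenset((ppi.protein_a, ppi.protein_b))


def high_confidence(ppis: Iterable[ScoredPPI], cutoff: float = 0.80) -> List[ScoredPPI]:
    """Keep interactions whose score is strictly greater than the cutoff."""
    return [p for p in ppis if p.score > cutoff]


def subnetwork(ppis: Iterable[ScoredPPI], seeds: Set[str], cutoff: float = 0.80) -> List[ScoredPPI]:
    """High-confidence interactions that involve at least one seed protein."""
    return [p for p in high_confidence(ppis, cutoff)
            if p.protein_a in seeds or p.protein_b in seeds]


def sensitivity_specificity(ppis: Iterable[ScoredPPI],
                            positives: Set[FrozenSet[str]],
                            negatives: Set[FrozenSet[str]],
                            cutoff: float = 0.80) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    `positives` and `negatives` are hypothetical, non-empty reference sets.
    """
    predicted = {pair(p) for p in ppis if p.score > cutoff}
    tp = len(positives & predicted)
    fn = len(positives - predicted)
    fp = len(negatives & predicted)
    tn = len(negatives - predicted)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: scores are invented for illustration only.
toy = [
    ScoredPPI("TP53", "MDM2", 0.95),
    ScoredPPI("TP53", "EP300", 0.85),
    ScoredPPI("BRCA1", "BARD1", 0.91),
    ScoredPPI("TP53", "XYZ1", 0.40),
    ScoredPPI("ABC1", "XYZ2", 0.80),  # equal to the cutoff, so excluded
]

kept = high_confidence(toy)
tp53_net = subnetwork(toy, {"TP53"})
```

Raising or lowering `cutoff` trades coverage against reliability, which is the kind of confidence-level selection the abstract refers to.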