Protein-Protein Interaction Prediction via Structured Matrix Completion
Abstract
This paper considers how to computationally predict unknown protein-protein interactions (PPIs) given experimentally verified PPIs. Matrix completion, a popular machine learning technique for inferring the missing entries of a matrix, has been introduced to recover the missing interactions of an incomplete PPI network. The benefit of matrix completion is that it does not rely on negative samples, which are unavailable for PPI data yet crucial for existing supervised classification methods. However, current matrix completion solutions for recovering missing PPIs fail to capture important topological information about the underlying network, namely that it is sparse and has a skewed degree distribution. In this paper, we design a structured matrix completion method suited to capturing the skewed degree distribution of the underlying true PPI network. Theoretical analysis and extensive experimental results on known PPIs of three species (Plasmodium falciparum, Rattus norvegicus and Caenorhabditis elegans) show that our method outperforms related state-of-the-art PPI prediction approaches. We also evaluated the predicted networks of Plasmodium falciparum in terms of Gene Ontology (GO) similarity; our predicted network achieves the highest GO score. To further demonstrate the power of our algorithm in predicting new PPIs, we compare the Rattus norvegicus PPI networks predicted from a relatively old BioGrid release with much newer releases. Compared with other methods, our predicted networks are much more similar to the newer releases. Our code is available upon request and will be made publicly available on our website after the paper is published.
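To make the general idea of completing a PPI matrix concrete, here is a minimal sketch. It is not the structured method proposed in the paper: it assumes a plain truncated-SVD low-rank approximation as a stand-in, and the function name `low_rank_scores`, the toy network, and the chosen rank are all hypothetical. The point is only that a low-rank reconstruction of the observed adjacency matrix can assign a nonzero score to a hidden interaction without using any negative samples.

```python
import numpy as np

def low_rank_scores(adj, rank):
    """Score every protein pair with a rank-`rank` truncated SVD of `adj`.

    This is a generic low-rank baseline (an assumption for illustration),
    not the degree-aware structured completion described in the abstract.
    """
    u, s, vt = np.linalg.svd(adj, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    # Symmetrize, since interactions are undirected.
    return (approx + approx.T) / 2.0

# Toy network: proteins 0-3 form one fully connected group, 4-7 another.
n = 8
truth = np.zeros((n, n))
truth[:4, :4] = 1.0
truth[4:, 4:] = 1.0
np.fill_diagonal(truth, 0.0)

# Hide one true interaction to mimic an incomplete experimental network.
observed = truth.copy()
observed[0, 1] = observed[1, 0] = 0.0

scores = low_rank_scores(observed, rank=2)
hidden_score = scores[0, 1]  # pair that truly interacts but was hidden
cross_score = scores[0, 5]   # pair from different groups, no interaction

print(f"hidden pair (0, 1): {hidden_score:.3f}")
print(f"cross pair  (0, 5): {cross_score:.3f}")
```

In this toy case the hidden within-group pair scores clearly higher than the cross-group pair. A plain low-rank model like this one does not account for sparsity or a skewed degree distribution, which is the gap the abstract says the structured method targets.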
Similar resources
Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks
Background: Prediction of protein localization is among the most important issues in bioinformatics and is used to predict the location of proteins in cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied for the prediction of intracellular protein locations. These algorithms use the features extracted from pro...
Prediction of Coffee Effects in Rats with Healthy and NAFLD Conditions Based on Protein-Protein Interaction Network Analysis
Background and objectives: Non-alcoholic fatty liver disease (NAFLD) is a common liver condition. On the other hand, coffee consumption has shown promise for gastrointestinal diseases. Detection of the most valuable biomarkers of decaffeinated coffee treatment in healthy and non-alcoholic fatty liver disease conditions was the aim of the present study. Methods: ...
Predicting Protein-Protein Interactions from Multimodal Biological Data Sources via Nonnegative Matrix Tri-Factorization
Protein interactions are central to all the biological processes and structural scaffolds in living organisms, because they orchestrate a number of cellular processes such as metabolic pathways and immunological recognition. Several high-throughput methods, for example, yeast two-hybrid system and mass spectrometry method, can help determine protein interactions, which, however, suffer from hig...
Discovering Domains Mediating Protein Interactions
Background: Protein-protein interactions do not provide any direct information regarding the domains within the proteins that mediate the interactions. The majority of proteins are multi-domain proteins, and the interaction between them is often defined by pairs of their domains. Most former studies focus only on interacting domain pairs. However, they do not consider the in...
Graph Convolutional Matrix Completion
We consider matrix completion for recommender systems from the point of view of link prediction on graphs. Interaction data such as movie ratings can be represented by a bipartite user-item graph with labeled edges denoting observed ratings. Building on recent progress in deep learning on graph-structured data, we propose a graph auto-encoder framework based on differentiable message passing on...