Towards Confidence Estimation for Typed Protein-Protein Relation Extraction
Authors
Abstract
Systems that build on top of information extraction are typically challenged to extract knowledge that, while correct, is not yet well known. We hypothesize that a good confidence measure for relational information has the property that such interesting information is found between information extracted with very high confidence and information extracted with very low confidence. We discuss confidence estimation for protein-protein relation discovery in biomedical literature. As facts reported in papers take some time to be validated and recorded in biomedical databases, this task gives rise to large quantities of unknown but potentially true candidate relations. It is thus important to rank them based on supporting evidence rather than discard them. In this paper, we discuss this task and propose different approaches for confidence estimation, together with a pipeline to evaluate such methods. We show that the most straightforward approach, a combination of different confidence measures from pipeline modules, seems not to work well. We discuss this negative result and pinpoint potential future research directions.
Similar resources
Estimation of the Amount of Recombinant Protein A Secretion Using Fuzzy Regression
Abstract Background and purpose: Since protein A is considered an important protein from medical, medicinal, genetic engineering, and biotechnology point of view, the present study attempted to investigate and determine to what extent protein A is produced through regression, in addition to the production conditions of the protein. Thus, a figure was introduced as for the estimation of the a...
Literature mining of protein-residue associations with graph rules learned through distant supervision
BACKGROUND We propose a method for automatic extraction of protein-specific residue mentions from the biomedical literature. The method searches text for mentions of amino acids at specific sequence positions and attempts to correctly associate each mention with a protein also named in the text. The methods presented in this work will enable improved protein functional site extraction from arti...
Optimum conditions for protein extraction from tuna processing by-products using isoelectric solubilization and precipitation processes
The by-product from tuna processing is a potential source of edible protein. Therefore, it is very important to extract protein from such raw materials for human food. In this study the optimum pH for protein extraction from tuna by-products was optimized by using isoelectric solubilization and precipitation processes. The Response Surface Methodology (RSM) and the single factor model were used...
Ruminal Protein Degradation and Estimation of Rumen Microbial Protein Production
Animal agricultural production systems are major sources of nonpoint pollution affecting the quality of water sources. Nitrogen has been identified as the foremost source of nonpoint water pollution, and the potential negative impacts of N have become an area of public concern. Protein degradation from feed ingredients is an important factor influencing AA supply to the duodenum. Ruminal proteolysis...