A Hybrid Method for Chinese Entity Relation Extraction
Authors
Abstract
Entity relation extraction, an important task in information extraction, refers to extracting the relation between two entities from input text. Previous research has usually converted this problem into a sequence labeling problem and solved it with statistical models such as conditional random fields. This kind of method needs a large, high-quality training dataset, so it has two main drawbacks: 1) for some target relations, training instances are easy to obtain but of poor quality; 2) for other relations, it is hard to obtain enough training data automatically. In this paper, we propose a hybrid method to overcome these shortcomings. To address the first drawback, we design an improved candidate sentence selection method that finds high-quality training instances, which we then use to train our extraction model. To address the second drawback, we write heuristic rules to extract entity relations. In our experiments, the candidate sentence selection method improves the average F1 value by 78.53%, and some detailed suggestions are given. We submitted 364944 triples with a precision of 46.3% to the Sougou Chinese entity relation extraction competition and ranked 4th on the platform.
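The two halves of such a hybrid pipeline can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the length and distance thresholds, the `spouse` relation label, and the regular expression are all hypothetical choices made for illustration. The first function keeps only sentences that mention both entities of a known seed triple, are not too long, and place the entities close together; the second applies one hand-written rule to a relation for which no training data is assumed to exist.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def select_candidates(sentences: List[str], seed: Triple,
                      max_len: int = 60, max_gap: int = 20) -> List[str]:
    """Keep sentences that look like good training instances for a seed triple.

    Assumed quality filters (hypothetical, not from the paper):
    both entities occur, the sentence is at most max_len characters,
    and at most max_gap characters separate the two entity mentions.
    """
    subj, _, obj = seed
    kept = []
    for sent in sentences:
        i, j = sent.find(subj), sent.find(obj)
        if i < 0 or j < 0:
            continue  # one of the entities is missing
        if len(sent) > max_len:
            continue  # long sentences tend to be noisy
        # Characters between the end of the first mention and the start of the second.
        gap = j - (i + len(subj)) if i < j else i - (j + len(obj))
        if gap > max_gap:
            continue
        kept.append(sent)
    return kept


# Hypothetical rule for a "spouse" relation: "<X>的妻子是<Y>" or "<X>的丈夫是<Y>".
SPOUSE_RULE = re.compile(r"(?P<s>\w+)的(?:妻子|丈夫)是(?P<o>\w+)")


def extract_by_rule(sentence: str) -> List[Triple]:
    """Apply the hand-written pattern and return any triples it matches."""
    return [(m.group("s"), "spouse", m.group("o"))
            for m in SPOUSE_RULE.finditer(sentence)]


sentences = [
    "张三的妻子是李四。",
    "张三今天去了北京。",
    "李四" + "很" * 70 + "张三",
]
candidates = select_candidates(sentences, ("张三", "spouse", "李四"))
triples = extract_by_rule(sentences[0])
```

In a full system the selected candidates would feed a statistical sequence labeler such as a conditional random field, while rules of the second kind would cover relations with too little data to train on.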
Related resources
Chinese Entity Relation Extraction Based on Word Co-occurrence
Chinese entity relation extraction is a part of entity relation extraction. Based on entity relation extraction technology and the features of Chinese news corpora, this paper proposes a novel method for Chinese entity relation extraction. The method, named WCORE (word co-occurrence relation extraction), first measures the semantic similarity by word co-occurrence and then adopts pattern m...
Improved-Edit-Distance Kernel for Chinese Relation Extraction
In this paper, a novel kernel-based method is presented for the problem of relation extraction between named entities from Chinese texts. The kernel is defined over the original Chinese string representations around particular entities. As a kernel function, the Improved-Edit-Distance (IED) is used to calculate the similarity between two Chinese strings. By employing the Voted Perceptron and Su...
Tree Kernel-based Relation Extraction with Various Entity-Related Features
This paper proposes a convolution tree kernel-based approach for relation extraction where parse trees are expanded with various entity-related features, such as entity type, subtype, and mention level. Our study indicates that not only can our method effectively capture both syntactic structure and entity information of relation instances in a single tree kernel, but also can avoid the difficu...
Chunk Parsing and Entity Relation Extracting to Chinese Text by Using Conditional Random Fields Model
Currently, large amounts of information exist on Web sites and in various digital media. Most of it is in natural language, which is easy to browse but difficult for computers to understand. Chunk parsing and entity relation extraction are important for understanding the semantics of information in natural language processing. Chunk analysis is a shallow parsing method, and entity relation ext...
A Novel Feature-based Approach to Chinese Entity Relation Extraction
Relation extraction is the task of finding semantic relations between two entities from text. In this paper, we propose a novel feature-based Chinese relation extraction approach that explicitly defines and explores nine positional structures between two entities. We also suggest some correction and inference mechanisms based on relation hierarchy and co-reference information etc. The approach ...