Unsupervised Named Entity Classification Models and their Ensembles
Authors
Abstract
This paper proposes an unsupervised learning model for classifying named entities. The model uses a training set built automatically from a small-scale named entity dictionary and an unlabeled corpus, which makes it possible to classify named entities without the cost of building a large hand-tagged training corpus or writing a large number of rules. The model uses an ensemble of three different learning methods and repeats the learning with new training examples generated through the ensemble learning. The ensemble of learning methods yields a better result than any individual learning method. Experiments on Korean news articles show 73.16% precision and 72.98% recall.
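The loop described in the abstract (train several learners on automatically labeled seed examples, let the ensemble label new examples, then retrain) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not name the three learning methods, so the three learners here are hypothetical stand-ins, and the majority-vote threshold, the feature strings, and the toy data are all invented for the example.

```python
from collections import Counter, defaultdict

# All three learners below are hypothetical stand-ins; the abstract does not
# say which three learning methods the paper actually combines.

def train_feature_votes(examples):
    """Learner 1: sum per-feature label counts over all features of an item."""
    counts = defaultdict(Counter)
    for feats, label in examples:
        for f in feats:
            counts[f][label] += 1
    def predict(feats):
        total = Counter()
        for f in sorted(feats):  # sorted for deterministic tie-breaking
            total.update(counts.get(f, {}))
        return total.most_common(1)[0][0] if total else None
    return predict

def train_nearest_neighbor(examples):
    """Learner 2: label of the training item with the largest Jaccard overlap."""
    examples = list(examples)
    def predict(feats):
        best_label, best_score = None, 0.0
        for train_feats, label in examples:
            union = feats | train_feats
            score = len(feats & train_feats) / len(union) if union else 0.0
            if score > best_score:
                best_label, best_score = label, score
        return best_label
    return predict

def train_best_single_feature(examples):
    """Learner 3: decide by the single most frequently seen known feature."""
    counts = defaultdict(Counter)
    for feats, label in examples:
        for f in feats:
            counts[f][label] += 1
    def predict(feats):
        best = None  # (count, label)
        for f in sorted(feats):
            if f in counts:
                label, n = counts[f].most_common(1)[0]
                if best is None or n > best[0]:
                    best = (n, label)
        return best[1] if best else None
    return predict

def ensemble_self_train(seed_examples, unlabeled, trainers, rounds=3, min_votes=2):
    """Repeat: train all learners, add items that min_votes learners agree on."""
    training = list(seed_examples)
    pool = list(unlabeled)
    for _ in range(rounds):
        predictors = [train(list(training)) for train in trainers]
        remaining = []
        for feats in pool:
            votes = Counter(p(feats) for p in predictors)
            votes.pop(None, None)  # abstentions do not count as votes
            label, n = votes.most_common(1)[0] if votes else (None, 0)
            if n >= min_votes:
                training.append((feats, label))  # new training example
            else:
                remaining.append(feats)
        if len(remaining) == len(pool):  # nothing new was labeled, so stop
            break
        pool = remaining
    return training, pool

# Toy seed set, standing in for contexts of dictionary entries found in a corpus.
seed = [
    (frozenset({"prev=in", "next=city"}), "LOC"),
    (frozenset({"prev=in", "next=province"}), "LOC"),
    (frozenset({"prev=at", "next=corp"}), "ORG"),
    (frozenset({"prev=by", "next=corp"}), "ORG"),
]
unlabeled = [
    frozenset({"prev=in", "next=city", "suffix=si"}),
    frozenset({"prev=by", "next=corp", "suffix=sa"}),
    frozenset({"suffix=si", "prev=near"}),
    frozenset({"prev=of"}),
]
trainers = [train_feature_votes, train_nearest_neighbor, train_best_single_feature]
training, leftover = ensemble_self_train(seed, unlabeled, trainers)
```

In this toy run the third unlabeled item shares no feature with the seed set, so it can only be labeled in the second round, after the first round has added an example carrying `suffix=si`. That is the effect the repeated learning step is meant to produce; the last item never receives a label and stays in `leftover`.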
Similar resources
Meta-Learning Orthographic and Contextual Models for Language Independent Named Entity Recognition
This paper presents a named entity classification system that utilises both orthographic and contextual information. The random subspace method was employed to generate and refine attribute models. Supervised and unsupervised learning techniques were used in the recombination of models to produce the final results.
Structured Generative Models for Unsupervised Named-Entity Clustering
We describe a generative model for clustering named entities which also models named entity internal structure, clustering related words by role. The model is entirely unsupervised; it uses features from the named entity itself and its syntactic context, and coreference information from an unsupervised pronoun resolver. The model scores 86% on the MUC-7 named-entity dataset. To our knowledge, t...
Russian Named Entities Recognition and Classification Using Distributed Word and Phrase Representations
The paper presents results on Russian named entities classification and equivalent named entities retrieval using word and phrase representations. It is shown that a word or an expression’s context vector is an efficient feature to be used for predicting the type of a named entity. Distributed word representations are now claimed (and on a reasonable basis) to be one of the most promising distr...
Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination
Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...
Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning
Domain adaptation is a powerful technique when a large amount of labeled data with similar attributes is available in different domains. In real-world applications, there is a huge amount of data, but most of it is unlabeled. Domain adaptation is effective in image classification, where it is expensive and time-consuming to obtain adequate labeled data. We propose a novel method named DALRRL, which consists of deep ...
Publication date: 2002