Unsupervised Named Entity Classification Models and their Ensembles

Authors

  • Jae-Ho Kim
  • In-Ho Kang
  • Key-Sun Choi
Abstract

This paper proposes an unsupervised learning model for classifying named entities. The model uses a training set built automatically from a small-scale named entity dictionary and an unlabeled corpus. This lets us classify named entities without the cost of building a large hand-tagged training corpus or a large set of rules. The model uses an ensemble of three different learning methods and repeats the learning with new training examples generated through the ensemble. The ensemble of learning methods gives a better result than any individual learning method. Experiments show 73.16% precision and 72.98% recall on Korean news articles.
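
The loop described in the abstract (seed labels from a small dictionary, train three learners, add the examples they agree on, repeat) can be illustrated with a minimal sketch. The abstract does not name the three learning methods or the combination rule, so everything here is an assumption made for illustration: the three toy feature-based learners, the two-out-of-three majority vote, and the names `feature_learner`, `majority_vote`, `seed_labels`, and `self_train` are hypothetical, and the toy data is invented.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label at least two predictions agree on, else None."""
    label, count = Counter(predictions).most_common(1)[0]
    return label if count >= 2 else None


def seed_labels(corpus, dictionary):
    """Label entities found in the small dictionary; the rest stay unlabeled."""
    labeled = {e: dictionary[e] for e in corpus if e in dictionary}
    unlabeled = [e for e in corpus if e not in dictionary]
    return labeled, unlabeled


def feature_learner(feature):
    """Hypothetical toy learner: remembers the most common label per feature value."""
    def fit(labeled):
        votes = {}
        for entity, label in labeled.items():
            votes.setdefault(feature(entity), Counter())[label] += 1
        table = {f: c.most_common(1)[0][0] for f, c in votes.items()}
        # The returned model abstains (returns None) on unseen feature values.
        return lambda entity: table.get(feature(entity))
    return fit


def self_train(corpus, dictionary, learners, rounds=3):
    """Repeat ensemble learning, adding agreed-upon examples to the training set."""
    labeled, unlabeled = seed_labels(corpus, dictionary)
    for _ in range(rounds):
        models = [fit(labeled) for fit in learners]
        still_unlabeled = []
        for entity in unlabeled:
            label = majority_vote([model(entity) for model in models])
            if label is None:
                still_unlabeled.append(entity)
            else:
                labeled[entity] = label  # new training example for the next round
        if len(still_unlabeled) == len(unlabeled):
            break  # no new examples were generated, so stop early
        unlabeled = still_unlabeled
    return labeled


# Invented toy data; the three features stand in for three learning methods.
learners = [
    feature_learner(lambda e: e.split()[-1]),  # last token
    feature_learner(lambda e: e.split()[0]),   # first token
    feature_learner(lambda e: e[-3:]),         # last three characters
]
dictionary = {"Seoul City": "LOC", "Samsung Corp": "ORG"}
corpus = ["Seoul City", "Samsung Corp", "Busan City", "Hyundai Corp"]
result = self_train(corpus, dictionary, learners)
print(result)
```

Requiring agreement between at least two learners is one simple way to keep noisy automatic labels out of the training set; the paper may combine its learners differently.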

Similar resources

Meta-Learning Orthographic and Contextual Models for Language Independent Named Entity Recognition

This paper presents a named entity classification system that utilises both orthographic and contextual information. The random subspace method was employed to generate and refine attribute models. Supervised and unsupervised learning techniques were used in the recombination of models to produce the final results.

Structured Generative Models for Unsupervised Named-Entity Clustering

We describe a generative model for clustering named entities which also models named entity internal structure, clustering related words by role. The model is entirely unsupervised; it uses features from the named entity itself and its syntactic context, and coreference information from an unsupervised pronoun resolver. The model scores 86% on the MUC-7 named-entity dataset. To our knowledge, t...

Russian Named Entities Recognition and Classification Using Distributed Word and Phrase Representations

The paper presents results on Russian named entities classification and equivalent named entities retrieval using word and phrase representations. It is shown that a word or an expression’s context vector is an efficient feature to be used for predicting the type of a named entity. Distributed word representations are now claimed (and on a reasonable basis) to be one of the most promising distr...

Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination

Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...

Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning

Domain adaptation is a powerful technique given a large amount of labeled data from similar attributes in different domains. In real-world applications, there is a huge amount of data, but most of it is unlabeled. It is effective in image classification, where it is expensive and time-consuming to obtain adequate labeled data. We propose a novel method named DALRRL, which consists of deep ...

Publication date: 2002