Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing

Authors

  • Ruihong Huang
  • Ellen Riloff
Abstract

This research explores the idea of inducing domain-specific semantic class taggers using only a domain-specific text collection and seed words. The learning process begins by inducing a classifier that only has access to contextual features, forcing it to generalize beyond the seeds. The contextual classifier then labels new instances, to expand and diversify the training set. Next, a cross-category bootstrapping process simultaneously trains a suite of classifiers for multiple semantic classes. The positive instances for one class are used as negative instances for the others in an iterative bootstrapping cycle. We also explore a one-semantic-class-per-discourse heuristic, and use the classifiers to dynamically create semantic features. We evaluate our approach by inducing six semantic taggers from a collection of veterinary medicine message board posts.
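The cross-category bootstrapping cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a toy count-based scorer stands in for the trained classifiers, every token is treated as a candidate instance, and the names `context_features`, `score`, `bootstrap`, `window`, and `threshold`, as well as the example sentences, are hypothetical. It shows only two of the ideas: the scorer sees contextual features and never the target word, and the instances labeled for one class count as negative evidence for the others.

```python
from collections import Counter


def context_features(tokens, i, window=2):
    """Features from neighbouring words only; the target word is hidden."""
    feats = set()
    for off in range(-window, window + 1):
        j = i + off
        if off != 0 and 0 <= j < len(tokens):
            feats.add(f"{off:+d}:{tokens[j]}")
    return feats


def score(feats, pos_counts, neg_counts):
    """Toy stand-in for a classifier: positive votes minus negative votes."""
    return sum(pos_counts[f] - neg_counts[f] for f in feats)


def bootstrap(sentences, seeds, iterations=3, threshold=1):
    # One instance per token: (word, contextual feature set).
    instances = [(toks[i], context_features(toks, i))
                 for toks in sentences for i in range(len(toks))]
    # Seed words provide the initial labeled instances for each class.
    labeled = {c: {k for k, (w, _) in enumerate(instances) if w in words}
               for c, words in seeds.items()}
    for _ in range(iterations):
        # Context counts observed around each class's labeled instances.
        counts = {c: Counter(f for k in idx for f in instances[k][1])
                  for c, idx in labeled.items()}
        # Positives of the other classes act as negatives for class c.
        neg = {c: sum((counts[o] for o in labeled if o != c), Counter())
               for c in labeled}
        taken = set().union(*labeled.values())
        new = {c: set() for c in labeled}
        for k, (_, feats) in enumerate(instances):
            if k in taken:
                continue
            scores = {c: score(feats, counts[c], neg[c]) for c in labeled}
            best = max(scores, key=scores.get)
            if scores[best] >= threshold:
                new[best].add(k)
        if not any(new.values()):
            break  # nothing new was labeled, so the cycle has converged
        for c in labeled:
            labeled[c] |= new[c]
    return {c: sorted({instances[k][0] for k in idx})
            for c, idx in labeled.items()}


# Hypothetical veterinary-style sentences and seed words, for illustration.
sentences = [
    "my dog has severe diarrhea".split(),
    "my cat has severe vomiting".split(),
    "we gave the dog some metronidazole".split(),
    "we gave the cat some prednisone".split(),
    "my ferret has severe lethargy".split(),
]
seeds = {"ANIMAL": {"dog", "cat"}, "SYMPTOM": {"diarrhea", "vomiting"}}
result = bootstrap(sentences, seeds)
print(result)
```

In this toy run the unseen words "ferret" and "lethargy" are picked up purely from their contexts, while function words such as "my" or "has" stay unlabeled because their contexts never score above the threshold.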


Related resources

Building Domain-Specific Taggers without Annotated (Domain) Data

Part of speech tagging is a fundamental component in many NLP systems. When taggers developed in one domain are used in another domain, the performance can degrade considerably. We present a method for developing taggers for new domains without requiring POS annotated text in the new domain. Our method involves using raw domain text and identifying related words to form a domain specific lexico...


THE ROPER-SUFFRIDGE EXTENSION OPERATORS ON THE CLASS OF STRONG AND ALMOST SPIRALLIKE MAPPINGS OF TYPE $\beta$ AND ORDER $\alpha$

Let $\mathbb{C}^n$ be the space of $n$ complex variables. Let $\Omega_{n,p_2,\ldots,p_n}$ be a complete Reinhardt domain on $\mathbb{C}^n$. The Minkowski functional on the complete Reinhardt domain $\Omega_{n,p_2,\ldots,p_n}$ is denoted by $\rho(z)$. The concept of spirallike mapping of type $\beta$ and order $\alpha$ is defined. So, the concept of the strong and almost spirallike mappings o...


Pseudo-almost valuation rings

The aim of this paper is to generalize the notion of pseudo-almost valuation domains to arbitrary commutative rings. It is shown that the classes of chained rings and pseudo-valuation rings are properly contained in the class of pseudo-almost valuation rings; also the class of pseudo-almost valuation rings is properly contained in the class of quasi-local rings with linearly ordere...


Rapid Adaptation of POS Tagging for Domain Specific Uses

Part-of-speech (POS) tagging is a fundamental component for performing natural language tasks such as parsing, information extraction, and question answering. When POS taggers are trained in one domain and applied in significantly different domains, their performance can degrade dramatically. We present a methodology for rapid adaptation of POS taggers to new domains. Our technique is unsupervi...


Inducing Classes of Terms from Text

This paper describes a clustering method for organizing in semantic classes a list of terms. The experiments were made using a POS annotated corpus, the ACL Anthology, which consists of technical articles in the field of Computational Linguistics. The method, mainly based on some assumptions of Formal Concept Analysis, consists in building bi-dimensional clusters of both terms and their lexico-...




Publication date: 2010