Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing
Abstract
This research explores the idea of inducing domain-specific semantic class taggers using only a domain-specific text collection and seed words. The learning process begins by inducing a classifier that only has access to contextual features, forcing it to generalize beyond the seeds. The contextual classifier then labels new instances, to expand and diversify the training set. Next, a cross-category bootstrapping process simultaneously trains a suite of classifiers for multiple semantic classes. The positive instances for one class are used as negative instances for the others in an iterative bootstrapping cycle. We also explore a one-semantic-class-per-discourse heuristic, and use the classifiers to dynamically create semantic features. We evaluate our approach by inducing six semantic taggers from a collection of veterinary medicine message board posts.
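The cross-category bootstrapping cycle described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the count-based scorer stands in for whatever classifier is actually trained, and the names `train`, `score`, `bootstrap`, `threshold`, and the example data are all hypothetical. It shows only two ideas from the abstract: the classifiers see contextual features rather than the word itself, and the positive instances of one class serve as negative instances for the others.

```python
from collections import Counter

def train(pos, neg):
    """Toy context-only model: +1 per feature seen in a positive instance,
    -1 per feature seen in a negative instance (a stand-in classifier)."""
    weights = Counter()
    for feats in pos:
        weights.update(feats)
    for feats in neg:
        weights.subtract(feats)
    return weights

def score(weights, feats):
    # Features never seen in training contribute 0.
    return sum(weights[f] for f in feats)

def bootstrap(instances, seeds, iterations=5, threshold=1):
    """instances: list of (word, context_features); seeds: {class: set of words}.
    Returns {instance_index: class} for every instance that gets labeled."""
    # Step 1: label instances whose head word is a seed.
    labels = {i: cls
              for i, (word, _) in enumerate(instances)
              for cls, words in seeds.items() if word in words}
    for _ in range(iterations):
        # Step 2: train one model per class; other classes' positives are negatives.
        models = {}
        for cls in seeds:
            pos = [instances[i][1] for i, c in labels.items() if c == cls]
            neg = [instances[i][1] for i, c in labels.items() if c != cls]
            models[cls] = train(pos, neg)
        # Step 3: label new instances using contextual features only.
        new = {}
        for i, (_, feats) in enumerate(instances):
            if i in labels:
                continue
            scores = {cls: score(m, feats) for cls, m in models.items()}
            best = max(scores, key=scores.get)
            if scores[best] >= threshold:
                new[i] = best
        if not new:  # nothing confident left to add
            break
        labels.update(new)
    return labels

# Hypothetical data: context features are made-up pattern strings.
seeds = {"ANIMAL": {"dog"}, "SYMPTOM": {"vomiting"}}
instances = [
    ("dog", ("my_X", "X_is")),
    ("vomiting", ("has_X", "X_since")),
    ("cat", ("my_X", "X_is")),
    ("diarrhea", ("has_X", "X_since")),
    ("table", ("the_X",)),
]
labels = bootstrap(instances, seeds)
tagged = {instances[i][0]: cls for i, cls in labels.items()}
print(tagged)
```

In this toy run, "cat" and "diarrhea" are picked up because they share contexts with the seeds, while "table" stays unlabeled because none of its context features carry any weight. The one-semantic-class-per-discourse heuristic and the dynamically created semantic features mentioned in the abstract are not modeled here.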
Similar resources
Building Domain-Specific Taggers without Annotated (Domain) Data
Part of speech tagging is a fundamental component in many NLP systems. When taggers developed in one domain are used in another domain, the performance can degrade considerably. We present a method for developing taggers for new domains without requiring POS annotated text in the new domain. Our method involves using raw domain text and identifying related words to form a domain specific lexico...
THE ROPER-SUFFRIDGE EXTENSION OPERATORS ON THE CLASS OF STRONG AND ALMOST SPIRALLIKE MAPPINGS OF TYPE $\beta$ AND ORDER $\alpha$
Let $\mathbb{C}^n$ be the space of $n$ complex variables. Let $\Omega_{n,p_2,\ldots,p_n}$ be a complete Reinhardt domain in $\mathbb{C}^n$. The Minkowski functional on the complete Reinhardt domain $\Omega_{n,p_2,\ldots,p_n}$ is denoted by $\rho(z)$. The concept of spirallike mapping of type $\beta$ and order $\alpha$ is defined. So, the concept of the strong and almost spirallike mappings o...
Pseudo-almost valuation rings
The aim of this paper is to generalize the notion of pseudo-almost valuation domains to arbitrary commutative rings. It is shown that the classes of chained rings and pseudo-valuation rings are properly contained in the class of pseudo-almost valuation rings; also the class of pseudo-almost valuation rings is properly contained in the class of quasi-local rings with linearly ordere...
Rapid Adaptation of POS Tagging for Domain Specific Uses
Part-of-speech (POS) tagging is a fundamental component for performing natural language tasks such as parsing, information extraction, and question answering. When POS taggers are trained in one domain and applied in significantly different domains, their performance can degrade dramatically. We present a methodology for rapid adaptation of POS taggers to new domains. Our technique is unsupervi...
Inducing Classes of Terms from Text
This paper describes a clustering method for organizing in semantic classes a list of terms. The experiments were made using a POS annotated corpus, the ACL Anthology, which consists of technical articles in the field of Computational Linguistics. The method, mainly based on some assumptions of Formal Concept Analysis, consists in building bi-dimensional clusters of both terms and their lexico-...