Weakly Supervised Natural Language Learning Without Redundant Views

Authors

  • Vincent Ng
  • Claire Cardie
Abstract

We investigate single-view algorithms as an alternative to multi-view algorithms for weakly supervised learning for natural language processing tasks without a natural feature split. In particular, we apply co-training, self-training, and EM to one such task and find that both self-training and FS-EM, a new variation of EM that incorporates feature selection, outperform co-training and are comparatively less sensitive to parameter changes.
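The single-view self-training loop contrasted with co-training above can be sketched in a few lines. This is a minimal illustration, not the paper's actual setup: the one-feature nearest-centroid "classifier" and the toy data are hypothetical stand-ins for the real learner and features.

```python
def centroid_fit(points, labels):
    """Fit a one-feature nearest-centroid model: one mean per class."""
    means = {}
    for c in set(labels):
        vals = [p for p, l in zip(points, labels) if l == c]
        means[c] = sum(vals) / len(vals)
    return means

def centroid_predict(means, x):
    """Predict the nearest class; confidence is the margin between
    the two closest class centroids."""
    d = sorted((abs(x - m), c) for c, m in means.items())
    conf = d[1][0] - d[0][0] if len(d) > 1 else 1.0
    return d[0][1], conf

def self_train(labeled, unlabeled, threshold=1.0, rounds=5):
    """Self-training: a single classifier repeatedly labels its own
    confident predictions and retrains on the grown labeled set."""
    points = [p for p, _ in labeled]
    labels = [l for _, l in labeled]
    pool = list(unlabeled)
    for _ in range(rounds):
        means = centroid_fit(points, labels)
        kept = []
        for x in pool:
            c, conf = centroid_predict(means, x)
            if conf >= threshold:
                points.append(x)
                labels.append(c)
            else:
                kept.append(x)
        if len(kept) == len(pool):  # nothing confident enough; stop
            break
        pool = kept
    return centroid_fit(points, labels)

# Toy run: two seed examples, four unlabeled points.
model = self_train([(0.0, "a"), (10.0, "b")], [1.0, 2.0, 8.5, 9.0])
```

Unlike co-training, no feature split is assumed: the same classifier both labels and consumes the new examples, which is exactly what makes it applicable to tasks without redundant views.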


Similar Articles

Limitations of Co-Training for Natural Language Learning from Large Datasets

Co-Training is a weakly supervised learning paradigm in which the redundancy of the learning task is captured by training two classifiers using separate views of the same data. This enables bootstrapping from a small set of labeled training data via a large set of unlabeled data. This study examines the learning behavior of co-training on natural language processing tasks that typically require...
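The two-view paradigm described above can be sketched as follows. This is a toy illustration under stated assumptions, not the study's actual experiment: each example carries two hypothetical feature views, and a one-feature nearest-centroid model stands in for each view's classifier.

```python
def fit_view(examples, labels, view):
    """One-feature nearest-centroid model over a single view (column)."""
    means = {}
    for c in set(labels):
        vals = [e[view] for e, l in zip(examples, labels) if l == c]
        means[c] = sum(vals) / len(vals)
    return means

def predict(means, x):
    """Nearest centroid; confidence is the margin between the two
    closest class centroids."""
    d = sorted((abs(x - m), c) for c, m in means.items())
    return d[0][1], (d[1][0] - d[0][0] if len(d) > 1 else 1.0)

def co_train(labeled, unlabeled, rounds=5):
    """Co-training: each round, each view's classifier labels its
    single most confident unlabeled example for the shared set."""
    examples = [e for e, _ in labeled]
    labels = [l for _, l in labeled]
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            model = fit_view(examples, labels, view)
            scored = [(predict(model, e[view]), e) for e in pool]
            (best_label, _), best_ex = max(scored, key=lambda s: s[0][1])
            examples.append(best_ex)
            labels.append(best_label)
            pool.remove(best_ex)
    return fit_view(examples, labels, 0), fit_view(examples, labels, 1)

# Toy run: two seed examples, two unlabeled two-view examples.
m0, m1 = co_train([((0.0, 0.0), "a"), ((10.0, 10.0), "b")],
                  [(1.0, 2.0), (9.0, 8.0)])
```

The key assumption is that the two views are individually sufficient and (conditionally) redundant, so each classifier's confident labels genuinely inform the other; the limitations paper above examines what happens when tasks lack such a split.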


Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World

This paper introduces Logical Semantics with Perception (LSP), a model for grounded language acquisition that learns to map natural language statements to their referents in a physical environment. For example, given an image, LSP can map the statement “blue mug on the table” to the set of image segments showing blue mugs on tables. LSP learns physical representations for both categorical (“blu...


Language Segmentation

Language segmentation consists in finding the boundaries where one language ends and another begins in a text written in more than one language. This is important for all natural language processing tasks. The problem can be solved by training language models on language data. However, in the case of low- or no-resource languages, this is problematic. I therefore investigate whether unsuper...


Learning Embedded Discourse Mechanisms for Information Extraction

We address the problem of learning discourse-level merging strategies within the context of a natural language information extraction system. While we report on work currently in progress, results of preliminary experiments employing classification tree learning, maximum entropy modeling, and clustering methods are described. We also discuss motivations for moving away from supervised methods a...


Knowledge Aided Consistency for Weakly Supervised Phrase Grounding

Given a natural language query, a phrase grounding system aims to localize the mentioned objects in an image. In the weakly supervised scenario, the mapping between image regions (i.e., proposals) and language is not available in the training set. Previous methods address this deficiency by training a grounding system via learning to reconstruct language information contained in input queries from predicte...




Journal title:

Volume   Issue

Pages  -

Publication date: 2003