Search results for: test semi

Number of results: 943546

Journal: JCP 2011
Hai-jiang He

In this paper, we propose a co-ranking algorithm that trains listwise ranking functions using unlabeled data together with a small amount of labeled data. The co-ranking algorithm is based on the co-training paradigm, which is a very common scheme in the semi-supervised classification framework. First, we use two listwise ranking methods to construct base ranker and assistant ranker, respect...

2007
Bernhard Pfahringer Claire Leschi Peter Reutemann

Domains like text classification can easily supply large amounts of unlabeled data, but labeling itself is expensive. Semi-supervised learning tries to exploit this abundance of unlabeled training data to improve classification. Unfortunately, most of the theoretically well-founded algorithms that have been described in recent years are cubic or worse in the total number of both labeled and unla...

Journal: EURASIP J. Audio, Speech and Music Processing 2009
Christophe Lévy Georges Linarès Jean-François Bonastre

Speech recognition applications are known to require a significant amount of resources. However, embedded speech recognition allows only a few KB of memory, a few MIPS, and a small amount of training data. In order to fit the resource constraints of embedded applications, an approach based on a semi-continuous HMM system using state-independent acoustic modelling is proposed. A transformation is ...

Journal: CoRR 2017
Gal Hyams Daniel Greenfeld Dor Bank

It is well known that for some tasks, labeled data sets may be hard to gather. Self-training, or pseudo-labeling, tackles the problem of having insufficient training data. In the self-training scheme, the classifier is first trained on a limited, labeled dataset, and after that, it is trained on an additional, unlabeled dataset, using its own predictions as labels, provided those predictions ar...
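The self-training loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract is cut off before the acceptance condition is fully stated, so the confidence threshold used here is an assumption, as are the nearest-centroid base classifier, the margin-based confidence score, and every function and parameter name (`self_train`, `threshold`, `max_rounds`).

```python
import math


def _distance(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroids(points, labels):
    """Return the mean point of each class as {label: [coords]}."""
    sums = {}
    for p, y in zip(points, labels):
        acc, n = sums.get(y, ([0.0] * len(p), 0))
        sums[y] = ([a + b for a, b in zip(acc, p)], n + 1)
    return {y: [a / n for a in acc] for y, (acc, n) in sums.items()}


def predict(cents, p):
    """Return (label, confidence) for point p; needs at least two classes.

    Confidence is an assumed margin score: 1 - d_nearest / d_second_nearest,
    so it is 0 when the two closest centroids are equally far away.
    """
    ranked = sorted((_distance(c, p), y) for y, c in cents.items())
    (d1, y1), (d2, _) = ranked[0], ranked[1]
    conf = 0.0 if d2 == 0 else 1.0 - d1 / d2
    return y1, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.5, max_rounds=10):
    """Hypothetical self-training loop with a nearest-centroid classifier.

    Each round fits the classifier on the current labeled set, pseudo-labels
    the unlabeled points whose confidence reaches `threshold` (an assumed
    acceptance rule), and moves them into the labeled set.
    Returns the final centroids and the points that were never accepted.
    """
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        cents = centroids(X, y)
        remaining = []
        added = False
        for p in pool:
            label, conf = predict(cents, p)
            if conf >= threshold:
                X.append(p)
                y.append(label)
                added = True
            else:
                remaining.append(p)
        pool = remaining
        if not added:  # nothing confident enough: stop early
            break
    return centroids(X, y), pool


model, leftover = self_train(
    X_lab=[(0, 0), (10, 10)],
    y_lab=[0, 1],
    X_unlab=[(1, 1), (9, 9), (5, 5)],
)
```

In this toy run the points `(1, 1)` and `(9, 9)` are pseudo-labeled in the first round, while `(5, 5)` is equidistant from both centroids and stays in the unlabeled pool.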

2011
Shasha Liao Ralph Grishman

Annotating training data for event extraction is tedious and labor-intensive. Most current event extraction tasks rely on hundreds of annotated documents, but this is often not enough. In this paper, we present a novel self-training strategy, which uses Information Retrieval (IR) to collect a cluster of related documents as the resource for bootstrapping. Also, based on the particular character...

2009
Weiwei Du Kiichi Urahama

We present a spectral mapping technique for semi-supervised pattern classification. Importance scores of features are first evaluated with a semi-supervised feature selection algorithm by Zhao et al. Training data are then embedded into a low-dimensional space with a spectral mapping derived from the selected and weighted feature vectors with which test data are classified by the nearest neigh...

2014
Tieyun Qian Bing Liu Li Chen Zhiyong Peng

Authorship attribution (AA) aims to identify the authors of a set of documents. Traditional studies in this area often assume that a large set of labeled documents is available for training. However, in real life, it is often difficult or expensive to collect a large set of labeled data. For example, in the online review domain, most reviewers (authors) only write a few reviews, whic...

2009
Clay Woolam Mohammad M. Masud Latifur Khan

This paper outlines a data stream classification technique that addresses the problem of insufficient and biased labeled data. It is practical to assume that only a small fraction of instances in the stream are labeled. A more practical assumption would be that the labeled data may not be independently distributed among all training documents. How can we ensure that a good classification model ...

2005
André Hernich Arfst Nickelsen

A partial information algorithm for a language A computes, for some fixed m, for input words x_1, ..., x_m a set of bitstrings containing χ_A(x_1, ..., x_m). E.g., p-selective, approximable, and easily countable languages are defined by the existence of polynomial-time partial information algorithms of specific type. Self-reducible languages, for different types of self-reductions, form subcla...

2014
Sameh Khamis Christoph H. Lampert

In this work we introduce a new approach to co-classification, i.e. the task of jointly classifying multiple, otherwise independent, data samples. The method we present, named CoConut, is based on the idea of adding a regularizer in the label space to encode certain priors on the resulting labelings. A regularizer that encourages labelings that are smooth across the test set, for instance, can ...
