Estimating the strength of unlabeled information during semi-supervised learning

Authors

  • Brenden M. Lake
  • James L. McClelland
Abstract

Semi-supervised category learning is when participants make classification judgements while receiving feedback about the right answers on some trials (labeled stimuli) but not others (unlabeled stimuli). Sporadic feedback is common outside the laboratory, and it is important to understand how people learn in this setting. While there are numerous recent studies, the strength and robustness of semi-supervised learning effects remain unclear, particularly when labeled and unlabeled stimuli are dispersed across learning. We designed an experiment, using simple unidimensional category learning, that allows us to measure the relative contribution of labeled and unlabeled experience. Based on an analysis of this task, we find that an unlabeled stimulus is worth more than 40% of a labeled stimulus.

Similar resources

Semi-Supervised AUC Optimization without Guessing Labels of Unlabeled Data

Semi-supervised learning, which aims to construct learners that automatically exploit the large amount of unlabeled data in addition to the limited labeled data, has been widely applied in many real-world applications. AUC is a well-known performance measure for a learner, and directly optimizing AUC may result in a better prediction performance. Thus, semi-supervised AUC optimization has drawn...

Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk

This study explores a semi-supervised classification approach using random forest as a base classifier to classify the risk of low-back disorders (LBDs) associated with industrial jobs. The semi-supervised classification approach uses unlabeled data together with a small number of labeled data to create a better classifier. The results obtained by the proposed approach are compared with those o...

Estimate Unlabeled-Data-Distribution for Semi-supervised PU Learning

Traditional supervised classifiers use only labeled data (features/label pairs) as the training set, while the unlabeled data is used as the testing set. In practice, it is often the case that the labeled data is hard to obtain and the unlabeled data contains the instances that belong to the predefined class beyond the labeled data categories. This problem has been widely studied in recent year...

Confidence Estimation for Graph-based Semi-supervised Learning

To select unlabeled examples effectively and reduce classification error, confidence estimation for graph-based semi-supervised learning (CEGSL) is proposed. This algorithm combines graph-based semi-supervised learning with collaboration-training. It makes use of the structure information of samples to calculate the classification probability of unlabeled examples explicitly. With multi-classifiers, th...

Constraint-Driven Rank-Based Learning for Information Extraction

Most learning algorithms for factor graphs require complete inference over the dataset or an instance before making an update to the parameters. SampleRank is a rank-based learning framework that alleviates this problem by updating the parameters during inference. Most semi-supervised learning algorithms also rely on the complete inference, i.e. calculating expectations or MAP configurations. W...

Publication date: 2011