COMS 6998 - 4 Fall 2017 Presenter : Geelon So
Abstract
Recall that in the active learning setting, the learner is provided with unlabeled samples and can query the teacher for labels. The goal is to learn a concept close enough to the target concept while minimizing the number of labels queried. Ideally, the number of labels needed is much smaller than Ω(1/ε), the number of labeled examples required in the passive learning setting. To characterize the sample complexity, in this lecture we discuss another quantity that measures the effectiveness of active learning on particular concept classes and distributions: the splitting index. We provide motivating examples, the definition of the splitting index, and (coarse) lower and upper bounds on label complexity based on it.
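A toy illustration of the query savings the abstract describes: for 1-D threshold classifiers, an active learner can binary-search a sorted pool of n unlabeled points and recover the decision boundary with O(log n) label queries, whereas a passive learner would label the whole pool. This is a hypothetical sketch; all names are illustrative.

```python
def active_learn_threshold(pool, query_label):
    """pool: sorted unlabeled points; query_label(x) -> 0 or 1 (the teacher)."""
    lo, hi = 0, len(pool)          # boundary lies among pool[lo:hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if query_label(pool[mid]) == 1:   # label 1: boundary is at or left of mid
            hi = mid
        else:                             # label 0: boundary is right of mid
            lo = mid + 1
    # pool[lo] is the smallest point labeled 1; everything before it is labeled 0
    boundary = pool[lo - 1] if lo > 0 else float("-inf")
    return boundary, queries

# 1000 unlabeled points, but only ~log2(1000) labels are ever requested
pool = sorted(i / 1000 for i in range(1000))
threshold = 0.637
boundary, n_queries = active_learn_threshold(pool, lambda x: int(x > threshold))
```

The same teacher in the passive setting would be asked for all 1000 labels; here the query count grows only logarithmically in the pool size.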
Similar resources
COMS 6998 - 4 Fall 2017 Presenter : Geelon So
In the setting of active learning, the data comes unlabeled and querying the label of a data point is expensive. The goal of an active learner is to reduce the number of labels needed and output a hypothesis with error rate ≤ ε. Recall that the usual sample complexity of supervised learning is Ω(1/ε). The motivation for defining the splitting index is to characterize the sample complexity of active ...
COMS 6998 - 4 Fall 2017 Presenter : Daniel Hsu
+ err_{P_n}(h_n) − err_{P_n}(h^*) + err_{P_n}(h^*) − err_P(h^*). The second part is at most 0, so we can disregard it when deriving an upper bound on the regret. Since the target function h^* is independent of the sample pairs, the third part can be bounded easily by analyzing the binomial distribution with success probability err_P(h^*) and n trials. To analyze the remaining first part, we b...
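The truncated excerpt above appears to be the standard three-term excess-risk split; reconstructing it in full with the excerpt's notation (assuming h_n is the empirical risk minimizer on the sample distribution P_n — an assumption, since the excerpt cuts off before stating it):

```latex
\mathrm{err}_P(h_n) - \mathrm{err}_P(h^*)
  = \bigl[\mathrm{err}_P(h_n) - \mathrm{err}_{P_n}(h_n)\bigr]      % first part: uniform deviation
  + \bigl[\mathrm{err}_{P_n}(h_n) - \mathrm{err}_{P_n}(h^*)\bigr]  % second part: \le 0 for an ERM h_n
  + \bigl[\mathrm{err}_{P_n}(h^*) - \mathrm{err}_P(h^*)\bigr]      % third part: Binomial(n, err_P(h^*)) concentration
```

The middle terms telescope, so the three bracketed parts sum exactly to the excess risk on the left.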
COMS 6998 - 4 Fall 2017 Presenter : Yanlin
In previous lectures, we have learned a lot about online learning. The basic idea is to keep a subset of the hypothesis space as the version space and to shrink the version space using new data or queries. We consider data in arbitrary form, meaning we make no specific assumptions about the data's schema itself. Although this generalizes easily, we still want to make a practical effort ...
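A minimal sketch of the version-space idea described above, under the simplifying assumption of threshold classifiers on the real line (so the version space is just an interval); the class and method names are illustrative, not from the lecture:

```python
class IntervalVersionSpace:
    """Version space for threshold classifiers h_t(x) = 1 iff x > t.

    Thresholds consistent with all labels seen so far form an interval,
    tracked here as roughly [lo, hi).
    """
    def __init__(self, lo=float("-inf"), hi=float("inf")):
        self.lo, self.hi = lo, hi

    def update(self, x, y):
        """Discard thresholds inconsistent with the labeled example (x, y)."""
        if y == 1:
            self.hi = min(self.hi, x)   # x > t, so t must lie below x
        else:
            self.lo = max(self.lo, x)   # x <= t, so t must lie at or above x

vs = IntervalVersionSpace()
for x, y in [(0.9, 1), (0.2, 0), (0.5, 0), (0.7, 1)]:
    vs.update(x, y)
# after these four examples, the consistent thresholds lie between 0.5 and 0.7
```

Each new labeled example either shrinks the interval or leaves it unchanged, which is exactly the "reduce the version space by new data or queries" dynamic the excerpt refers to.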
COMS 6998 - 4 Fall 2017 Presenter : Wenxi Chen
This lecture is delivered in a more philosophical sense rather than a technical one. In the past, we have learned a series of algorithms that can dig deep into a training dataset and generate a model from it. However, without knowing what knowledge the model contains, it is generally hard for human beings to trust the trained model and apply it in reality. Sometimes, biased...