Active Nearest Neighbors in Changing Environments
Authors
Abstract
While classic machine learning paradigms assume training and test data are generated from the same process, domain adaptation addresses the more realistic setting in which the learner has large quantities of labeled data from some source task but limited or no labeled data from the target task it is attempting to learn. In this work, we give the first formal analysis showing that using active learning for domain adaptation yields a way to address the statistical challenges inherent in this setting. We propose a novel nonparametric algorithm, ANDA, that combines an active nearest neighbor querying strategy with nearest neighbor prediction. We provide analyses of its querying behavior and of finite sample convergence rates of the resulting classifier under covariate shift. Our experiments show that ANDA successfully corrects for dataset bias in multiclass image categorization.
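The idea of pairing an active query step with nearest neighbor prediction can be illustrated with a small sketch. The abstract does not spell out ANDA's query rule, so the code below assumes a simple coverage rule: a target point is sent to the labeling oracle only when fewer than `k_prime` of its `k` nearest neighbors are already labeled. The function names, the parameters `k` and `k_prime`, and the `oracle` callback are all hypothetical, and the sketch should not be read as the authors' algorithm.

```python
import math
from collections import Counter


def knn_predict(labeled, x, k):
    """Majority vote over the k labeled points closest to x (Euclidean)."""
    neighbors = sorted(labeled, key=lambda pl: math.dist(pl[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def is_covered(x, labeled, unlabeled, k, k_prime):
    """Assumed rule: x is covered if >= k_prime of its k nearest
    neighbors (among labeled and unlabeled points) carry a label."""
    pool = [(p, True) for p, _ in labeled] + [(u, False) for u in unlabeled]
    nearest = sorted(pool, key=lambda e: math.dist(e[0], x))[:k]
    return sum(has_label for _, has_label in nearest) >= k_prime


def active_nn_sketch(source, target, oracle, k=3, k_prime=2):
    """Query labels only for target points that the labeled data
    does not yet cover; return the labeled set and the query count."""
    labeled = list(source)                # (point, label) pairs from the source task
    unlabeled = set(range(len(target)))   # indices of target points without labels
    num_queries = 0
    for i, x in enumerate(target):
        others = [target[j] for j in sorted(unlabeled) if j != i]
        if not is_covered(x, labeled, others, k, k_prime):
            labeled.append((x, oracle(x)))  # pay for one label
            unlabeled.discard(i)
            num_queries += 1
    return labeled, num_queries
```

Under this assumed rule, target points that fall inside regions well populated by source data cost no queries, while target points in regions the source sample missed are labeled until their neighborhoods are covered. Prediction then uses `knn_predict` on the combined labeled set.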
Related resources
A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and the Internet predominate in users' lives, they bring not only business opportunities and time savings but also threats such as information theft and system penetration, in both hardware and software. Security is the top priority in preventing cyber-attacks, and users should first detect the type of attack, because virtual environments are not moni...
Active Nearest Neighbors in Changing Environments: Supplementary Material
We adapt the proof (guided exercise) of Theorem 19.5 in (Shalev-Shwartz & Ben-David, 2014) to our setting. As is done there, we use the notation y ∼ p to denote drawing from a Bernoulli random variable with mean p. We will employ the following lemmas: Lemma 1 (Lemma 19.6 in (Shalev-Shwartz & Ben-David, 2014)). Let C1, ..., Cr be a collection of subsets of some domain set, X. Let S be a sequ...
The Performance of small samples in quantifying structure central Zagros forests utilizing the indexes based on the nearest neighbors
Today, forest structure has become one of the main ecological debates in forest science. Determining forest structure characteristics is necessary for investigating how stands change and for planning silvicultural interventions and restoration operations. To investigate the structure of part of the Ghale-Gol forests in Khorramabad, a set of indices such as Cla...
An Enhancement of k-Nearest Neighbor Classification Using Genetic Algorithm
K-Nearest Neighbor Classification (kNNC) classifies an instance by taking the votes of its k nearest neighbors. The performance of kNNC depends largely on the efficient selection of those k nearest neighbors. Not all the attributes describing an instance have the same importance in selecting the nearest neighbors. In the real world, the influence of the different attributes on the classification keeps on c...
Comparison and evaluation of the performance of data-driven models for estimating suspended sediment downstream of Doroodzan Dam
Dams control most of the sediment entering the reservoir by creating static environments. However, sediment leaving the dam depends on various factors such as dam management method, inlet sediment, water height in the reservoir, the shape of the reservoir, and discharge flow. In this research, the amount of suspended sediment of Doroodzan Dam based on a statistical period of 25 years has been i...