MOA Concept Drift Active Learning Strategies for Streaming Data
Abstract
We present a framework for active learning on evolving data streams, as an extension to the MOA system. When learning to classify streaming data, obtaining the true labels may require major effort and incur excessive cost. Active learning aims to learn an accurate model with as few labels as possible. Streaming data poses additional challenges for active learning, since the data distribution may change over time (concept drift) and classifiers need to adapt. Conventional active learning strategies focus on querying the most uncertain instances, which typically lie near the decision boundary. If changes do not occur close to the boundary, they will be missed and classifiers will fail to adapt. We propose a software system that implements active learning strategies, extending the MOA framework. This software is released under the GNU GPL.
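The limitation of uncertainty-based querying described in the abstract can be illustrated with a minimal Python sketch. It is not the MOA implementation or the strategies proposed in the paper. The toy one-dimensional model, the function names, the confidence threshold, and the labeling budget are all assumptions made for illustration.

```python
import math

def posterior(x, boundary=0.5, steepness=20.0):
    """Toy 1-D model (assumed): probability of class 1 rises as x passes the boundary."""
    p1 = 1.0 / (1.0 + math.exp(-steepness * (x - boundary)))
    return [1.0 - p1, p1]

def is_uncertain(probs, threshold=0.7):
    """An instance is uncertain when the top class probability is below the threshold."""
    return max(probs) < threshold

def select_queries(xs, budget=0.5, threshold=0.7):
    """Return the stream instances whose true labels would be requested."""
    queried = []
    for seen, x in enumerate(xs, start=1):
        # Query only while the fraction of labeled instances stays under the budget.
        if len(queried) / seen < budget and is_uncertain(posterior(x), threshold):
            queried.append(x)
    return queried

# Instances alternate between the boundary region (near 0.5) and a region far from it.
stream = [0.48, 0.95, 0.52, 0.90, 0.51, 0.97, 0.49, 0.93]
queried = select_queries(stream)
print(queried)
```

In this sketch every requested label comes from the region near 0.5. If the true labels of the instances above 0.9 were to change, the strategy would never request them, so a classifier trained this way would have no evidence of the change.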
Similar resources
Handling adversarial concept drift in streaming data
Classifiers operating in a dynamic, real-world environment are vulnerable to adversarial activity, which causes the data distribution to change over time. These changes are traditionally referred to as concept drift, and several approaches have been developed in the literature to deal with the problem of drift handling and detection. However, most concept drift handling techniques approach it as ...
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
A data stream is a sequence of data generated from various information sources at high speed and high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by using gradual forgetting. Concept drift refer...
Active Learning for Data Streams Under Concept Drift and Concept Evolution
Data stream classification is an important problem; however, it poses many challenges. Since the length of the data is theoretically infinite, it is impractical to store and process all the historical data. Data streams also experience changes in their underlying distribution (concept drift), so the classifier must adapt. Another challenge of data stream classification is the possible emergence and...
Adaptive Learning Rate for Online Linear Discriminant Classifiers
We propose a strategy for updating the learning rate parameter of online linear classifiers for streaming data with concept drift. The change in the learning rate is guided by the change in a running estimate of the classification error. In addition, we propose an online version of the standard linear discriminant classifier (O-LDC) in which the inverse of the common covariance matrix is update...
Empirical Comparison of Active Learning Strategies for Handling Temporal Drift
Active learning strategies often assume that the target concept will remain stationary over time. However, in many real-world systems, it is not uncommon for the target concept and distribution properties of the generated data to change over time. This paper presents an empirical study that evaluates the effectiveness of using active learning strategies to train statistical models in the presen...
Publication date: 2011