A Novel Method of Data Stream Classification Based on Incremental Storage Tree

Author

  • Chonghuan Xu
Abstract

Data streams are large in volume, change quickly, and are costly to access at random. To address these characteristics, this paper proposes a Bayesian classification data mining algorithm based on an incremental storage tree. A sliding window processes the data stream and divides it into several basic units; principal component analysis (PCA) compresses the data in each window to produce a dynamic incremental storage tree; and a second-power strategy continuously updates the set of incremental storage trees. Multi-classifier integration is then combined with Bayesian classification to produce the Bayesian classifier. Finally, the classifier weights are adjusted by cyclic testing to achieve high classification accuracy. Detailed simulation analysis demonstrates that the presented algorithm is efficient in both space and time and is more stable.
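The windowing and weighted-ensemble steps named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the incremental storage tree or the second-power update strategy, so both are left out, and the PCA compression step is omitted to keep the example dependency-free. All names here (`split_into_units`, `GaussianNB`, `train_ensemble`, `max_members`) are hypothetical, and re-weighting each classifier by its accuracy on the newest unit is only one plausible reading of "adjusting the weight by cyclic test".

```python
import math
import random
from collections import defaultdict

def split_into_units(stream, unit_size):
    """Cut a list of (features, label) pairs into consecutive basic units."""
    return [stream[i:i + unit_size] for i in range(0, len(stream), unit_size)]

class GaussianNB:
    """Tiny Gaussian naive Bayes classifier trained on one basic unit."""

    def fit(self, unit):
        by_label = defaultdict(list)
        for x, y in unit:
            by_label[y].append(x)
        self.stats = {}
        for y, rows in by_label.items():
            n = len(rows)
            cols = list(zip(*rows))
            means = [sum(col) / n for col in cols]
            # A small floor on the variance avoids division by zero.
            varis = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(cols, means)]
            self.stats[y] = (math.log(n / len(unit)), means, varis)
        return self

    def predict(self, x):
        scores = {}
        for y, (log_prior, means, varis) in self.stats.items():
            log_lik = sum(-0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
                          for v, m, s in zip(x, means, varis))
            scores[y] = log_prior + log_lik
        return max(scores, key=scores.get)

def accuracy(clf, unit):
    return sum(clf.predict(x) == y for x, y in unit) / len(unit)

def ensemble_predict(members, x):
    """Weighted vote over a list of (classifier, weight) pairs."""
    votes = defaultdict(float)
    for clf, weight in members:
        votes[clf.predict(x)] += weight
    return max(votes, key=votes.get)

def train_ensemble(units, max_members=4):
    members = []
    for unit in units:
        # Assumption: re-weight older classifiers by accuracy on the newest unit.
        members = [(clf, accuracy(clf, unit)) for clf, _ in members]
        members.append((GaussianNB().fit(unit), 1.0))
        # Keep only the best-weighted classifiers to bound memory use.
        members = sorted(members, key=lambda m: m[1], reverse=True)[:max_members]
    return members

# Synthetic two-class stream (made-up data, for illustration only).
rng = random.Random(0)
stream = []
for i in range(100):
    label = i % 2
    center = 5.0 * label
    stream.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))

members = train_ensemble(split_into_units(stream, 20))
print(ensemble_predict(members, [0.2, -0.1]), ensemble_predict(members, [5.1, 4.8]))
```

Capping the ensemble at `max_members` keeps memory bounded on an unbounded stream, which is the constraint the abstract emphasizes; the cap value itself is arbitrary here.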


Similar resources

Incremental classification using Feature Tree

In recent years, stream data have become an immensely growing area of research for the database, computer science and data mining communities. Stream data is an ordered sequence of instances. In many applications of data stream mining data can be read only once or a small number of times using limited computing and storage capabilities. Some of the issues occurred in classifying stream data tha...

A novel hybrid method for vocal fold pathology diagnosis based on Russian language

In this paper, first, an initial feature vector for vocal fold pathology diagnosis is proposed. Then, for optimizing the initial feature vector, a genetic algorithm is proposed. Some experiments are carried out for evaluating and comparing the classification accuracies which are obtained by the use of the different classifiers (ensemble of decision tree, discriminant analysis and K-nearest neig...

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

A data stream is a sequence of data generated from various information sources at high speed and high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by gradual forgetting. Concept drift refer...

Coal Mine Safety Evaluation Method Based on Incomplete Labeled Data Stream Classification

Coal mine monitoring data is essentially a data stream, and the harsh coal mine environment causes some of that data to go missing; coal mine safety evaluation can therefore be seen as incomplete labeled data stream classification. This paper proposes a method for unlabeled data and concept drift in incomplete labeled data streams that uses a semi-supervised learning method based on k...

Comparison of Decision Tree and Naïve Bayes Methods in Classification of Researcher’s Cognitive Styles in Academic Environment

In today's world of the internet, it is important to give users feedback based on what they demand. Moreover, classification is one of the important tasks in data mining. Today, there are several techniques for solving classification problems, such as Genetic Algorithm, Decision Tree, Bayesian and others. This article attempts to classify researchers as “Expert” and “No...

Journal title:
  • JDIM

Volume 10  Issue 

Pages  -

Publication date 2012