Learning ELM-Tree from big data based on uncertainty reduction

نویسندگان

  • Ran Wang
  • Yu-Lin He
  • Chi-Yin Chow
  • Fang-Fang Ou
  • Jian Zhang
چکیده

A challenge in big data classification is the design of highly parallelized learning algorithms. One solution to this problem is applying parallel computation to different components of a learning model. In this paper, we first propose an extreme learning machine tree (ELM-Tree) model based on the heuristics of uncertainty reduction. In the ELM-Tree model, information entropy and ambiguity are used as the uncertainty measures for splitting decision tree (DT) nodes. Besides, in order to resolve the overpartitioning problem in the DT induction, ELMs are embedded as the leaf nodes when the gain ratios of all the available splits are smaller than a given threshold. Then, we apply parallel computation to five components of the ELM-Tree model, which effectively reduces the computational time for big data classification. Experimental studies demonstrate the effectiveness of the proposed method. © 2014 Elsevier B.V. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A parallel approximate SS-ELM algorithm based on MapReduce for large-scale datasets

Extreme Learning Machine (ELM) algorithm not only has gained much attention of many scholars and researchers, but also has been widely applied in recent years especially when dealing with big data because of its better generalization performance and learning speed. The proposal of SS-ELM (semi-supervised Extreme Learning Machine) extends ELM algorithm to the area of semi-supervised learning whi...

متن کامل

Distributed Averaging CNN-ELM for Big Data

Increasing the scalability of machine learning to handle big volume of data is a challenging task. The scale up approach has some limitations. In this paper, we proposed a scale out approach for CNN-ELM based on MapReduce on classifier level. Map process is the CNN-ELM training for certain partition of data. It involves many CNN-ELM models that can be trained asynchronously. Reduce process is t...

متن کامل

Elastic extreme learning machine for big data classification

Extreme Learning Machine (ELM) and its variants have been widely used for many applications due to its fast convergence and good generalization performance. Though the distributed ELM based on MapReduce framework can handle very large scale training dataset in big data applications, how to cope with its rapidly updating is still a challenging task. Therefore, in this paper, a novel Elastic Extr...

متن کامل

Comparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images

Abstract: Knowing the tree species combination of forests provides valuable information for studying the forest’s economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-required, free satellite data are available in coarse resolution and the use of manned aircraft is relatively costly. Recently, unmanned aeria...

متن کامل

Secure Multi-party Computation Based Privacy Preserving Extreme Learning Machine Algorithm Over Vertically Distributed Data

Especially in the Big Data era, the usage of different classification methods is increasing day by day. The success of these classification methods depends on the effectiveness of learning methods. Extreme learning machine (ELM) classification algorithm is a relatively new learning method built on feed-forward neural-network. ELM classification algorithm is a simple and fast method that can cre...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Fuzzy Sets and Systems

دوره 258  شماره 

صفحات  -

تاریخ انتشار 2015