Investigation of Property Valuation Models Based on Decision Tree Ensembles Built over Noised Data

Authors

  • Tadeusz Lasota
  • Tomasz Luczak
  • Michal Niemczyk
  • Michal Olszewski
  • Bogdan Trawinski
Abstract

Ensemble machine learning methods based on bagging, random subspace, random forest, and rotation forest, with decision trees (pruned model trees) as base learners, were developed in the WEKA environment. The methods were applied to a real-world regression problem: predicting the prices of residential premises from historical sales/purchase transaction data. The accuracy of the resulting ensembles was compared at several levels of noise injected into an attribute, into the output, and into both the attribute and the output. Ensembles built with rotation forest outperformed the other models, while the random subspace method produced the models most resistant to noised data.
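The three noise scenarios and the resampling steps behind bagging and random subspace can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how noise was generated, so Gaussian perturbation of a randomly chosen fraction of values is assumed here, and every name (`inject_noise`, `make_scenarios`, `bootstrap_indices`, `random_subspace`) is hypothetical rather than taken from the paper or from WEKA.

```python
import random
import statistics


def inject_noise(values, level, rng, scale=1.0):
    """Return a copy of `values` with a fraction `level` of entries perturbed.

    Assumption: noise is Gaussian with a standard deviation equal to
    `scale` times the population standard deviation of the column.
    """
    noisy = list(values)
    n_noisy = round(level * len(noisy))
    sigma = scale * statistics.pstdev(noisy)
    for i in rng.sample(range(len(noisy)), n_noisy):
        noisy[i] += rng.gauss(0.0, sigma)
    return noisy


def make_scenarios(attribute, output, level, seed=0):
    """Build the three settings named in the abstract for one noise level."""
    rng = random.Random(seed)
    return {
        "attribute": (inject_noise(attribute, level, rng), list(output)),
        "output": (list(attribute), inject_noise(output, level, rng)),
        "both": (inject_noise(attribute, level, rng),
                 inject_noise(output, level, rng)),
    }


def bootstrap_indices(n_rows, rng):
    """Bagging: draw n_rows row indices with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_subspace(n_features, k, rng):
    """Random subspace: pick k distinct feature indices for one base model."""
    return sorted(rng.sample(range(n_features), k))


# Toy data (made up for illustration): usable area and price of 10 premises.
area = [float(a) for a in range(40, 140, 10)]
price = [3000.0 * a for a in area]
scenarios = make_scenarios(area, price, level=0.2)
```

Each base model in an ensemble would then be trained on the rows from `bootstrap_indices` or on the columns from `random_subspace`, and the comparison repeated for each scenario and noise level.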

Similar resources

Evaluation of liquefaction potential based on CPT results using C4.5 decision tree

The prediction of liquefaction potential of soil due to an earthquake is an essential task in Civil Engineering. The decision tree is a tree structure consisting of internal and terminal nodes which process the data to ultimately yield a classification. C4.5 is a known algorithm widely used to design decision trees. In this algorithm, a pruning process is carried out to solve the problem of the...

The Investigation of Deep Data Representations Based on Decision Tree Ensembles for Classification Problems

A classification method based on deep representation of input data and ensembles of decision trees is introduced and evaluated solving the problem of vehicle classification and image classification with large number of categories.

On Oblique Random Forests

Abstract. In his original paper on random forests, Breiman proposed two different decision tree ensembles: one generated from “orthogonal” trees with thresholds on individual features in every split, and one from “oblique” trees separating the feature space by randomly oriented hyperplanes. In spite of a rising interest in the random forest framework, however, ensembles built from orthogonal tr...

THE VALUATION OF PATENTS : A review of patent valuation methods with consideration of option based methods and the potential for further research

Intellectual Property Rights (IPRs) are viewed as being of increasing importance in many fields of business. However, one potential hindrance to their being considered of significant value, is the lack of appreciation of practical methods of valuing them particularly early in their life under conditions of uncertainty about their future prospects. Lack of practical valuation methods under such ...

The Generalization Paradox of Ensembles

Ensemble models—built by methods such as bagging, boosting, and Bayesian model averaging—appear dauntingly complex, yet tend to strongly outperform their component models on new data. Doesn’t this violate “Occam’s razor”—the widespread belief that “the simpler of competing alternatives is preferred”? We argue no: if complexity is measured by function rather than form—for example, according to g...

Journal title:

Volume   Issue 

Pages  -

Publication date: 2013