How important is data quality? Best classifiers vs best features

Abstract

The task of choosing the appropriate classifier for a given scenario is not an easy-to-solve question. First, there is an increasingly high number of algorithms available, belonging to different families. There is also a lack of methodologies that can help in recommending in advance a given family of algorithms for a certain type of dataset. Besides, most of these classification algorithms exhibit a degradation in performance when faced with datasets containing irrelevant and/or redundant features. In this work we analyze the impact of feature selection on classification over several synthetic and real datasets. The experimental results obtained show that the significance of selecting a classifier decreases after applying an appropriate preprocessing step and that this not only alleviates the choice of classifier, but also improves the results on almost all the datasets tested.
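As a rough illustration of the idea in the abstract, the sketch below scores a classifier on synthetic data with and without a feature selection step. It is not the authors' experimental setup: the mean-difference filter, the 1-nearest-neighbour classifier, the data generator, and all function names are assumptions chosen only to keep the example dependency-free.

```python
import math
import random
import statistics


def make_dataset(n_samples, n_informative, n_noise, seed):
    """Synthetic two-class data: informative features shift with the class, noise features do not."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so both classes are balanced
        informative = [rng.gauss(3.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def filter_scores(X, y):
    """Score each feature by the gap between class means, scaled by its overall spread."""
    scores = []
    for j in range(len(X[0])):
        col0 = [row[j] for row, label in zip(X, y) if label == 0]
        col1 = [row[j] for row, label in zip(X, y) if label == 1]
        spread = statistics.pstdev(col0 + col1) or 1.0
        scores.append(abs(statistics.fmean(col0) - statistics.fmean(col1)) / spread)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(X, cols):
    """Keep only the chosen feature columns."""
    return [[row[j] for j in cols] for row in X]


def one_nn_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a 1-nearest-neighbour classifier using Euclidean distance."""
    correct = 0
    for x, target in zip(X_test, y_test):
        nearest = min(range(len(X_train)), key=lambda i: math.dist(x, X_train[i]))
        correct += y_train[nearest] == target
    return correct / len(X_test)


# 2 informative features hidden among 30 irrelevant ones (illustrative sizes).
X, y = make_dataset(n_samples=400, n_informative=2, n_noise=30, seed=0)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

# The filter is fitted on the training split only, to avoid leaking test data.
selected = select_top_k(filter_scores(X_train, y_train), k=2)

acc_all = one_nn_accuracy(X_train, y_train, X_test, y_test)
acc_selected = one_nn_accuracy(
    project(X_train, selected), y_train, project(X_test, selected), y_test
)

print("selected features:", selected)
print("accuracy, all features:", acc_all)
print("accuracy, selected features:", acc_selected)
```

Swapping `one_nn_accuracy` for other classifiers and comparing how far their scores spread before and after selection would mirror, in miniature, the kind of comparison the abstract describes.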

Similar resources

What Is the Best Research Globally?

Dr. Afshari’s editorial gives insight on human nature (1). Collaborative willingness usually occurs when different sides perceive personal benefit, or when institutions mandate collaboration. This is not unique between higher and lower income countries; many working in global emergency medicine (EM) observe this within wealthier countries. It exists among collaborators in low resource settings....

Is Combining Classifiers Better than Selecting the Best One

We empirically evaluate several state-of-the-art methods for constructing ensembles of heterogeneous classifiers with stacking and show that they perform (at best) comparably to selecting the best classifier from the ensemble by cross validation. We then propose a new method for stacking, that uses multi-response model trees at the meta-level, and show that it clearly outperforms existing stacki...

Are Random Forests Truly the Best Classifiers?

The JMLR study Do we need hundreds of classifiers to solve real world classification problems? benchmarks 179 classifiers in 17 families on 121 data sets from the UCI repository and claims that “the random forest is clearly the best family of classifier”. In this response, we show that the study’s results are biased by the lack of a held-out test set and the exclusion of trials with errors. Fur...

Is Best First Search Really Best?

Of the many minimax algorithms, SSS* consistently searches the smallest game trees. Its success can be attributed to the accumulation and use of information acquired while traversing the tree, allowing a best first search strategy. The main disadvantage of SSS* is its excessive storage requirements. This paper describes a class of search algorithms which, though based on the popular alpha-beta ...

Surgery vs. radiotherapy in localized prostate cancer. Which is best?

Surgery and radiotherapy are currently accepted alternatives for the treatment of localized prostate cancer. In the absence of relevant randomized trials no decision regarding the superiority of any of the given approaches can be made. Up to now several cohort-based approaches indicate similar outcomes for both treatments. Based on a new population based approach, Merglen and co-workers recentl...

Journal

Journal title: Neurocomputing

Year: 2022

ISSN: 0925-2312, 1872-8286

DOI: https://doi.org/10.1016/j.neucom.2021.05.107