Randomer Forests
Authors
Abstract
Random forests (RF) is a popular general-purpose classifier that has been shown to outperform many other classifiers on a variety of datasets. Its widespread use can be attributed to several factors, including excellent empirical performance, scale and unit invariance, robustness to outliers, favorable time and space complexity, and interpretability. While RF has many desirable qualities, one drawback is its sensitivity to rotations and other operations that "mix" variables. In this work, we establish a generalized forest-building scheme, linear threshold forests. Random forests and many other existing decision forest algorithms can be viewed as special cases of this scheme. With this scheme in mind, we propose a few special cases which we call randomer forests (RerFs). RerFs are linear threshold forests that exhibit all of the nice properties of RF, in addition to approximate affine invariance. On simulated datasets designed for RF to do well, we demonstrate that RerF outperforms RF. We also demonstrate that one particular variant of RerF is approximately affine invariant. Lastly, in an evaluation on 121 benchmark datasets, we observe that RerF outperforms RF. We therefore tentatively propose that RerF be considered a replacement for RF as the general-purpose classifier of choice. Open source code is available at http://ttomita.github.io/RandomerForest/.
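The idea that RF is a special case of a linear threshold forest can be illustrated with a minimal sketch. In it, every candidate split at a node is a weight vector w plus a threshold t, and a sample x goes left when w·x <= t. An axis-aligned RF split is then the case where w is one-hot. The abstract does not say how RerF draws its weight vectors, so the sparse sampler with entries in {-1, 0, +1} is only an assumption, and all function names here are hypothetical rather than taken from the authors' code.

```python
import random
from math import inf


def axis_aligned_projection(p, j):
    """One-hot weight vector: the kind of split an ordinary RF node considers."""
    return [1.0 if k == j else 0.0 for k in range(p)]


def sparse_random_projection(p, density, rng):
    """ASSUMPTION: sparse weights in {-1, 0, +1}; the abstract does not specify this."""
    w = [rng.choice((-1.0, 1.0)) if rng.random() < density else 0.0
         for _ in range(p)]
    if not any(w):  # guarantee at least one nonzero weight
        w[rng.randrange(p)] = rng.choice((-1.0, 1.0))
    return w


def project(X, w):
    """Project every sample onto the weight vector w."""
    return [sum(wk * xk for wk, xk in zip(w, x)) for x in X]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, y):
    """Best (weighted impurity, threshold) for splits of the form value <= t."""
    uniq = sorted(set(values))
    best = (inf, None)
    for lo, hi in zip(uniq, uniq[1:]):
        t = (lo + hi) / 2.0
        left = [yi for v, yi in zip(values, y) if v <= t]
        right = [yi for v, yi in zip(values, y) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best[0]:
            best = (score, t)
    return best


def best_linear_split(X, y, candidates):
    """Pick the (impurity, w, t) with the lowest impurity among candidate weights."""
    best = (inf, None, None)
    for w in candidates:
        score, t = best_threshold(project(X, w), y)
        if t is not None and score < best[0]:
            best = (score, w, t)
    return best


# Toy data: class 1 when x0 + x1 > 3, a boundary that "mixes" both variables.
X = [(0, 0), (2, 0), (0, 2), (1, 1), (3, 1), (1, 3), (2, 2), (4, 0), (0, 4)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

axis_candidates = [axis_aligned_projection(2, j) for j in range(2)]
rng = random.Random(0)
oblique_candidates = [sparse_random_projection(2, 0.5, rng) for _ in range(10)]

axis_result = best_linear_split(X, y, axis_candidates)
mixed_result = best_linear_split(X, y, axis_candidates + oblique_candidates)
```

On this toy data no single axis-aligned threshold separates the classes, whereas the weight vector (1, 1) with threshold 3 does. Whether the random sampler happens to draw such a vector depends on the seed, which is why a forest would draw many candidates per node.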
Similar resources
Genome-wide identification of hypoxia-induced enhancer regions
Here we present a genome-wide method for de novo identification of enhancer regions. This approach enables massively parallel empirical investigation of DNA sequences that mediate transcriptional activation and provides a platform for discovery of regulatory modules capable of driving context-specific gene expression. The method links fragmented genomic DNA to the transcription of randomer mole...
Combining capillary electrophoresis and next-generation sequencing for aptamer selection
Next-generation sequencing (NGS) machines can sequence millions of DNA strands in a single run, such as oligonucleotide (oligo) libraries comprising millions to trillions of discrete oligo sequences. Capillary electrophoresis is an attractive technique to select tight binding oligos or "aptamers" because it requires minimal sample volumes (e.g., 100 nL) and offers a solution-phase selection env...
Assessment of Quantitative and Qualitative characteristics of Golestan Province Forests in an 11-year period (Iran)
The assessment of the quantitative and qualitative changes, the result of the impacts imposed by natural factors, and human interventions during specific sampling periods has a substantial influence on nature, management method and tending operation of every region’s forests. The present research was carried out in Golestan province forests (Iran) over an 11- year period and the obtained statis...
Random forests algorithm in podiform chromite prospectivity mapping in Dolatabad area, SE Iran
The Dolatabad area located in SE Iran is a well-endowed terrain owning several chromite mineralized zones. These chromite ore bodies are all hosted in a colored mélange complex zone comprising harzburgite, dunite, and pyroxenite. These deposits are irregular in shape, and are distributed as small lenses along colored mélange zones. The area has a great potential for discovering further chromite...
The role of forests and pastures in combating the economic vulnerability of rural forest communities during the corona outbreak
The prevalence of Corona disease as one of the most critical infectious diseases has increased the risks of food security in different parts of the world. Local communities, especially in rural areas, are highly dependent on natural resource-based ecosystem service strategies to manage global food security and meet livelihood needs. Forests and pastures can also provide goods and servic...
Journal title:
- CoRR
Volume abs/1506.03410
Pages -
Publication date 2015