Randomer Forests

Authors

  • Tyler M. Tomita
  • Mauro Maggioni
  • Joshua T. Vogelstein
Abstract

Random forest (RF) is a popular general-purpose classifier that has been shown to outperform many other classifiers on a variety of datasets. The widespread use of random forests can be attributed to several factors, including excellent empirical performance, scale and unit invariance, robustness to outliers, time and space complexity, and interpretability. While RF has many desirable qualities, one drawback is its sensitivity to rotations and other operations that "mix" variables. In this work, we establish a generalized forest building scheme, linear threshold forests. Random forests and many other existing decision forest algorithms can be viewed as special cases of this scheme. With this scheme in mind, we propose a few special cases, which we call randomer forests (RerFs). RerFs are linear threshold forests that exhibit all of the nice properties of RF, in addition to approximate affine invariance. In simulated datasets designed for RF to do well, we demonstrate that RerF outperforms RF. We also demonstrate that one particular variant of RerF is approximately affine invariant. Lastly, in an evaluation on 121 benchmark datasets, we observe that RerF outperforms RF. We therefore putatively propose that RerF be considered a replacement for RF as the general-purpose classifier of choice. Open source code is available at http://ttomita.github.io/RandomerForest/.
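The idea of a linear threshold split can be illustrated with a minimal sketch. It assumes that a split sends a sample to the left child when a weighted sum of its features is at most a threshold, so that an ordinary RF split is the special case of a one-hot weight vector. The function names, the toy data, and the sparse +1/-1 weight sampler are hypothetical illustrations and are not taken from the abstract or from the authors' code.

```python
import random


def linear_threshold_split(x, weights, threshold):
    """Return True if the sample goes left: dot(weights, x) <= threshold."""
    projection = sum(w * xi for w, xi in zip(weights, x))
    return projection <= threshold


def axis_aligned_weights(n_features, feature_index):
    """One-hot weights: the split looks at a single feature, as in RF."""
    return [1.0 if j == feature_index else 0.0 for j in range(n_features)]


def sparse_random_weights(n_features, density, rng):
    """Assumed oblique candidate: +1 or -1 with probability density/2 each, else 0."""
    weights = []
    for _ in range(n_features):
        u = rng.random()
        if u < density / 2:
            weights.append(1.0)
        elif u < density:
            weights.append(-1.0)
        else:
            weights.append(0.0)
    return weights


def split_errors(points, labels, weights, threshold):
    """Count mistakes when the left child predicts class 0 and the right child class 1."""
    errors = 0
    for x, y in zip(points, labels):
        predicted = 0 if linear_threshold_split(x, weights, threshold) else 1
        errors += predicted != y
    return errors


# Toy data whose class boundary is the diagonal x0 - x1 = 0,
# i.e. an axis-aligned problem after a 45-degree rotation.
points = [(1.0, 2.0), (2.0, 3.0), (-1.0, 0.5), (2.0, 1.0), (3.0, 2.0), (0.5, -1.0)]
labels = [0 if x0 - x1 <= 0 else 1 for x0, x1 in points]

# A single oblique split recovers the boundary exactly.
oblique_errors = split_errors(points, labels, [1.0, -1.0], 0.0)

# A single axis-aligned split on feature 0 cannot.
axis_errors = split_errors(points, labels, axis_aligned_weights(2, 0), 1.5)

# Draw one random sparse candidate direction with a fixed seed.
candidate = sparse_random_weights(5, 0.4, random.Random(0))

print(oblique_errors, axis_errors, candidate)
```

On this toy data no single threshold on either raw feature separates the two classes, whereas one split on the combination x0 - x1 does, which is the kind of rotation sensitivity the abstract describes.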


Similar resources

Genome-wide identification of hypoxia-induced enhancer regions

Here we present a genome-wide method for de novo identification of enhancer regions. This approach enables massively parallel empirical investigation of DNA sequences that mediate transcriptional activation and provides a platform for discovery of regulatory modules capable of driving context-specific gene expression. The method links fragmented genomic DNA to the transcription of randomer mole...


Combining capillary electrophoresis and next-generation sequencing for aptamer selection

Next-generation sequencing (NGS) machines can sequence millions of DNA strands in a single run, such as oligonucleotide (oligo) libraries comprising millions to trillions of discrete oligo sequences. Capillary electrophoresis is an attractive technique to select tight binding oligos or "aptamers" because it requires minimal sample volumes (e.g., 100 nL) and offers a solution-phase selection env...


Assessment of Quantitative and Qualitative characteristics of Golestan Province Forests in an 11-year period (Iran)

The assessment of the quantitative and qualitative changes, the result of the impacts imposed by natural factors, and human interventions during specific sampling periods has a substantial influence on nature, management method and tending operation of every region’s forests. The present research was carried out in Golestan province forests (Iran) over an 11- year period and the obtained statis...


Random forests algorithm in podiform chromite prospectivity mapping in Dolatabad area, SE Iran

The Dolatabad area located in SE Iran is a well-endowed terrain owning several chromite mineralized zones. These chromite ore bodies are all hosted in a colored mélange complex zone comprising harzburgite, dunite, and pyroxenite. These deposits are irregular in shape, and are distributed as small lenses along colored mélange zones. The area has a great potential for discovering further chromite...


The role of forests and pastures in combating the economic vulnerability of rural forest communities during the corona outbreak

The prevalence of Corona disease as one of the most critical infectious diseases has increased the risks of food security in different parts of the world. Local communities, especially in rural areas, are highly dependent on natural resource-based ecosystem service strategies to manage global food security and meet livelihood needs. Forests and pastures can also provide goods and servic...



Journal title:
  • CoRR

Volume: abs/1506.03410

Publication date: 2015