Optimal Sparse Regression Trees
Authors
Abstract
Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been little effort towards full provable optimization, mainly due to the computational hardness of the problem. This work proposes a dynamic-programming-with-bounds approach to the construction of provably-optimal sparse regression trees. We leverage a novel lower bound based on an optimal solution to the k-Means clustering algorithm on one-dimensional data. We are often able to find optimal sparse trees in seconds, even for challenging datasets that involve large numbers of samples and highly-correlated features.
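The abstract's lower bound rests on the fact that k-Means on one-dimensional data can be solved exactly: in 1-D, the optimal clusters are contiguous intervals of the sorted values, so the optimum is computable by dynamic programming. The sketch below illustrates that subproblem only; it is not the authors' implementation, and the function name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, k):
    """Exact 1-D k-means by dynamic programming (O(k * n^2) sketch).

    Returns the minimum within-cluster sum of squared errors when
    partitioning the values into k clusters. Correct because, in one
    dimension, optimal clusters are contiguous runs of sorted values.
    """
    x = sorted(values)
    n = len(x)
    # Prefix sums give O(1) SSE for any contiguous segment.
    s = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(x):
        s[i + 1] = s[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i, j):
        # Sum of squared errors of the cluster x[i:j] around its mean.
        m = j - i
        mean = (s[j] - s[i]) / m
        return (s2[j] - s2[i]) - m * mean * mean

    INF = float("inf")
    # dp[c][j]: best cost of splitting the first j points into c clusters.
    dp = [[INF] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            # Last cluster is x[i:j]; the first i points use c - 1 clusters.
            dp[c][j] = min(dp[c - 1][i] + cost(i, j) for i in range(c - 1, j))
    return dp[k][n]
```

For example, `kmeans_1d([1, 2, 10, 11], 2)` splits the data into {1, 2} and {10, 11}, each with SSE 0.5, for a total of 1.0. An exact value like this can serve as a lower bound on the regression loss of any tree with a fixed number of leaves, which is the role such a quantity plays in a branch-and-bound search.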
Similar resources
Optimal Partitioning for Classification and Regression Trees
In designing a decision tree for classification or regression, one selects at each node a feature to be tested, and partitions the range of that feature into a small number of bins, each bin corresponding to a child of the node. When the feature’s range is discrete with N unordered outcomes, the optimal partition, that is, the partition minimizing an expected loss, is usually found by an exhaus...
Multivariate Dyadic Regression Trees for Sparse Learning Problems
We propose a new nonparametric learning method based on multivariate dyadic regression trees (MDRTs). Unlike traditional dyadic decision trees (DDTs) or classification and regression trees (CARTs), MDRTs are constructed using penalized empirical risk minimization with a novel sparsity-inducing penalty. Theoretically, we show that MDRTs can simultaneously adapt to the unknown sparsity and smooth...
Nearly Optimal Minimax Estimator for High Dimensional Sparse Linear Regression
We present estimators for a well-studied statistical estimation problem: estimation for the linear regression model with soft sparsity constraints (ℓq constraint with 0 < q ≤ 1) in the high-dimensional setting. We first present a family of estimators, called the projected nearest neighbor estimator, and show, by using results from Convex Geometry, that such an estimator is within a logarithmic ...
Globally Optimal Fuzzy Decision Trees for Classification and Regression
A fuzzy decision tree is constructed by allowing the possibility of partial membership of a point in the nodes that make up the tree structure. This extension of its expressive capabilities transforms the decision tree into a powerful functional approximant that incorporates features of connectionist methods, while remaining easily interpretable. Fuzzification is achieved by superimposing a fu...
Outlier Detection by Boosting Regression Trees
A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i9.26334