Bayesian Conditional Tensor Factorizations for High-Dimensional Classification
Authors
Abstract
In many application areas, data are collected on a categorical response and high-dimensional categorical predictors, with the goals of building a parsimonious model for classification and performing inference on the important predictors. In settings such as genomics, there can be complex interactions among the predictors. By using a carefully structured Tucker factorization, we define a model that can characterize any conditional probability, while facilitating variable selection and modeling of higher-order interactions. Following a Bayesian approach, we propose a Markov chain Monte Carlo algorithm for posterior computation that accommodates uncertainty in the predictors to be included. Under near-low-rank assumptions, the posterior distribution for the conditional probability is shown to achieve close to the parametric rate of contraction even in ultra-high-dimensional settings. The methods are illustrated using simulation examples and biomedical applications.
Similar resources
Regularized Tensor Factorizations and Higher-Order Principal Components Analysis
High-dimensional tensors or multi-way data are becoming prevalent in areas such as biomedical imaging, chemometrics, networking and bibliometrics. Traditional approaches to finding lower dimensional representations of tensor data include flattening the data and applying matrix factorizations such as principal components analysis (PCA) or employing tensor decompositions such as the CANDECOMP / P...
Gaussian Process Vine Copulas for Multivariate Dependence
Copulas allow marginal distributions to be learned separately from the multivariate dependence structure (copula) that links them together into a density function. Vine factorizations ease the learning of high-dimensional copulas by constructing a hierarchy of conditional bivariate copulas. However, to simplify inference, it is common to assume that each of these conditional bivariate copulas is ind...
Conditional Random Fields for Airborne Lidar Point Cloud Classification in Urban Area
Over the past decades, urban growth has been known as a worldwide phenomenon that includes widening process and expanding pattern. While the cities are changing rapidly, their quantitative analysis as well as decision making in urban planning can benefit from two-dimensional (2D) and three-dimensional (3D) digital models. The recent developments in imaging and non-imaging sensor technologies, s...
Numerical methods in higher dimensions using tensor factorizations
In this talk I will collect recent advances in the solution of high-dimensional problems in different application areas: chemistry, biology, mathematics. The language of low-rank factorization gives a unified view of different algorithms for the solution of seemingly diverse and unconnected problems. Typical applications include ...
Novel Alternating Least Squares Algorithm for Nonnegative Matrix and Tensor Factorizations
The alternating least squares (ALS) algorithm is considered a "workhorse" algorithm for general tensor factorizations. For nonnegative tensor factorizations (NTF), we usually use a nonlinear projection (rectifier) to remove negative entries during the iteration process. However, this kind of ALS algorithm often fails to converge to the desired solution. In this paper, we proposed a nove...