PCA and PLS with very large data sets
Similar resources
Kernel PLS-SVC for Linear and Nonlinear Classification
A new method for classification is proposed. This is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by a support vector classifier. Unlike principal component analysis (PCA), which has previously served as a dimension reduction step for discrimination problems, orthonormalized PLS is closely related to Fisher’s approach t...
Assessing Patient Survival Using Microarray Gene Expression Data Via Partial Least Squares Proportional Hazard Regression
High-dimensional data sets from microarray experiments, where the number of variables (genes) p far exceeds the number of samples N, render most traditional statistical tools of little direct use. However, some of these statistical tools, when used in conjunction with an appropriate dimension reduction method, can be effective. In this paper we introduce the use of the proportional hazard (PH) regressi...
Data fusion of Fourier transform infrared spectra and powder X-ray diffraction patterns for pharmaceutical mixtures.
Fusing complex data from two disparate sources has been demonstrated to improve the accuracy in quantifying active ingredients in mixtures of pharmaceutical powders. A four-component simplex-centroid design was used to prepare blended powder mixtures of acetaminophen, caffeine, aspirin, and ibuprofen. The blends were analyzed by Fourier transform infrared spectroscopy (FTIR) and powder X-ray di...
Classification of Microarrays with kNN: Comparison of Dimensionality Reduction Methods
Dimensionality reduction can often improve the performance of the k-nearest neighbor classifier (kNN) for high-dimensional data sets, such as microarrays. The effect of the choice of dimensionality reduction method on the predictive performance of kNN for classifying microarray data is an open issue, and four common dimensionality reduction methods, Principal Component Analysis (PCA), Random Pr...
Proteochemometrics modeling of carbonic anhydrate-substrate interactions using rule-based and linear methods
Proteochemometrics is a novel technology for the analysis of interaction of series of receptors with series of ligands. In this study, two different proteochemometric approaches, rough sets and partial least squares (PLS), have been utilized to model the interaction of Carbonic anhydrases (CA I, CA II, CA V) and their ligands. Both approaches analyzed the dataset which cor...
Journal title:
- Computational Statistics & Data Analysis
Volume 48, Issue
Pages -
Publication date 2005