PCA and PLS with very large data sets

Authors

  • Nouna Kettaneh
  • Anders Berglund
  • Svante Wold
Abstract

Similar resources

Kernel PLS-SVC for Linear and Nonlinear Classification

A new method for classification is proposed. This is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by a support vector classifier. Unlike principal component analysis (PCA), which has previously served as a dimension reduction step for discrimination problems, orthonormalized PLS is closely related to Fisher’s approach t...

Assessing Patient Survival Using Microarray Gene Expression Data Via Partial Least Squares Proportional Hazard Regression

High-dimensional data sets from microarray experiments, where the number of variables (genes) p far exceeds the number of samples N, render most traditional statistical tools of little direct use. However, some of these statistical tools, when used in conjunction with an appropriate dimension reduction method, can be effective. In this paper we introduce the use of the proportional hazard (PH) regressi...

Data fusion of Fourier transform infrared spectra and powder X-ray diffraction patterns for pharmaceutical mixtures.

Fusing complex data from two disparate sources has been demonstrated to improve the accuracy in quantifying active ingredients in mixtures of pharmaceutical powders. A four-component simplex-centroid design was used to prepare blended powder mixtures of acetaminophen, caffeine, aspirin and ibuprofen. The blends were analyzed by Fourier transform infra-red spectroscopy (FTIR) and powder X-ray di...

Classification of Microarrays with kNN: Comparison of Dimensionality Reduction Methods

Dimensionality reduction can often improve the performance of the k-nearest neighbor classifier (kNN) for high-dimensional data sets, such as microarrays. The effect of the choice of dimensionality reduction method on the predictive performance of kNN for classifying microarray data is an open issue, and four common dimensionality reduction methods, Principal Component Analysis (PCA), Random Pr...

Proteochemometrics modeling of carbonic anhydrase-substrate interactions using rule-based and linear methods

Proteochemometrics is a novel technology for the analysis of interaction of series of receptors with series of ligands. In this study, two different proteochemometric approaches, rough sets and partial least squares (PLS), have been utilized to model the interaction of Carbonic anhydrases (CA I, CA II, CA V) and their ligands. Both approaches analyzed the dataset which cor...

Journal:
  • Computational Statistics & Data Analysis

Volume 48  Issue

Pages  -

Publication date: 2005