Alignment of protein mass spectrometry data by integrated Markov chain shifting method

Authors

  • Yang Feng
  • Weiping Ma
  • Zhanfeng Wang
  • Zhiliang Ying
  • Yaning Yang
Abstract

Mass spectrometers such as SELDI-TOF (surface-enhanced laser desorption/ionization time-of-flight) and MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) measure the relative abundance of different protein ions or protein fragments (peptides), indexed by the mass-to-charge ratio (m/z). A special characteristic of MS spectra is their variability in both m/z values and intensity magnitudes. We propose modelling the log-intensities by a semiparametric model and the m/z values by the integrated Markov chain shifting (IMS) model, for which the second-order differences of the random effects are assumed to follow a second-order Markov chain. Spectra are aligned by averaging over the random shifts conditional on the observed intensity information. The unknown parameters are estimated by an iterative nonparametric maximum profile likelihood method and a Gaussian kernel approximation. The bandwidths in the kernel approximation are taken to be 0.04%–0.08% of the m/z values. Simulation results show that the proposed approach achieves satisfactory alignment, reducing the intensity variation of misaligned spectra by around 75%. Most alignment algorithms align spectra by clustering neighboring peaks and do not incorporate peak height information. Our semiparametric random shifting method builds a model that takes into account both the random shift effects of neighboring m/z values and the similarity of the intensity magnitudes of common peaks, within a range of about 50% of the intensity values.
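The abstract states only that a Gaussian kernel is used with a bandwidth of 0.04%–0.08% of the m/z value; it does not say how the kernel enters the likelihood. As a minimal sketch of what an m/z-proportional bandwidth means in practice, the Python snippet below applies a Gaussian kernel (Nadaraya-Watson) average to log-intensities. The function names `bandwidth` and `kernel_smooth`, the default fraction 0.0006 (the midpoint of the stated range), and the synthetic peak are all assumptions made for illustration, not the authors' implementation.

```python
import math


def bandwidth(mz, fraction=0.0006):
    """Kernel bandwidth proportional to the m/z value.

    fraction=0.0006 (0.06%) is an assumed midpoint of the
    0.04%-0.08% range quoted in the abstract.
    """
    return fraction * mz


def kernel_smooth(target_mz, mz_values, log_intensities, fraction=0.0006):
    """Gaussian-kernel weighted average of log-intensities at target_mz.

    This is a plain Nadaraya-Watson smoother, used here only to show the
    effect of an m/z-proportional bandwidth (hypothetical helper).
    """
    h = bandwidth(target_mz, fraction)
    weights = [math.exp(-0.5 * ((m - target_mz) / h) ** 2) for m in mz_values]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations close enough to target_mz")
    return sum(w * y for w, y in zip(weights, log_intensities)) / total


# Synthetic example: a single peak centred at m/z 5000 on a flat baseline.
mz_grid = [4990.0 + i for i in range(21)]
log_int = [1.0 + math.exp(-0.5 * ((m - 5000.0) / 2.0) ** 2) for m in mz_grid]

for frac in (0.0004, 0.0008):
    h = bandwidth(5000.0, frac)
    smoothed = kernel_smooth(5000.0, mz_grid, log_int, frac)
    print(f"fraction={frac}: bandwidth={h:.2f}, smoothed peak={smoothed:.4f}")
```

Because the bandwidth scales with m/z, the same fraction gives a wider kernel at high masses than at low masses, which matches the abstract's choice of stating the bandwidth as a percentage rather than a fixed width.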


Similar resources

Bayesian Mass Spectra Peak Alignment from Mass Charge Ratios

Proteomics studies based on mass spectrometry (MS) are gaining popular applications in biomedical research for protein identification/quantification and biomarker discovery, especially for potential early diagnosis and prognosis of severe disease before the occurrence of symptoms. However, MS data collected using current technologies are very noisy and appropriate data preprocessing is critical...


A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences

The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...


A blended model for estimating of missing precipitation data (Case study of Tehran - Mehrabad station)

Meteorological stations usually contain some missing data for different reasons. There are several traditional methods for completing data, among them bivariate and multivariate linear and non-linear correlation analysis, double mass curve, ratio and difference methods, moving average and probability density functions are commonly used. In this paper a blended model comprising the bivariate expo...


Goal Programming Optimization Model for Performance Management: A SCOR-Based Supply Chain Decision Alignment

This article develops an integrated model of transmitting strategies and operational activities to enhance the efficiency of supply chain management. As the second objective, this paper aims to improve supply chain performance management (SCPM) by employing proper decision-making approaches. The proposed model optimizes the performance indicator based on SCOR metrics. A process-based method is ...


COMPARISON ABILITY OF GA AND DP METHODS FOR OPTIMIZATION OF RELEASED WATER FROM RESERVOIR DAM BASED ON PRODUCED DIFFERENT SCENARIOS BY MARKOV CHAIN METHOD

Planning for supply water demands (drinkable and irrigation water demands) is a necessary problem. For this purpose, three subjects must be considered (optimization of water supply systems such as volume of reservoir dams, optimization of released water from reservoir and prediction of next droughts). For optimization of volume of reservoir dams, yield model is applied. Reliability of yield mod...




Publication date: 2009