A wavelet-based data pre-processing analysis approach in mass spectrometry

Authors

  • Xiaoli Li
  • Jin Li
  • Xin Yao
Abstract

Recently, mass spectrometry analysis has become an effective and rapid approach to detecting early-stage cancer. To identify proteomic patterns in serum that discriminate cancer patients from normal individuals, machine-learning methods, such as feature selection and classification, have already been applied to the analysis of mass spectrometry (MS) data with some success. However, the performance of existing machine-learning methods for MS data analysis still needs improvement. This paper proposes a wavelet-based pre-processing approach to MS data analysis. The approach applies wavelet-based transforms to MS data with the aim of de-noising data that are potentially contaminated during acquisition. The effects of the choice of wavelet function and decomposition level on de-noising performance are also investigated. Our comparative experimental results demonstrate that the proposed de-noising pre-processing approach has the potential to remove noise embedded in MS data, which can lead to improved performance of existing machine-learning methods in cancer detection.
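The de-noising idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which wavelet function, decomposition level, or thresholding rule the authors use, so everything below is an assumption chosen for brevity: a Haar wavelet, soft thresholding, and the universal threshold with a median-based noise estimate. The function names (`haar_dwt`, `haar_idwt`, `soft_threshold`, `denoise`) are hypothetical and are not taken from the paper.

```python
import math
import statistics


def haar_dwt(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the orthonormal Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(signal, level=3):
    """De-noise a 1-D intensity vector (assumed scheme, not the paper's).

    Decompose `level` times, threshold the detail coefficients, and
    reconstruct. The approximation coefficients are left untouched.
    """
    n = len(signal)
    if level < 1 or n % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")

    approx = list(signal)
    details = []  # details[0] is the finest scale
    for _ in range(level):
        approx, d = haar_dwt(approx)
        details.append(d)

    # Noise estimate from the finest-scale details (median absolute value
    # divided by 0.6745), then the universal threshold sigma*sqrt(2 ln n).
    sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(n))

    details = [soft_threshold(d, t) for d in details]
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx
```

Varying `level`, or swapping the Haar filters for another wavelet family, is the kind of choice whose effect on de-noising performance the abstract says the study investigates. In practice a wavelet library would replace the hand-written transform.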


Related resources

A novel wavelet-based thresholding method for the pre-processing of mass spectrometry data that accounts for heterogeneous noise.

In recent years there has been an increased interest in using protein mass spectroscopy to discriminate diseased from healthy individuals with the aim of discovering molecular markers for disease. A crucial step before any statistical analysis is the pre-processing of the mass spectrometry data. Statistical results are typically strongly affected by the specific pre-processing techniques used. ...


Application of dual tree complex wavelet transform in tandem mass spectrometry

Mass Spectrometry (MS) is a widely used technique in molecular biology for high throughput identification and sequencing of peptides (and proteins). Tandem mass spectrometry (MS/MS) is a specialised mass spectrometry technique whereby the sequence of peptides can be determined. Preprocessing of the MS/MS data is indispensable before performing any statistical analysis on the data. In this work,...


A dynamic wavelet-based algorithm for pre-processing tandem mass spectrometry data

MOTIVATION: Mass spectrometry (MS)-based proteomics is one of the most commonly used research techniques for identifying and characterizing proteins in biological and medical research. The identification of a protein is the critical first step in elucidating its biological function. Successful protein identification depends on various interrelated factors, including effective analysis of MS data...


Peaks detection and alignment for mass spectrometry data

The goal of this paper is to review existing methods for protein mass spectrometry data analysis, and to present a new methodology for automatic extraction of significant peaks (biomarkers). For the pre-processing step required for data from MALDI-TOF or SELDI-TOF spectra, we use a purely nonparametric approach that combines stationary invariant wavelet transform for noise removal and penalized ...


Mathematical Tools and Statistical Techniques for Proteomic Data Mining

Proteomics is the study of and the search for information about proteins. The development of mass spectrometry (MS) such as matrix-assisted laser desorption ionization (MALDI) time-of-flight (TOF) MS and imaging mass spectrometry (IMS), greatly speeds up proteomics studies. At the same time, the MS and IMS applications in medical science give rise to many challenges in mathematics and statistic...



Journal:
  • Computers in biology and medicine

Volume 37, Issue 4

Publication date: 2007