The most informative spacing test effectively discovers biologically relevant outliers or multiple modes in expression

Authors

  • Iwona Pawlikowska
  • Gang Wu
  • Michael Edmonson
  • Zhifa Liu
  • Tanja Gruber
  • Jinghui Zhang
  • Stan Pounds

Abstract

SUMMARY: Several outlier and subgroup identification statistics (OASIS) have been proposed to discover transcriptomic features with outliers or multiple modes in expression that are indicative of distinct biological processes or subgroups. Here, we borrow ideas from the OASIS methods in the bioinformatics and statistics literature to develop the 'most informative spacing test' (MIST) for unsupervised detection of such transcriptomic features. In an example application involving 14 cases of pediatric acute megakaryoblastic leukemia, MIST more robustly identified features that perfectly discriminate subjects according to gender or the presence of a prognostically relevant fusion gene than did seven other OASIS methods in the analysis of RNA-seq exon expression, RNA-seq exon junction expression and microarray exon expression data. MIST was also effective at identifying features related to gender or molecular subtype in an example application involving 157 adult cases of acute myeloid leukemia.

AVAILABILITY: MIST will be freely available in the OASIS R package at http://www.stjuderesearch.org/site/depts/biostats

CONTACT: [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Related resources

A New Approximation for the Null Distribution of the Likelihood Ratio Test Statistics for k Outliers in a Normal Sample

Usually when performing a statistical test or estimation procedure, we assume the data are all observations of i.i.d. random variables, often from a normal distribution. Sometimes, however, we notice in a sample one or more observations that stand out from the crowd. These observation(s) are commonly called outlier(s). Outlier tests are more formal procedures which have been developed for detec...

The Effects of Presenting Multiple-Choice Test Items in Oral and Written Modes and Item Types on Advanced EFL Learners’ Listening Comprehension and Perception

This quasi-experimental study aimed to compare the effect of different modes and item types of multiple-choice (MC) test items on advanced EFL learners’ listening comprehension and perception. To this end, 80 advanced EFL learners, aged 18 to 30, were selected. The participants took a listening test including dialogue-completion and question-and-answer multiple-choice items presented in writte...

Exploring Classifiability Metrics for Selecting Informative Genes

Microarray experiments are emerging as one of the main driving forces in modern biology. By allowing the simultaneous monitoring of the expression of the entire genome for a given organism, array experiments provide tremendous insight into the fundamental biological processes that translate genetic information. One of the major challenges is to identify computationally efficient and biologicall...

Detecting outliers in non-redundant diffraction data.

Outliers are observations which are very unlikely to be correct, as judged by independent observations or other prior information. Such unexpected observations are treated, effectively, as being more informative about possible models, so they can seriously impede the course of structure determination and refinement. The best way to detect and eliminate outliers is to collect highly redundant da...

Identification of Alzheimer disease-relevant genes using a novel hybrid method

Identifying genes underlying complex diseases/traits that generally involve multiple etiological mechanisms and contributing genes is difficult. Although microarray technology has enabled researchers to investigate gene expression changes, identifying pathobiologically relevant genes remains a challenge. To address this challenge, we apply a new method for selecting the disease-relevant gen...

Journal:
  • Bioinformatics

Volume 30, Issue 10

Pages: -

Publication date: 2014