A survey on data integration for multi-omics sample clustering

Abstract

Due to the current high availability of omics data, data-driven biology has greatly expanded, and several papers have reviewed state-of-the-art technologies. Nowadays, two main types of investigation are available for a multi-omics dataset: extraction of relevant features for a meaningful biological interpretation, and clustering of the samples. In the latter case, a few reviews refer to some outdated or no longer available methods, whereas others lack a description of the metrics used to compare approaches. This work provides a general overview of the major techniques in this area, divided into four groups: graph, dimensionality reduction, statistical, and neural-based. Besides, eight tools have been tested on both a synthetic and a real dataset. An extensive performance comparison is provided using the following evaluation scores: Peak Signal-to-Noise Ratio (PSNR), Davies-Bouldin (DB) index, Silhouette value, and the harmonic mean of cluster purity and efficiency. The best results were obtained by [...], either explicitly or implicitly, as in the neural architecture.
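Two of the evaluation scores named in the abstract, the Silhouette value and the Davies-Bouldin index, can be illustrated with a minimal sketch. It follows the standard textbook definitions with Euclidean distance and is not taken from the survey; the function names (`silhouette`, `davies_bouldin`) and the toy data are hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict


def _group(points, labels):
    """Collect points per cluster label."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def silhouette(points, labels):
    """Mean silhouette value over all samples (higher is better, range [-1, 1])."""
    clusters = _group(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def davies_bouldin(points, labels):
    """Davies-Bouldin index (lower is better, minimum 0)."""
    clusters = _group(points, labels)
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    # scatter: mean distance of a cluster's members to its centroid
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


if __name__ == "__main__":
    # Toy data (assumption): two well-separated groups of three samples each.
    samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    good = [0, 0, 0, 1, 1, 1]   # labels that match the two groups
    mixed = [0, 1, 0, 1, 0, 1]  # labels that cut across the groups
    for name, labels in (("good", good), ("mixed", mixed)):
        print(name, silhouette(samples, labels), davies_bouldin(samples, labels))
```

Both functions need at least two clusters. A labeling that matches the group structure should give a higher Silhouette value and a lower Davies-Bouldin index than one that cuts across it, which is why the two scores are often read together.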


Similar resources

Multi-Omics Data Integration: A Modular Approach

A substantial amount of research has been carried out over the past few decades to understand the molecular and genetic information within the cell, which helps improve the methodologies used in agricultural or medical research. However, the research is often focused on one particular area of expertise by an individual researcher or a work group, and very few attempts were mad...

A Concise Review on Multi-Omics Data Integration for Terroir Analysis in Vitis vinifera

Vitis vinifera (grapevine) is one of the most important fruit crops, both for fresh consumption and for wine and spirit production. The term terroir is frequently used in viticulture and the wine industry to relate a wine's sensory attributes to its geographic origin. Although it can be cultivated in a wide range of environments, differences in growing conditions have a significant impact on fruit tra...

Machine learning methods for omics data integration

A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.

Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researche...

A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data

The fuzzy c-means clustering algorithm is a useful tool for clustering, but it is suitable only for crisp, complete data. In this article, an enhancement of the algorithm is proposed that is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...

Journal

Journal title: Neurocomputing

Year: 2022

ISSN: 0925-2312, 1872-8286

DOI: https://doi.org/10.1016/j.neucom.2021.11.094