A survey on data integration for multi-omics sample clustering
Abstract
Due to the current high availability of omics data, data-driven biology has greatly expanded, and several papers have reviewed state-of-the-art technologies. Nowadays, two main types of investigation are available for a multi-omics dataset: extraction of relevant features for a meaningful biological interpretation, and clustering of the samples. In the latter case, a few reviews refer to some outdated or no longer available methods, whereas others lack a description of the metrics needed to compare the approaches. This work provides a general overview of the major techniques in this area, divided into four groups: graph, dimensionality reduction, statistical and neural-based. In addition, eight tools have been tested on both a synthetic and a real dataset. An extensive performance comparison is provided using four evaluation scores: Peak Signal-to-Noise Ratio (PSNR), Davies-Bouldin (DB) index, Silhouette value and the harmonic mean of cluster purity and efficiency. The best results were obtained by using dimensionality reduction, either explicitly or implicitly, as in the neural architecture.
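As an illustration of one of the scores named in the abstract, the sketch below computes the Davies-Bouldin index in plain Python. The abstract does not give the formula, so this follows the standard textbook definition (mean, over clusters, of the worst ratio of summed within-cluster scatter to centroid distance) with Euclidean distance; the function names and the toy data are hypothetical and are not taken from the surveyed tools.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(samples, labels):
    """Davies-Bouldin index (standard definition, Euclidean distance).

    Lower values indicate tighter and better separated clusters.
    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for point, label in zip(samples, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    names = list(clusters)
    centers = {c: centroid(clusters[c]) for c in names}
    # Scatter: mean distance from each cluster member to its centroid.
    scatter = {
        c: sum(math.dist(p, centers[c]) for p in clusters[c]) / len(clusters[c])
        for c in names
    }
    # For each cluster, keep the worst (largest) similarity to any other cluster.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in names
            if j != i
        )
        for i in names
    ]
    return sum(worst) / len(worst)


# Hypothetical toy data: two well separated pairs of points.
toy_samples = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # mixes points from both sides

print(davies_bouldin(toy_samples, good_labels))  # 0.2
print(davies_bouldin(toy_samples, bad_labels))   # 5.0
```

On this toy example the sensible partition scores far lower than the mixed one, which is the direction in which the index is read when comparing clustering results.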
Similar resources
Multi-Omics Data Integration: A Modular Approach
A substantial amount of research has been carried out over the past few decades to understand the molecular and genetic-level information within the cell that helps improve methodologies in agricultural or medical research. However, the research is often focused on one particular area of expertise by an individual researcher or a work group, and very few attempts were mad...
A Concise Review on Multi-Omics Data Integration for Terroir Analysis in Vitis vinifera
Vitis vinifera (grapevine) is one of the most important fruit crops, both for fresh consumption and for wine and spirit production. The term terroir is frequently used in viticulture and the wine industry to relate a wine's sensory attributes to its geographic origin. Although it can be cultivated in a wide range of environments, differences in growing conditions have a significant impact on fruit tra...
Machine learning methods for omics data integration
A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.
Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researche...
A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data
The fuzzy c-means clustering algorithm is a useful tool for clustering; but it is convenient only for crisp complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...
Journal
Journal title: Neurocomputing
Year: 2022
ISSN: 0925-2312, 1872-8286
DOI: https://doi.org/10.1016/j.neucom.2021.11.094