Nonparametric clustering approach towards big data
Authors
Abstract
Clustering in bioinformatics is a fundamental process involving computational issues that are far from being resolved. In our work, we propose a new approach to this problem and show preliminary comparisons to current leading methods in the field.
Similar resources
Application of clustering methods in DNA microarrays
Background: DNA microarray technology has paved the way for investigators to measure the expression of thousands of genes in a short time. Analysis of this large amount of raw data includes normalization, clustering and classification. The present study surveys the application of clustering techniques in DNA microarray analysis. Materials and methods: We analyzed data from the study by Van’t Veer et al. dealing with BRCA1...
Towards Optimal Execution of Density-based Clustering on Heterogeneous Hardware
Data clustering is an important and highly utilized data mining technique in various application domains. With ever-increasing data volumes in the era of big data, the efficient execution of clustering algorithms is a fundamental prerequisite to gain understanding and acquire novel, previously unknown knowledge from data. To establish an efficient execution, the clustering algorithms have to be...
A model-based clustering method to detect infectious disease transmission outbreaks from sequence variation
Clustering infections by genetic similarity is a popular technique for identifying potential outbreaks of infectious disease, in part because sequences are now routinely collected for clinical management of many infections. A diverse range of nonparametric clustering methods has been developed for this purpose. These methods are generally intuitive, rapid to compute, and readily scale with la...
An Ensemble Clustering for Mining High-dimensional Biological Big Data
Clustering of high-dimensional biological big data is an incredibly difficult and challenging task, as the data space is often too big and too messy. Conventional clustering methods can be inefficient and ineffective on high-dimensional biological big data, because traditional distance measures may be dominated by the noise in many dimensions. An additional challenge in biological big data is ...
Big Trajectory Data Analysis for Clustering and Anomaly Detection
We have been developing a sensor that can acquire positional data. Creating position-based big data has recently become an easy task, and trajectory analysis is the highest priority for "position-based services". Traffic congestion, marketing mining, and pattern analysis are some examples from the trajectory analysis field. In this paper, we propose the trajectory analysis approach for clustering and a...