A Scoring Method for the Clustering of Nucleic Acid Sequences
Authors
Abstract
The clustering of biological sequence data is a significant task for biologists, because it helps molecular biologists group sequences according to the ancestral traits or hereditary information hidden in them. Several clustering algorithms and similarity and distance measures have been proposed for similarity detection and clustering. As observed in the course of this study, most of these algorithms and similarity measures show some form of inefficiency when detecting sequences based on their structural similarity. In this paper, the codon-based scoring method (COBASM) is developed to handle this inefficiency. COBASM applies the codon principle, using triplet nucleotides, to the clustering of nucleic acid sequences. The results obtained show that COBASM is able to produce compact and well-separated clusters based on the structural similarity of sequences.
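The abstract does not specify how COBASM computes its scores, so the following is only a minimal sketch of the general idea of comparing sequences by their triplet (codon) content. The function names `codons` and `codon_similarity` are hypothetical, and the cosine similarity over codon counts is a stand-in measure chosen for illustration, not the COBASM scoring scheme.

```python
from collections import Counter
from math import sqrt


def codons(seq: str) -> list[str]:
    """Split a sequence into non-overlapping triplets, dropping any trailing 1-2 bases."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # length rounded down to a multiple of 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_similarity(a: str, b: str) -> float:
    """Cosine similarity between codon-count vectors (illustrative, not COBASM itself)."""
    ca, cb = Counter(codons(a)), Counter(codons(b))
    # Only codons present in both sequences contribute to the dot product.
    dot = sum(ca[k] * cb[k] for k in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0
```

A pairwise matrix of such scores could then be passed to any standard clustering routine; which clustering algorithm the paper pairs with COBASM is not stated in the abstract.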
Similar resources
Investigating the Particle Swarm Optimization Clustering Method on Nucleic Acid Sequences
Particle swarm optimization (PSO) has been employed on several optimization problems, including the clustering problem. PSO has also been employed in the clustering of data of different structure and dimensionality. In this paper it is employed in the clustering of nucleic acid sequences. The application of clustering, as a statistical tool, in the analysis of data of varied complexity has been...
Phylogenetic and sequence analysis of the growth hormone gene of two sturgeons, Huso huso and Acipenser Gueldenstaedtii
In this study, the growth hormone cDNAs (cGH) of the Beluga sturgeon (Huso huso) and the Russian sturgeon (Acipenser gueldenstaedtii) were cloned and sequenced, and phylogenetic relationships were examined using nucleic acid and amino acid sequences. The nucleotide sequence of the Beluga GH has an open reading frame of 645 nucleotides encoding a protein of 214 amino acid residues. The signal peptide cleav...
Signal processing approaches as novel tools for the clustering of N-acetyl-β-D-glucosaminidases
Nowadays, the clustering of proteins, and of enzymes in particular, is one of the most popular topics in bioinformatics. An increasing number of chitinase genes from different organisms, and their sequences, have been identified. So far, various mathematical algorithms have been used for the clustering of chitinase genes, but most of them seem to be confusing and sometimes insufficient. In the...
Designing a Label Free Aptasensor for Detection of Methamphetamine
A label-free electrochemical nucleic acid aptasensor for the detection of methamphetamine (MA) by the immobilization of thiolated self-assembled DNA sequences on a gold nanoparticles-chitosan modified electrode is constructed. When MA was complexed specifically to the aptamer, the configuration of the nucleic acid aptamer switched to a locked structure and the interface of the biosensor changed...
Optimization of the Analysis of Almond DNA Simple Sequence Repeats (SSRs) Through Submarine Electrophoresis Using Different Agaroses and Staining Protocols
Simple sequence repeats (SSR markers or microsatellites), based on the specific PCR amplification of DNA sequences, are becoming the markers of choice for the molecular characterization of a wide range of plants because of their high polymorphism, abundance, and codominant inheritance. Different methods have been used for the analysis of the SSR amplified fragments being submarine agarose electropho...