Geometric Crossover for Supervised Motif Discovery
Abstract
Motif discovery is a general and important problem in bioinformatics, as motifs are often used to infer biologically important sites in biomolecular sequences. Many problems in bioinformatics are naturally cast in terms of sequences, and distance measures for sequences derived from edit distance are fundamental to the field. Geometric Crossover is a representation-independent definition of crossover based on a distance on the solution space. Using a distance measure that is tailored to the problem at hand allows the design of crossovers that embed problem knowledge in the search. In this paper we apply this theoretically motivated operator to motif discovery in protein sequences and report encouraging experimental results.
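To make the idea of a distance-based crossover concrete, the following is a minimal sketch and not the operator used in the paper. It assumes unit-cost edit distance (insertion, deletion, substitution), builds one optimal alignment of the two parent sequences, and forms a child by picking each alignment column from either parent. All function names (`edit_alignment`, `geometric_crossover`, `edit_distance`) are hypothetical and chosen for illustration. Under these assumptions every child lies on a shortest path between its parents, that is, d(p1, child) + d(child, p2) = d(p1, p2).

```python
import random


def edit_alignment(a, b):
    """Return (distance, columns) for one optimal unit-cost alignment.

    Each column is a pair (x, y); x or y is None where the alignment has a gap.
    """
    n, m = len(a), len(b)
    # dp[i][j] holds the edit distance between a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dp[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the columns.
    cols = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            cols.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            cols.append((a[i - 1], None))  # symbol of a aligned to a gap
            i -= 1
        else:
            cols.append((None, b[j - 1]))  # symbol of b aligned to a gap
            j -= 1
    cols.reverse()
    return dp[n][m], cols


def edit_distance(a, b):
    """Unit-cost edit distance between two strings."""
    return edit_alignment(a, b)[0]


def geometric_crossover(p1, p2, rng=random):
    """Recombine two strings column by column along an optimal alignment."""
    _, cols = edit_alignment(p1, p2)
    child = []
    for x, y in cols:
        pick = x if rng.random() < 0.5 else y
        if pick is not None:  # a gap contributes no symbol to the child
            child.append(pick)
    return "".join(child)


if __name__ == "__main__":
    parent1, parent2 = "HEAGAWGHEE", "PAWHEAE"
    child = geometric_crossover(parent1, parent2, random.Random(0))
    total = edit_distance(parent1, child) + edit_distance(child, parent2)
    print(child, total == edit_distance(parent1, parent2))
```

Swapping in a different distance, for example one with substitution costs that reflect amino-acid similarity, would change which children are reachable. This is how a tailored distance can embed problem knowledge in the search.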
Similar resources
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
Multidimensional Motif Discovery in Physiological and Biomedical Time Series Data
Providing personalized diagnosis and therapy requires monitoring patient activity using various body sensors. Sensor data generated during personalized exercises or tasks may be too specific or inadequate to be reviewed and evaluated using supervised methods such as classification. We propose multidimensional time series motif discovery as a means for patient activity monitoring, since such mot...
A tree-based approach for motif discovery and sequence classification
MOTIVATION: Pattern discovery algorithms are widely used for the analysis of DNA and protein sequences. Most algorithms have been designed to find overrepresented motifs in sparse datasets of long sequences, and ignore most positional information. We introduce an algorithm optimized to exploit spatial information in sparse-but-populous datasets. RESULTS: Our algorithm Tree-based Weighted-Positi...
Training Set Design for Pattern Discovery with Applications to Protein Motif Detection
Supervised pattern discovery techniques have been successfully used for motif detection. However, this requires the use of an efficient training set. Even in cases where a lot of examples are known, using all the available examples can introduce bias during the training process. In practice, this is done with the help of domain experts. Whenever such expertise is not available, training sets ar...