Protein sequence classification using feature hashing
Similar resources
Protein Sequence Classification Using Feature Selection
The sheer volume of data today and its expected growth over the next years are some of the key challenges in data mining and knowledge discovery applications. Besides the huge number of data samples that are collected and processed, the high dimensional nature of data arising in many applications causes the need to develop effective and efficient techniques that are able to deal with this massi...
K-means-based Feature Learning for Protein Sequence Classification
Protein sequence classification has been a major challenge in bioinformatics and related fields for some time and remains so today. Due to the complexity and volume of protein data, algorithmic techniques such as sequence alignment are often unsuitable due to time and memory constraints. Heuristic methods based on machine learning are the dominant technique for classifying large sets of protein...
Locality-Sensitive Hashing for Protein Classification
Determination of sequence similarity is a central issue in computational biology, a problem addressed primarily through BLAST, an alignment based heuristic which has underpinned much of the analysis and annotation of the genomic era. Despite their success, alignment-based approaches scale poorly with increasing data set size, and are not robust under structural sequence rearrangements. Successi...
Hashing Based Hierarchical Feature Representation for Hyperspectral Imagery Classification
Integrating spectral and spatial information is proved effective in improving the accuracy of hyperspectral imagery classification. In recent studies, two kinds of approaches are widely investigated: (1) developing a multiple feature fusion (MFF) strategy; and (2) designing a powerful spectral-spatial feature extraction (FE) algorithm. In this paper, we combine the advantages of MFF and FE, and...
Sequence-Based Classification Using Discriminatory Motif Feature Selection
Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of le...
Journal
Journal title: Proteome Science
Year: 2012
ISSN: 1477-5956
DOI: 10.1186/1477-5956-10-s1-s14