khmer release v2.1: software for biological sequence analysis
Similar resources
The khmer software package: enabling efficient nucleotide sequence analysis
The khmer package is a freely available software library for working efficiently with fixed length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, De Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python, and is freely available under the BSD license at h...
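The idea of probabilistic k-mer counting mentioned above can be illustrated with a small Count-Min Sketch. This is a minimal sketch in plain Python, not khmer's own implementation or API: the names `kmers` and `CountMinSketch`, the table sizes, and the choice of hash function are all assumptions made for illustration.

```python
import hashlib


def kmers(seq, k):
    """Yield every length-k word of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class CountMinSketch:
    """Illustrative probabilistic counter (hypothetical, not khmer's class).

    Counts are stored in `depth` fixed-size tables. A reported count can be
    too high because of hash collisions, but it is never too low.
    """

    def __init__(self, width=1000, depth=4):
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, kmer):
        # One salted hash per row, so rows collide independently.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, kmer):
        for row, col in self._slots(kmer):
            self.tables[row][col] += 1

    def count(self, kmer):
        # The smallest cell is the one least inflated by collisions.
        return min(self.tables[row][col] for row, col in self._slots(kmer))


sketch = CountMinSketch()
for word in kmers("ACGTACGT", 4):
    sketch.add(word)
```

Memory use is fixed by `width` and `depth` rather than by the number of distinct k-mers, which is the trade-off that makes this kind of structure attractive for large sequencing data sets.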
Sequence Complexity for Biological Sequence Analysis
A new statistical model for DNA considers a sequence to be a mixture of regions with little structure and regions that are approximate repeats of other subsequences, i.e. instances of repeats do not need to match each other exactly. Both forward- and reverse-complementary repeats are allowed. The model has a small number of parameters which are fitted to the data. In general there are many expl...
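To make the notion of an approximate reverse-complementary repeat concrete, here is a minimal Python sketch. It is not the statistical model described in the abstract: the function names and the simple mismatch threshold are assumptions chosen only to show what "reverse-complementary" and "approximate" mean for two equal-length subsequences.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_approx_rc_repeat(a, b, max_mismatches=1):
    """True if b matches the reverse complement of a, allowing a few
    mismatches (hypothetical threshold, for illustration only)."""
    return mismatches(reverse_complement(a), b) <= max_mismatches
```

For example, `reverse_complement("AACG")` is `"CGTT"`, so `"CGTA"` counts as an approximate reverse-complementary repeat of `"AACG"` when one mismatch is allowed, but not when none are.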
Biological sequence analysis
This talk will review a little over a decade’s research on applying certain stochastic models to biological sequence analysis. The models themselves have a longer history, going back over 30 years, although many novel variants have arisen since that time. The function of the models in biological sequence analysis is to summarize the information concerning what is known as a motif or a domain in...
Biological Sequence Analysis
Background: The schematic for every living organism is stored in long molecules known as chromosomes, made of a substance known as DNA (deoxyribonucleic acid). Each cell in an organism has a complete copy of its DNA, also known as its genome, which is conveniently modeled as a sequence of symbols (alternately referred to as nucleotides or bases) in the DNA alphabet {A,C,T,G}. In humans, and most mam...
Unified Gibbs Method for Biological Sequence Analysis
The biotechnology revolution stems from rapid advances in the biological sciences. One important product of these advances is a large and rapidly growing database of biopolymer (DNA, RNA, and protein) sequences, which has attracted much attention from researchers in different fields. The great majority of the techniques generated for studying these data have been designed to analyze a single sequ...
Journal
Journal title: The Journal of Open Source Software
Year: 2017
ISSN: 2475-9066
DOI: 10.21105/joss.00272