Optimizing nucleotide sequence ensembles for combinatorial protein libraries using a genetic algorithm
Authors
Abstract
Protein libraries are essential to the field of protein engineering. Increasingly, probabilistic protein design is being used to synthesize combinatorial protein libraries, which allow the protein engineer to explore a vast space of amino acid sequences while placing restrictions on the amino acid distributions. If site-specific amino acid probabilities are given as the target, the codon nucleotide distributions that match this target can be used to generate a partially randomized gene library. However, finding codon nucleotide distributions that exactly match a given target distribution of amino acids is a highly nontrivial computational task. We first showed that, for a given target distribution, an exact solution may not exist at all. Formulating the task as a constrained optimization problem, we then developed a genetic algorithm-based approach to find codon nucleotide distributions that match the target amino acid distribution as closely as possible. Compared with the previous gradient descent method on various objective functions, the new method consistently gave better-optimized distributions, as measured by the relative entropy between the calculated and the target distributions. To simulate the actual lab solutions, new objective functions were designed that allow two separate sets of codons to be used in seeking a better match to the target amino acid distribution.
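The mapping from codon nucleotide distributions to an amino acid distribution, and the relative entropy used to score a match, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the standard genetic code, independent nucleotide choices at the three codon positions, and the direction D(target || calculated) for the relative entropy, which the abstract does not specify. The function names `amino_acid_distribution` and `relative_entropy` are hypothetical, and the stop codon is simply treated as a 21st symbol `*`.

```python
import math
from itertools import product

# Standard genetic code in the conventional TCAG ordering (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acid_distribution(pos_probs):
    """Amino acid probabilities implied by per-position nucleotide probabilities.

    pos_probs is a sequence of three dicts mapping a base to its probability
    at codon positions 1, 2 and 3 (assumed independent; an assumption here).
    """
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for position, base in enumerate(codon):
            p *= pos_probs[position].get(base, 0.0)
        dist[aa] = dist.get(aa, 0.0) + p
    return dist


def relative_entropy(target, calculated, eps=1e-12):
    """D(target || calculated) in nats; eps guards against log of zero (assumed choice)."""
    total = 0.0
    for aa, t in target.items():
        if t > 0.0:
            total += t * math.log(t / max(calculated.get(aa, 0.0), eps))
    return total


# Example: the common NNK scheme (N = any base, K = G or T).
uniform = {b: 0.25 for b in BASES}
nnk = [uniform, uniform, {"G": 0.5, "T": 0.5}]
nnk_dist = amino_acid_distribution(nnk)

# A uniform target over the 20 amino acids, with no stop codons wanted.
target = {aa: 1.0 / 20 for aa in set(AMINO) if aa != "*"}
score = relative_entropy(target, nnk_dist)
```

An optimizer such as a genetic algorithm would search over the twelve nucleotide probabilities in `pos_probs` (each position summing to one) to minimize a score of this kind.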
Similar resources
Study on Genetic Diversity of Terminal Fragment Sequence of Isolated Persian Tobacco Mosaic Virus
Tobacco mosaic virus (TMV) is one of the devastating plant viruses in the world that infects more than 200 plant species. Movement protein plays a supportive role in the movement of other plant viruses, and viral coat protein is highly expressed in infected plants and affects replication and movements of TMV. In order to investigate genetic variation in the terminal fragment sequence in Iranian...
Optimizing image steganography by combining the GA and ICA
In this study, a novel approach that combines steganography and cryptography to hide information in digital images as host media is proposed. In the process, secret data is first encrypted using the mono-alphabetic substitution cipher method, and then the encrypted secret data is embedded inside an image using an algorithm which combines the random patterns based on Space Fillin...
A Robust and Versatile Method of Combinatorial Chemical Synthesis of Gene Libraries via Hierarchical Assembly of Partially Randomized Modules
A major challenge in gene library generation is to guarantee a large functional size and diversity that significantly increases the chances of selecting different functional protein variants. The use of trinucleotide mixtures for controlled randomization results in superior library diversity and offers the ability to specify the type and distribution of the amino acids at each position. Here w...
Multi-line split DNA synthesis: a novel combinatorial method to make high quality peptide libraries
BACKGROUND: We developed a method to make various high-quality random peptide libraries for evolutionary protein engineering based on combinatorial DNA synthesis. RESULTS: A split synthesis in codon units was performed with mixtures of bases optimally designed by using a Genetic Algorithm program. It required only standard DNA synthetic reagents and standard DNA synthesizers in three lines....
An optimization technique for vendor selection with quantity discounts using Genetic Algorithm
Vendor selection decisions are complicated by the fact that various conflicting multi-objective factors must be considered in the decision-making process. The problem of vendor selection becomes still more complicated with the inclusion of an incremental discount pricing schedule. Such hard combinatorial problems, when solved using metaheuristics, produce near-optimal solutions. This paper propose...