A Block Coding Method that Leads to Significantly Lower Entropy Values for the Proteins and Coding Sections of Haemophilus influenzae
Abstract
A simple statistical block code, combined with the Lempel-Ziv-based compression utilities gzip and compress, has been found to significantly increase the level of compression achievable for the proteins encoded in Haemophilus influenzae, the first fully sequenced genome. The method yields an entropy value of 3.665 bits per symbol (bps), which is 0.657 bps below the maximum of 4.322 bps and an improvement of 0.452 bps over the best previously known value of 4.118 bps, obtained with Matsumoto, Sadakane, and Imai's lza-CTW algorithm. Calculations based on a compact inverse genetic code show that the genome has a maximum entropy of 1.757 bps for the coding regions, with a possibly lower actual entropy. These results hint at hitherto unexplored redundancies that do not show up in Markov models, and they indicate more internal structure than previously suspected in both the proteins and the genome.
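The figures in the abstract can be related to one another with a short calculation. The maximum of 4.322 bps corresponds to log2(20), the entropy of a uniform distribution over the 20 standard amino acids. The following Python sketch reproduces that bound and shows one common way to estimate bits per symbol from a compressor's output size. It is an illustration only, not the authors' block code: the function names are hypothetical, and `zlib` (DEFLATE, the same algorithm family used by gzip) stands in for the gzip and compress utilities.

```python
import math
import zlib
from collections import Counter

# The 20 standard amino acid one-letter codes (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def max_entropy_bps(alphabet_size):
    """Entropy in bits per symbol of a uniform source over the alphabet."""
    return math.log2(alphabet_size)


def empirical_entropy_bps(seq):
    """Zeroth-order (single-symbol frequency) entropy estimate of a sequence."""
    counts = Counter(seq)
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def compressed_bps(seq, level=9):
    """Upper-bound estimate: compressed size in bits divided by symbol count.

    Uses zlib as a stand-in for gzip; this is NOT the paper's block code.
    """
    data = seq.encode("ascii")
    return 8 * len(zlib.compress(data, level)) / len(seq)


upper = max_entropy_bps(len(AMINO_ACIDS))
print(round(upper, 3))          # 4.322
print(round(upper - 3.665, 3))  # 0.657, the gap reported in the abstract
```

A compressor-based estimate such as `compressed_bps` only bounds the entropy from above, which is why a better coding scheme can lower the reported value.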
Similar resources
A block coding method that leads to significantly lower entropy values for the proteins of Haemophilus Influenzae and its coding sections
A simple statistical block code in combination with the LZW-based compression utilities gzip and compress has been found to increase by a significant amount the level of compression possible for the proteins encoded in Haemophilus Influenzae (hi), the first fully sequenced genome. The method yields an entropy of 3.665 bits per symbol (bps), which is 0.657 bps below the maximum of 4.322 bps. Thi...
Improvement of Large-scale PRP production by Haemophilus influenzae type b, using modified CY medium
Background and Objective: Haemophilus influenzae type b (Hib) is a gram-negative bacterium and one of the most common causative agents of acute meningitis in infants and children under 5 years old worldwide. The production of Hib capsular polysaccharide polyribosylribitol phosphate (PRP) is important for the production of conjugate vaccines against Hib infections. The aim of this study is t...
Vaccine Candidates against Nontypeable Haemophilus influenzae: a Review
Nonencapsulated, nontypeable Haemophilus influenzae (NTHi) remains an important cause of acute otitis and respiratory diseases in children and adults. NTHi bacteria are one of the major causes of respiratory tract infections, including acute otitis media, cystic fibrosis, and community-acquired pneumonia among children, especially in developing countries. The bacteria can also cause chronic dise...
Cloning of conserved regions of nontypeable Haemophilus influenzae hmw1 core binding domain
Colonization of the nasopharynx by nontypeable Haemophilus influenzae (NTHi) causes respiratory tract disease. In 80% of clinical isolates, HMW proteins are the major adhesins and induce protective antibodies in the hosts. Therefore, they can be used as vaccine candidates. The aim of this study is to design and clone the conserved regions of the NTHi hmw1 core binding domain. In this study, the sta...
A Fast Block Size Decision For Intra Coding in HEVC Standard
Intra coding in High Efficiency Video Coding (HEVC) can significantly improve compression efficiency by using 35 intra-prediction modes for 2N×2N (N is an integer number ranging from six to two) luma blocks. To find the luma block with the minimum rate-distortion cost, the encoder must perform 11932 different rate-distortion cost calculations. Although this approach improves coding efficiency compared to th...
Journal title:
- Proceedings. IEEE Computer Society Bioinformatics Conference
Volume 2
Publication date: 2003