Op-cbio130362 1938..1945

Authors

  • Zhi-Zhong Chen
  • Fei Deng
  • Lusheng Wang
Abstract

Motivation: Haplotypes play a crucial role in genetic analysis and have many applications, such as gene disease diagnosis, association studies and ancestry inference. Advances in DNA sequencing technologies make it possible to obtain haplotypes from a set of aligned reads originating from both copies of a chromosome of a single individual. This approach is often known as haplotype assembly. Exact algorithms that give optimal solutions to the haplotype assembly problem are in high demand. Unfortunately, previous algorithms for this problem either fail to output optimal solutions or take too long, even when executed on a PC cluster.

Results: We develop an approach to finding optimal solutions for the haplotype assembly problem under the minimum-error-correction (MEC) model. Most previous approaches assume that the columns in the input matrix correspond to (putative) heterozygous sites. This all-heterozygous assumption is correct for most columns, but it may be incorrect for a small number of columns. In this article, we consider the MEC model both with and without the all-heterozygous assumption. In our approach, we first use new methods to decompose the input read matrix into small independent blocks. We then model the problem for each block as an integer linear programming problem and solve it with an integer linear programming solver. We have tested our program on a single PC [a Linux (x64) desktop PC with an i7-3960X CPU], using the filtered HuRef and NA12878 datasets (after applying some variant calling methods). With the all-heterozygous assumption, our approach can optimally solve the whole HuRef dataset within a total time of 31 h (26 h for the most difficult block of the 15th chromosome and only 5 h for the other blocks). To our knowledge, this is the first time that optimal MEC solutions have been obtained for the entire filtered HuRef dataset.
Moreover, in the general case (without the all-heterozygous assumption), for the HuRef dataset our approach can optimally solve all the chromosomes except the most difficult block in chromosome 15 within a total time of 12 days. For both the HuRef and NA12878 datasets, the optimal costs in the general case are sometimes much smaller than those in the all-heterozygous case. This implies that some columns in the input matrix (after applying certain variant calling methods) still correspond to false-
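
The gap between the two MEC variants can be illustrated on a toy block. The sketch below is not the authors' integer linear programming formulation: it is a brute-force enumeration, feasible only for a handful of columns. It assumes reads are strings over "0", "1" and "-" (gap), and all function names are hypothetical. It counts the minimum number of read entries that must be corrected so that every read agrees with one of two haplotypes, first forcing the haplotypes to be complementary (all-heterozygous) and then allowing any pair (general case).

```python
from itertools import product


def read_cost(read, hap):
    """Count non-gap entries of `read` that disagree with haplotype `hap`."""
    return sum(1 for r, h in zip(read, hap) if r != "-" and r != h)


def mec_cost(reads, h1, h2):
    """MEC cost: each read is charged against its closer haplotype."""
    return sum(min(read_cost(r, h1), read_cost(r, h2)) for r in reads)


def complement(hap):
    """Flip every allele; under the all-heterozygous assumption h2 = complement(h1)."""
    return "".join("1" if c == "0" else "0" for c in hap)


def brute_force_mec(reads, all_heterozygous=True):
    """Enumerate haplotype pairs and return (cost, h1, h2) with minimum cost.

    Exponential in the number of columns, so this is for tiny examples only.
    """
    m = len(reads[0])
    best = None
    for bits in product("01", repeat=m):
        h1 = "".join(bits)
        if all_heterozygous:
            candidates = [complement(h1)]
        else:
            # General case: h2 may agree with h1 at homozygous columns.
            candidates = ("".join(b) for b in product("01", repeat=m))
        for h2 in candidates:
            cost = mec_cost(reads, h1, h2)
            if best is None or cost < best[0]:
                best = (cost, h1, h2)
    return best


# Toy block (made-up data): the third column is 0 in every read,
# i.e. it behaves like a homozygous site.
reads = ["010", "010", "100", "100"]
print(brute_force_mec(reads, all_heterozygous=True)[0])   # 2
print(brute_force_mec(reads, all_heterozygous=False)[0])  # 0
```

In this toy block the third column never varies, so forcing the two haplotypes to differ there costs two corrections, while the general model explains all reads with none. This is the kind of gap the abstract reports between the general and all-heterozygous optimal costs.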

Publication date: 2013