The Simplified Partial Digest Problem: Hardness and a Probabilistic Analysis

Authors

  • Zoë Abrams
  • Ho-Lin Chen
Abstract

Introduction

We study the problem of genome mapping using restriction site analysis. In restriction site analysis, an enzyme cuts a target DNA strand into fragments, and these fragments are used to reconstruct the locations of the enzyme's restriction sites. Two common approaches are the Double Digest Problem and the Partial Digest Problem. The Double Digest Problem is known to be NP-complete [4]. The hardness of the Partial Digest Problem is unknown, and no polynomial-time algorithm for it is known [5][7].

Alternative approaches to restriction site analysis use primary fragments, which are DNA fragments with one endpoint at either the left or the right end of the target DNA strand. Blazewicz et al. [1][2] present a technique for finding primary fragment lengths called a short digestion, in which the reaction time is chosen so that every molecule is cut at most once. In [6], an alternative approach for finding primary fragment lengths is presented: before the digestion experiment, the endpoints of the molecule are labeled (one method is radio-labeling with radioactive phosphate). The method in [6] uses the primary fragments, in addition to the information used in the Partial Digest Problem approach, to find a unique reconstruction in polynomial time. However, the information from the Partial Digest Problem is susceptible to experimental errors caused by missing fragments, so it is still useful to develop techniques based on other types of information that are more robust against such errors.

The Simplified Partial Digest Problem, first proposed in [1], uses primary fragments and base fragments to locate restriction sites. The two endpoints of a base fragment are consecutive sites on the target DNA strand; base fragments can be obtained by exposing the strand to the enzyme until the digestion process is complete. We assume that the enzyme cuts at n restriction sites along a DNA strand of length D.
Simplified Partial Digest Problem (SPDP) Statement: Given $X_0 = 0$, $X_{n+1} = D$, a set of base fragments $\{X_i - X_{i-1}\}_{1 \le i \le n+1}$, and a set of primary fragments $\{(X_{n+1} - X_i) \cup X_i\}_{1 \le i \le n}$, reconstruct the original series $X_1, \ldots, X_n$, where $X_i$ is the distance from the leftmost end of the target DNA strand to the $i$-th restriction site along the strand.

In this paper, we show that the SPDP is NP-complete, which was an open problem in [3]. Therefore, if we desire efficient algorithms for this problem, we must relax our end criteria. We propose an efficient algorithm that in practice finds a solution for many instances of the SPDP.
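To make the problem statement concrete, the following Python sketch builds the base and primary fragment multisets from known site positions and then recovers the sites by exhaustive search. It is an illustration only, not the algorithm proposed in the paper: the function names `spdp_instance` and `solve_spdp_bruteforce` are hypothetical, fragment lengths are assumed to be exact integers with no measurement error, and the search takes exponential time.

```python
from collections import Counter
from itertools import permutations


def spdp_instance(sites, length):
    """Return (base, primary) fragment multisets for the given cut sites.

    Hypothetical helper: `sites` are the positions X_1..X_n and
    `length` is D, the length of the strand.
    """
    points = [0, *sorted(sites), length]
    # Base fragments: distances between consecutive sites (X_i - X_{i-1}).
    base = Counter(b - a for a, b in zip(points, points[1:]))
    primary = Counter()
    for x in points[1:-1]:
        primary[x] += 1           # fragment from the left end to the site
        primary[length - x] += 1  # fragment from the site to the right end
    return base, primary


def solve_spdp_bruteforce(base, primary, length):
    """Try every distinct ordering of the base fragments (exponential time)."""
    fragments = list(base.elements())
    if sum(fragments) != length:
        return []  # inconsistent input: base fragments must tile the strand
    solutions = set()
    for order in set(permutations(fragments)):
        # Prefix sums of the first n fragments give candidate site positions.
        sites, position = [], 0
        for fragment in order[:-1]:
            position += fragment
            sites.append(position)
        # Keep the ordering only if it reproduces the primary fragments.
        if spdp_instance(sites, length)[1] == primary:
            solutions.add(tuple(sites))
    return sorted(solutions)


if __name__ == "__main__":
    base, primary = spdp_instance([2, 5, 9], 12)
    print(solve_spdp_bruteforce(base, primary, 12))
```

For sites 2, 5, 9 on a strand of length 12, the sketch returns both (2, 5, 9) and its mirror image (3, 7, 10), because the two arrangements produce identical base and primary fragments.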

Related resources

Modeling of Partial Digest Problem as a Network flows problem

Restriction Site Mapping is one of the interesting tasks in Computational Biology. A DNA strand can be thought of as a string on the letters A, T, C, and G. When a particular restriction enzyme is added to a DNA solution, the DNA is cut at particular restriction sites. The goal of the restriction site mapping is to determine the location of every site for a given enzyme. In partial digest metho...

A Continuous Optimization Model for Partial Digest Problem

The purpose of this paper is to model the Partial Digest Problem (PDP) as a mathematical programming problem. In this paper we present a new viewpoint on the PDP. We formulate the PDP as a continuous optimization problem and develop a method to solve this problem. Finally, we construct a linear programming model for the problem with an additional constraint. This latter model can be solved by the simp...

Construction of DNA restriction maps based on a simplified experiment

MOTIVATION A formulation of a new problem of the restriction map construction based on a simplified digestion experiment and a development of an algorithm for solving both ideal and noisy data cases of the introduced problem. RESULTS A simplified partial digest problem and a branch and cut algorithm for finding the solution of the problem.

Combinatorial optimization in DNA mapping - a computational thread of the Simplified Partial Digest Problem

This paper presents the problem of genome mapping of DNA molecules. In particular, a new approach, the Simplified Partial Digest Problem (SPDP), is analyzed. Although this approach is easy to implement in the laboratory and robust with respect to measurement errors, when formulated as a combinatorial search problem it is proved to be strongly NP-hard for the general error-free ...

The Simplified Partial Digest Problem: Enumerative and Dynamic Programming Algorithms

We study the Simplified Partial Digest Problem (SPDP), which is a mathematical model for a new simplified partial digest method of genome mapping. This method is easy for laboratory implementation and robust with respect to experimental errors. SPDP is NP-hard in the strong sense. We present an $O(n 2^n)$ time enumerative algorithm (ENUM) and an $O(n^{2q})$ time dynamic programming algorithm for the er...

Publication date: 2005