Combinatorial Optimization Algorithms for Metabolic Networks Alignments and Their Applications

Authors

  • Qiong Cheng
  • Alex Zelikovsky
Abstract

The accumulation of high-throughput genomic and proteomic data allows for the reconstruction of large and complex metabolic networks. To analyze the accumulated data and the reconstructed networks, it is critical to identify network patterns and evolutionary relations between metabolic networks, but finding similar networks is computationally challenging. Based on gene duplication and function sharing in biological networks, a network alignment problem is formulated that asks for the optimal vertex-to-vertex mapping allowing path contraction, different types of vertex deletion, and vertex insertions. This paper presents fixed-parameter tractable combinatorial optimization algorithms that take into account both the similarity of the enzymes' functions and arbitrary network topologies. Results are evaluated by randomized P-value computation. The authors perform pairwise alignments of all pathways for four organisms and find a set of statistically significant pathway similarities. The network alignment is used to identify pathway holes that result from inconsistencies and missing enzymes. The authors propose a framework for filling pathway holes that includes database searches for missing enzymes and proteins with matching PROSITE patterns, followed by a search for potential candidates with high sequence similarity.

DOI: 10.4018/jkdb.2011010101. International Journal of Knowledge Discovery in Bioinformatics, 2(1), 1-23, January-March 2011. Copyright © 2011, IGI Global.

... for queries on pathway components such as substrates, products and reactions. However, computational tools are called for to transfer well-studied knowledge in databases to unknown cases (e.g., searching for homologues to a query pathway in a collection of known pathways) and to align two pathways in order to locate conserved pathway fragments.
Most high-throughput genomic and proteomic data are not catalogued or processed by hand but by computational technologies. As a result, high-throughput datasets may be filled with technical and biological noise, may have different technical biases and coverage (Han, 2008), or may reflect the use of inconsistent data or tool versions. All of these issues require data curation. However, not all data are curated. For BioCyc (2010), only a part of the pathway data has received person-decades of literature-based curation and is the most accurate. The logical interpretation of a whole network is not easily comprehensible to the human brain, so it is impossible to curate the ever-increasing data by hand alone. Computational tools are required for inferring the missing or inconsistent information in genome and pathway databases. All of these tasks require efficient computational methods for analyzing and comparing networks. In this paper, we focus on network alignment for comparing, exploring, and predicting these networks. Let the pattern be a pathway for which one is searching for homologous pathways in the text, i.e., the known metabolic network of a different species. Alignment of two networks, pattern and text, is a basic task that can address a series of open questions such as network evolution and critical target search. This is a challenging research topic from both the biological and the computational perspective. Existing approaches to subgraph iso- and homeomorphism restrict the size (Sharan et al., 2005; Yang & Sze, 2007) or topology of the pattern (Chen & Hofestaedt, 2004, 2005; Pinter, Rokhlenko, Yeger-Lotem, & Ziv-Ukelson, 2005; Cheng, Harrison, & Zelikovsky, 2007; Cheng, Kaur, Harrison, & Zelikovsky, 2007) or use heuristics and approximation algorithms.
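To make the pattern/text setting concrete, the sketch below searches exhaustively for the best-scoring mapping of pattern vertices to text vertices such that every pattern edge lands on a text edge, with several pattern vertices allowed to share one image. It is an illustration under assumptions that are not taken from the paper: the function names `ec_similarity` and `best_homomorphism`, the scoring by shared leading EC-number levels, and the toy graphs are all hypothetical. The search is exponential in the pattern size, and it leaves out path contraction, vertex deletion, and vertex insertion.

```python
from itertools import product


def ec_similarity(ec_a, ec_b):
    """Hypothetical score: how many leading EC levels two enzymes share (0 to 4)."""
    shared = 0
    for a, b in zip(ec_a.split("."), ec_b.split(".")):
        if a != b:
            break
        shared += 1
    return shared


def best_homomorphism(pattern_edges, pattern_ec, text_edges, text_ec):
    """Brute-force search over all vertex mappings from pattern to text.

    A mapping is accepted only if every directed pattern edge (u, v)
    is sent to a directed text edge (f(u), f(v)). The mapping need not
    be injective. Returns (score, mapping), or (None, None) if no
    mapping is valid.
    """
    text_edge_set = set(text_edges)
    p_nodes = sorted(pattern_ec)
    t_nodes = sorted(text_ec)
    best_score, best_map = None, None
    for images in product(t_nodes, repeat=len(p_nodes)):
        mapping = dict(zip(p_nodes, images))
        if not all((mapping[u], mapping[v]) in text_edge_set
                   for u, v in pattern_edges):
            continue
        score = sum(ec_similarity(pattern_ec[p], text_ec[mapping[p]])
                    for p in p_nodes)
        if best_score is None or score > best_score:
            best_score, best_map = score, mapping
    return best_score, best_map


# Toy data (made up): a branching pattern aligned to a single text edge.
pattern_edges = [("a", "b"), ("a", "c")]
pattern_ec = {"a": "1.1.1.1", "b": "2.7.1.1", "c": "2.7.1.2"}
text_edges = [("x", "y")]
text_ec = {"x": "1.1.1.1", "y": "2.7.1.1"}

print(best_homomorphism(pattern_edges, pattern_ec, text_edges, text_ec))
# (11, {'a': 'x', 'b': 'y', 'c': 'y'})
```

In the toy run, the two closely related pattern enzymes `b` and `c` are both mapped to the single text enzyme `y`, which is the many-to-one behaviour that a subgraph isomorphism would forbid.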
GraphMatch (Yang & Sze, 2007) allows deleting disassociated vertices or induced subnetworks in the query network and then aligns the remainder to the target network by subgraph isomorphism. However, the widespread evolutionary machinery of gene duplication results in vertex copying (Sharan & Ideker, 2006). The results of network alignments can be enhanced when gene duplication and divergence are taken into account. If two enzymes in the pattern species are evolutionarily related, the corresponding enzymes can be mapped to a single enzyme. This mapping exploits the properties of graph homomorphism. Based on this property, we have formulated the network alignment problem, which asks for the optimal vertex-to-vertex mapping allowing path contraction, vertex deletion, and vertex insertions. In this paper we present combinatorial algorithms that take into account both the similarity of the enzymes' functions and arbitrary network topologies, such as trees and arbitrary graphs. The proposed algorithm is fixed-parameter tractable in the linear or the square of the size of the feedback vertex set, for the cases of disallowing or allowing deletions, respectively. We have developed the web service tool MetNetAligner, which aligns metabolic networks. We evaluated our results by randomized P-value computation. In the computation, we followed two standard randomization procedures and further developed two other random graph generators that meet more stringent and consistent topology constraints. By comparing their distributions of significant alignment pairs, we observed that the more stringent the topology constraints of the random graph generator, the more pairs of significant alignments there are. We applied network alignments (Figure 1) to identifying pathway holes and proposed a framework to find potential candidates for filling the holes.
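A randomized P-value of this kind can be sketched as an empirical tail probability: score the real text, score many randomized texts, and report the fraction of randomized scores that reach the observed one. The code below is a minimal sketch under stated assumptions. The helper names (`empirical_p_value`, `shuffle_labels`, `randomized_p_value`), the add-one smoothing, and the choice of shuffling enzyme labels over a fixed topology are illustrative and are not claimed to match the paper's randomization procedures or its random graph generators.

```python
import random


def empirical_p_value(observed, null_scores):
    """Add-one smoothed fraction of null scores that are >= the observed score."""
    hits = sum(1 for s in null_scores if s >= observed)
    return (hits + 1) / (len(null_scores) + 1)


def shuffle_labels(labels, rng):
    """Permute enzyme labels across vertices; the vertex set (topology) is unchanged."""
    nodes = sorted(labels)
    values = [labels[n] for n in nodes]
    rng.shuffle(values)
    return dict(zip(nodes, values))


def randomized_p_value(score_fn, text_labels, trials=1000, seed=0):
    """Estimate a P-value for score_fn(text_labels) against label-shuffled texts.

    score_fn is any caller-supplied function (hypothetical here) that maps a
    {vertex: enzyme label} dict to an alignment score.
    """
    rng = random.Random(seed)
    observed = score_fn(text_labels)
    null_scores = [score_fn(shuffle_labels(text_labels, rng))
                   for _ in range(trials)]
    return empirical_p_value(observed, null_scores)


# One of four null scores reaches the observed score of 5: (1 + 1) / (4 + 1).
print(empirical_p_value(5, [1, 2, 3, 6]))  # 0.4
```

Passing the scoring function in as a parameter keeps the sketch independent of any particular alignment algorithm; a stricter null model would replace `shuffle_labels` with a generator that also randomizes the topology under chosen constraints.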
Pathway holes may result from ambiguity in identifying a gene and its product in an organism, or may arise when the gene encoding an enzyme is not identified in an organism's genome (Green & Karp, 2004). Due to gaps in sequence motif research, several sequences may not get specific annotations. The specific function of a protein may not be known during annotation. Reactions catalyzed by those proteins result in metabolic pathway holes. An error in reading an ORF (open reading frame) may also lead to a pathway hole.


Similar resources

Modified particle swarm optimization algorithm to solve location problems on urban transportation networks (Case study: Locating traffic police kiosks)

Nowadays, traffic congestion is a big problem in metropolises all around the world. Traffic problems rise with the rise of population and slow growth of urban transportation systems. Car accidents or population concentration in particular places due to urban events can cause traffic congestions. Such traffic problems require the direct involvement of the traffic police, and it is urgent for the...


Winner Determination in Combinatorial Auctions using Hybrid Ant Colony Optimization and Multi-Neighborhood Local Search

A combinatorial auction is an auction where the bidders have the choice to bid on bundles of items. The WDP in combinatorial auctions is the problem of finding winning bids that maximize the auctioneer’s revenue under the constraint that each item can be allocated to at most one bidder. The WDP is known as an NP-hard problem with practical applications like electronic commerce, production manag...


 Structure Learning in Bayesian Networks Using Asexual Reproduction Optimization

A new structure learning approach for Bayesian networks (BNs) based on asexual reproduction optimization (ARO) is proposed in this letter. ARO can be essentially considered as an evolutionary based algorithm that mathematically models the budding mechanism of asexual reproduction. In ARO, a parent produces a bud through a reproduction operator; thereafter the parent and its bud compete to survi...


Ant Algorithms for Discrete Optimization

This article presents an overview of recent work on ant algorithms, that is, algorithms for discrete optimization that took inspiration from the observation of ant colonies' foraging behavior, and introduces the ant colony optimization (ACO) metaheuristic. In the first part of the article the basic biological findings on real ants are reviewed and their artificial counterparts as well as the AC...


Finding the Shortest Hamiltonian Path for Iranian Cities Using Hybrid Simulated Annealing and Ant Colony Optimization Algorithms

  The traveling salesman problem is a well-known and important combinatorial optimization problem. The goal of this problem is to find the shortest Hamiltonian path that visits each city in a given list exactly once and then returns to the starting city. In this paper, for the first time, the shortest Hamiltonian path is achieved for 1071 Iranian cities. For solving this large-scale problem, tw...



Journal title:
  • IJKDB

Volume 2, Issue

Pages -

Publication date: 2011