Maximum Common Substructure-Based Data Fusion in Similarity Searching

نویسندگان

  • Edmund Duesbury
  • John D. Holliday
  • Peter Willett
چکیده

Data fusion has been shown to work very well when applied to fingerprint-based similarity searching, yet little is known of its application to maximum common substructure (MCS)-based similarity searching. Two similarity search applications of the MCS will be focused on here. Typically, the number of bonds in the MCS, as well as the bonds in the two molecules being compared, are used in a similarity coefficient. The power of this technique can be extended using data fusion, where the MCS similarities of a set of reference molecules against one database molecule are fused. This "group fusion" technique forms the first application of the MCS in this work. The other application is that of the chemical hyperstructure. The hyperstructure concept is an alternative form of data fusion, being a hypothetical molecule that is constructed from the overlap of a set of existing molecules. This paper compares fingerprint group fusion (extended-connectivity fingerprints), MCS similarity group fusion, and hyperstructure similarity searching, and describes their relative merits and complementarity in virtual screening. It is concluded that the hyperstructure approach as implemented here is less generally effective than conventional fingerprint approaches.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

fmcsR: a Flexible Maximum Common Substructure Algorithm for Advanced Compound Similarity Searching

Maximum common substructure (MCS) algorithms rank among the most sensitive and accurate methods for measuring structural similarities among small molecules. This utility is critical for many research areas in drug discovery and chemical genomics. The MCS problem is a graph-based similarity concept that is defined as the largest substructure (sub-graph) shared among two compounds (Cao et al., 20...

متن کامل

fmcsR: mismatch tolerant maximum common substructure searching in R

MOTIVATION The ability to accurately measure structural similarities among small molecules is important for many analysis routines in drug discovery and chemical genomics. Algorithms used for this purpose include fragment-based fingerprint and graph-based maximum common substructure (MCS) methods. MCS approaches provide one of the most accurate similarity measures. However, their rigid matching...

متن کامل

A maximum common substructure-based algorithm for searching and predicting drug-like compounds

MOTIVATION The prediction of biologically active compounds is of great importance for high-throughput screening (HTS) approaches in drug discovery and chemical genomics. Many computational methods in this area focus on measuring the structural similarities between chemical structures. However, traditional similarity measures are often too rigid or consider only global similarities between struc...

متن کامل

A structural hierarchy matching approach for molecular similarity/substructure searching.

An approach for molecular similarity/substructure searching based on structural hierarchy matching is proposed. In this approach, small molecules are divided into two categories, acyclic and cyclic forms. The latter are further divided into three structural hierarchies, namely, framework, complicated-, and mono-rings. During searching, the similarity coefficients of a structural query and each ...

متن کامل

Chemical Similarity Searching

This paper reviews the use of similarity searching in chemical databases. It begins by introducing the concept of similarity searching, differentiating it from the more common substructure searching, and then discusses the current generation of fragment-based measures that are used for searching chemical structure databases. The next sections focus upon two of the principal characteristics of a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Journal of chemical information and modeling

دوره 55 2  شماره 

صفحات  -

تاریخ انتشار 2015