PyDPI: Freely Available Python Package for Chemoinformatics, Bioinformatics, and Chemogenomics Studies

Authors

  • Dong-Sheng Cao
  • Yi-Zeng Liang
  • Jun Yan
  • Gui-Shan Tan
  • Qing-Song Xu
  • Shao Liu
Abstract

The rapidly increasing amount of publicly available data in biology and chemistry enables researchers to revisit interaction problems through systematic integration and analysis of heterogeneous data. Here we present a comprehensive Python package that integrates chemoinformatics and bioinformatics into a molecular informatics platform for drug discovery. PyDPI (drug-protein interaction with Python) is a Python toolkit for computing commonly used structural and physicochemical features of proteins and peptides from amino acid sequences, molecular descriptors of drug molecules from their topology, and protein-protein and protein-ligand interaction descriptors. It computes 6 protein feature groups composed of 14 features that include 52 descriptor types and 9890 descriptors, and 9 drug feature groups composed of 13 descriptor types that include 615 descriptors. In addition, it provides seven types of molecular fingerprint systems for drug molecules: topological fingerprints, electro-topological state (E-state) fingerprints, MACCS keys, FP4 keys, atom pair fingerprints, topological torsion fingerprints, and Morgan/circular fingerprints. By combining different types of descriptors from drugs and proteins in different ways, interaction descriptors representing protein-protein or drug-protein interactions can be conveniently generated. These computed descriptors can be used in many fields relevant to chemoinformatics, bioinformatics, and chemogenomics. PyDPI is freely available via https://sourceforge.net/projects/pydpicao/.
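To make the idea of sequence-derived features and combined interaction descriptors concrete, here is a minimal sketch in plain Python. It does not use the PyDPI API; the function names (`amino_acid_composition`, `concatenate`, `tensor_product`) and the example drug vector are hypothetical and chosen only for illustration. It assumes that a protein is summarized by its amino acid composition and that drug and protein vectors are combined either by simple concatenation or by taking all pairwise products, which are two common ways such vectors can be joined and not necessarily the exact schemes PyDPI implements.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def concatenate(drug, protein):
    """Join the two descriptor vectors end to end (length n + m)."""
    return list(drug) + list(protein)


def tensor_product(drug, protein):
    """Multiply every drug feature by every protein feature (length n * m)."""
    return [d * p for d, p in product(drug, protein)]


if __name__ == "__main__":
    protein_vec = amino_acid_composition("MKTAYIAKQR")
    drug_vec = [1.0, 0.0, 1.0]  # hypothetical 3-bit drug fingerprint
    print(len(concatenate(drug_vec, protein_vec)))
    print(len(tensor_product(drug_vec, protein_vec)))
```

Concatenation keeps the drug and protein features separate and grows linearly in size, whereas the pairwise product encodes every drug-feature/protein-feature combination at the cost of a much longer vector.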


Related resources

ChemoPy: freely available python package for computational biology and chemoinformatics

MOTIVATION: Molecular representation for small molecules has been routinely used in QSAR/SAR, virtual screening, database search, ranking, drug ADME/T prediction and other drug discovery processes. To facilitate extensive studies of drug molecules, we developed a freely available, open-source Python package called chemoinformatics in Python (ChemoPy) for calculating the commonly used structural ...


A Python package for parsing, validating, mapping and formatting sequence variants using HGVS nomenclature

Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available and comprehensive programming libraries are available. Here we report an open-source and easy-to-use...


ProtPOS: a python package for the prediction of protein preferred orientation on a surface

Atomistic molecular dynamics simulation is a promising technique to investigate the energetics and dynamics of the protein-surface adsorption process, which is of high relevance to modern biotechnological applications. To increase the chance of success in simulating the adsorption process, favorable orientations of the protein at the surface must be determined. Here, we present ProtPO...


Goldilocks: a tool for identifying genomic regions that are ‘just right’

We present Goldilocks: a Python package providing functionality for collecting summary statistics, identifying shifts in variation, discovering outlier regions and locating and extracting interesting regions from one or more arbitrary genomes for further analysis, for a user-provided definition of interesting. AVAILABILITY AND IMPLEMENTATION: Goldilocks is freely available open-so...


PyPanda: a Python package for gene regulatory network reconstruction

PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of 'omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network an...



Journal:
  • Journal of Chemical Information and Modeling

Volume 53, Issue 11

Pages: -

Publication date: 2013