MoSGrid: efficient data management and a standardized data exchange format for molecular simulations in a grid environment
Abstract
The MoSGrid (Molecular Simulation Grid) project is currently establishing a platform intended for both experienced and inexperienced researchers to submit molecular simulation calculations, monitor their progress, and retrieve the results. It provides a web-based portal for setting up, running, and evaluating molecular simulations carried out on D-Grid resources. The available applications encompass quantum chemistry, molecular dynamics, and protein-ligand docking codes. In addition, data repositories have been developed that contain the results of calculations as well as “recipes” or workflows, which users can apply, improve, and distribute. A distributed high-throughput file system allows efficient access to large amounts of data in the repositories. For storing both the input and output of the calculations, we have developed MSML (Molecular Simulation Markup Language), a derivative of CML (Chemical Markup Language). MSML has been designed to store structural information on small as well as large molecules, together with results from various molecular simulation and docking tools. It ensures interoperability between different tools through a consistent data representation. The new platform is already available to the scientific community in a beta test phase at www.mosgrid.de. Currently, portlets for generic workflows, Gaussian, and Gromacs applications are publicly accessible [1,2].
Similar resources
MoSGrid – a molecular simulation grid as a new tool in computational chemistry, biology and material science
The MoSGrid (Molecular Simulation Grid, http://www.mosgrid.de) project aims to provide remote computational chemistry services within the German Grid Initiative (D-Grid). Submission and monitoring of compute jobs, as well as the retrieval of postprocessed results, are realized through a web-based portal. The use of standardized portlets and a generally modular approach allows for the simultaneo...
Grid-Workflows in Molecular Science
Computational chemistry gathers information about the properties of molecules based on compute-intensive simulations. In this area, workflows are an essential instrument for managing complex simulation cascades. The aim of the MoSGrid project is an easy-to-use Grid integration of such workflows based on a portal that hides the complexity. This paper presents an initial general description of work...
A Survey of Dynamic Replication Strategies for Improving Response Time in Data Grid Environment
Large-scale data management is a critical problem in distributed systems such as clouds, P2P systems, the World Wide Web (WWW), and Data Grids. One effective solution is the data replication technique, which efficiently reduces the cost of communication and improves data reliability and response time. Various replication methods can be proposed depending on when, where, and how replicas are gener...
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become a hot research topic. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...
E2DR: Energy Efficient Data Replication in Data Grid
Data grids are an important branch of grid computing that provide mechanisms for managing large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in scientific and business domai...