Parameter Sweep Workflows for Modelling Carbohydrate Recognition
Abstract
Carbohydrate recognition is critical to a number of biological functions in humans, including highly specific responses of the immune system and the selective synthesis of functional proteins. Unlike polypeptides and proteins, oligosaccharides have been observed to be highly dynamic, with an extensive ability to occupy different conformations over time and space. Understanding the dynamic behaviour of oligosaccharides should provide clues to the mechanisms that lead to specific and selective recognition of carbohydrates by proteins. Computer programs that provide insight into biological recognition processes have significant potential to contribute to biomedical research, provided the results of the simulations prove consistent with the outcome of conventional wet laboratory experiments. Introducing and validating these in silico tools would enable bioscientists to focus their resources and plan experiments better, by allowing them to visualise potential interactions and determine the best molecules to investigate in the wet laboratory. This not only reduces time and cost but also increases the number of molecules screened. Unfortunately, there are several reasons why bio-molecular simulation packages are still not widely used among researchers. The simulations are usually very time-consuming: a single run can take weeks or months to complete on a single PC. The command line interfaces provided by current simulation packages are also far too complex for an average biologist. Nor do these tools support automatic parameter sweeps or workflow-style execution, which more complex scenarios often require. Finally, as a result of relatively low utilisation, these packages are not yet validated and tested to the required level.
The aim of our work is to provide solutions to all of the above problems and to create a generic framework that bio-scientists can use to run massively parallel simulation workflows from a high-level, user-friendly environment. The solution allows bio-scientists with no Grid or parallel computing background to easily customise, parameterise, run and analyse complex simulation scenarios, and to provide useful feedback regarding the validation and refinement of in silico modelling tools. To achieve these objectives, a generic high-level user support and execution environment is required that will: (i) provide an easy-to-use, preferably Web-based user interface for biologist end-users without computing expertise; (ii) provide an intuitive developer interface for selected computer-trained biologists; (iii) support the creation of computational workflows orchestrating the execution of several components; (iv) support the creation of automatic parameter sweeps analysing several parallel scenarios based on different input parameters; (v) allow the mapping of workflow components to distributed computing resources (e.g. the UK National Grid Service); (vi) provide access to robust file storage systems and databases to manage input/output data; and finally, (vii) be easily extensible with custom tools (for visualisation of results, for example) based on user demand and requirements. Several available tools are good candidates for such a framework, for example the MyExperiment community portal [1] or the P-GRADE Grid portal [2]. The latter was selected in our project because of its support for parameter sweep workflow execution and its service status on the UK National Grid Service. However, the primary aim of the work was to identify the level of support and the type of user environment that a biologist researcher with no or very limited computing knowledge requires to access and utilise existing e-infrastructures. Therefore, other existing tools and environments can also be customised and utilised to support this target user community.
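The automatic parameter sweep named in requirement (iv) can be illustrated with a minimal Python sketch. It is not the P-GRADE portal's interface, nor the authors' implementation: `build_sweep`, `run_simulation`, `run_sweep` and the parameter names `temperature_k` and `replica` are hypothetical, and the simulation itself is replaced by a stub. The sketch only shows the general idea of expanding ranges of input parameters into independent jobs and running them in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def build_sweep(parameter_ranges):
    """Expand {name: [values]} into one dict per parameter combination."""
    names = list(parameter_ranges)
    value_lists = [parameter_ranges[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


def run_simulation(params):
    """Hypothetical stand-in for one simulation job.

    A real workflow would submit a simulation package run here;
    this stub only echoes its inputs so the sketch stays runnable.
    """
    return {"params": params, "status": "done"}


def run_sweep(parameter_ranges, max_workers=4):
    """Run every parameter combination as an independent parallel job."""
    jobs = build_sweep(parameter_ranges)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the submitted jobs.
        return list(pool.map(run_simulation, jobs))


if __name__ == "__main__":
    # Assumed example ranges: two temperatures, three replicas, six jobs.
    ranges = {"temperature_k": [300, 310], "replica": [0, 1, 2]}
    for result in run_sweep(ranges):
        print(result)
```

Because each combination is independent of the others, the jobs could equally be mapped to distributed resources rather than local threads; the thread pool here is only a placeholder for that mapping.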
Related resources
Workflow Performance Profiles: Development and Analysis
This paper presents a method for developing performance profiles of scientific workflows. It addresses issues related to: executing workflows in a parameter sweep manner, collecting performance information about each workflow task, and analysing the collected data with statistical learning methods. The main goal of this work is to increase the understanding about the performance of studied wo...
Combining Local and Grid Resources in Scientific Workflows
We examine some issues that arise when using both local and Grid resources in scientific workflows. Our previous work addresses and illustrates the benefits of a light-weight and generic workflow engine that manages and optimizes Grid resource usage. Extending on this effort, we here illustrate how a client tool for bioinformatics applications employs the engine to interface with Grid resources...
Statistical Modelling of a Preliminary Process for Depolymerisation of Cassava Non-starch Carbohydrate Using Organic Acids and Salt
A preliminary study was carried out on statistical modelling of a process for depolymerisation of cassava non-starch carbohydrate using halide-salt-assisted phosphoric and pyruvic acids. The effects of three independent variables, namely acid concentration, potassium iodide salt and duration, were studied using the central composite rotatable design on hydrolysis of the cassava non-starch carb...
Parameter Space Exploration Using Scientific Workflows
In recent years there has been interest in performing parameter space exploration across “scientific workflows”; however, many existing workflow tools are not well suited to this. In this paper we augment existing systems with a small set of special “actors” that implement the parameter estimation logic. Specifically, we discuss a set of new Kepler actors that support both complete and partial ...
Applying workflow as a service paradigm to application farming
Task farming is often used to enable parameter sweeps that explore large sets of initial conditions for large-scale complex simulations. Such applications occur very often in the life sciences. Available solutions perform a parameter sweep by creating multiple job submissions with different parameters. This paper presents an approach to farm workflows, employing service oriented paradi...