Map-Reduce Expansion of the ISGA Genomic Analysis Web Server
Abstract
Biological sequence data can be subjected to a variety of analysis workflows to glean pertinent scientific insight. Recent advances in sequencing techniques have led to a deluge of biosequence data, which necessitates the use of high-performance computing resources to carry out analysis in a reasonable period of time. The tasks involved in creating and managing these computational jobs, though, can be daunting to typical biology researchers, which has led to the emergence of portal software architectures that abstract many of the details of building and executing computational pipelines. This paper presents a brief overview of one of these genome annotation servers, Integrative Services for Genomics Analysis (ISGA), and then describes a simple extension to the underlying workflow system that leverages the powerful Twister Iterative Map-Reduce runtime for streamlined data-parallel job control and enhanced access to cluster, grid, and cloud resources. The accompanying live demonstration will showcase ISGA's Workbench for submitting independent BLAST jobs as well as the use of the Twister interface to expand resource access for this utility.
Keywords: biosequence; bioinformatics; ISGA; BLAST; Twister; map-reduce; genome annotation
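The data-parallel pattern the abstract refers to (independent search jobs over a split query set, followed by a merge) can be illustrated with a minimal sketch. This is not the ISGA or Twister implementation and uses neither API: the function names (`parse_fasta`, `partition`, `map_task`, `reduce_task`, `run_map_reduce`) are hypothetical, a local thread pool stands in for the cluster runtime, and a simple motif count stands in for a real BLAST search.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into (header, sequence) records."""
    records: List[Record] = []
    header = None
    parts: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def partition(records: List[Record], n_parts: int) -> List[List[Record]]:
    """Split records round-robin into at most n_parts independent chunks."""
    n_parts = max(1, min(n_parts, len(records)))
    return [records[i::n_parts] for i in range(n_parts)]


def map_task(chunk: List[Record], motif: str) -> Dict[str, int]:
    """Hypothetical stand-in for one search job: count non-overlapping
    motif occurrences per query (a real job would run a BLAST search)."""
    return {header: seq.count(motif) for header, seq in chunk}


def reduce_task(partials: List[Dict[str, int]]) -> Dict[str, int]:
    """Merge the per-chunk results into one result table."""
    merged: Dict[str, int] = {}
    for partial in partials:
        merged.update(partial)
    return merged


def run_map_reduce(fasta_text: str, motif: str, n_workers: int = 4) -> Dict[str, int]:
    """Map independent chunks in parallel, then reduce to a single result."""
    chunks = partition(parse_fasta(fasta_text), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: map_task(c, motif), chunks))
    return reduce_task(partials)


if __name__ == "__main__":
    demo = ">q1\nACGTACGT\n>q2\nTTTT\n>q3\nAC\nGT\n"
    print(run_map_reduce(demo, "ACGT", n_workers=2))
```

Because each query is searched independently, the map step needs no communication between chunks, which is what makes this kind of workload a natural fit for a map-reduce runtime.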
Related resources
Query expansion based on relevance feedback and latent semantic analysis
Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users. Constructing an adequate query that best represents the user's information need is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...
A density based clustering approach to distinguish between web robot and human requests to a web server
Today, the world's dependence on the Internet and the emergence of Web 2.0 applications are significantly increasing the need for web robots to crawl sites in support of services and technologies. Regardless of their advantages, robots may occupy bandwidth and reduce the performance of web servers. Despite a variety of research, there is no accurate method for classifying huge data ...
Investigation on Reliability Estimation of Loosely Coupled Software as a Service Execution Using Clustered and Non-Clustered Web Server
Evaluating the reliability of loosely coupled Software as a Service through the paradigm of a cluster-based and non-cluster-based web server is considered to be an important attribute for the service delivery and execution. We proposed a novel method for measuring the reliability of Software as a Service execution through load testing. The fault count of the model against the stresses of users ...
EGassembler: online bioinformatics service for large-scale processing, clustering and assembling ESTs and genomic DNA fragments
Expressed sequence tag (EST) sequencing has proven to be an economically feasible alternative for gene discovery in species lacking a draft genome sequence. Ongoing large-scale EST sequencing projects feel the need for bioinformatics tools to facilitate uniform EST handling. This brings about a renewed importance for a universal tool for processing and functional annotation of large sets of EST...
Designing a Volunteer Geographic Information-based service for rapid earth quake damages estimation
The advent of Web 2.0 enables users to interact and produce free, unlimited real-time data. This advantage leads us to exploit Volunteer Geographic Information (VGI) for real-time crisis management. Traditional estimation methods for earthquake damages are expensive and tim...