Big Data Exploration Via Automated Orchestration of Analytic Workflows
Abstract
Large-scale data exploration using Big Data platforms requires the orchestration of complex analytic workflows composed of atomic analytic components for data selection, feature extraction, modeling and scoring. In this paper, we propose an approach that uses a combination of planning and machine learning to automatically determine the most appropriate data-driven workflows to execute in response to a user-specified objective. We combine this with orchestration mechanisms and automatically deploy, adapt and manage such workflows across Big Data platforms. We present results of this automated exploration in real settings in healthcare.
Similar resources
Survey on Perception of People Regarding Utilization of Computer Science & Information Technology in Manipulation of Big Data, Disease Detection & Drug Discovery
This research explores the manipulation of biomedical big data and disease detection using automated computing mechanisms. Because an efficient and cost-effective way to discover diseases and drugs is important for society, a computer-aided automated system is a must. This paper aims to understand the importance of computer-aided automated systems among people. The analysis result from collected da...
PAW: A Platform for Analytics Workflows
Big Data analytics in science and industry are performed on a range of heterogeneous data stores, both traditional and modern, and on a diversity of query engines. Workflows are difficult to design and implement since they span a variety of systems. To reduce development time and processing costs, automation is needed. We present PAW, a platform to manage analytics workflows. PAW enables workfl...
Decentralized orchestration of data-centric workflows in Cloud environments
Data-centric and service-oriented workflows are commonly used in scientific research to enable the composition and execution of complex analysis on distributed resources. Although there are a plethora of orchestration frameworks to implement workflows, most of them are unsuitable for executing (enacting) data-centric workflows since they are based on a centralized orchestration engine which can...
On the construction of decentralised service-oriented orchestration systems
Modern science relies on workflow technology to capture, process, and analyse data obtained from scientific instruments. Scientific workflows are precise descriptions of experiments in which multiple computational tasks are coordinated based on the dataflows between them. Orchestrating scientific workflows presents a significant research challenge: they are typically executed in a manner such t...
An AI Planning Approach for Generating Big Data Workflows
The scale of big data causes the compositions of extract-transform-load (ETL) workflows to grow increasingly complex. With the turnaround time for delivering solutions becoming a greater emphasis, stakeholders cannot continue to afford to wait the hundreds of hours it takes for domain experts to manually compose a workflow solution. This paper describes a novel AI planning approach that facilit...