Job Provenance - Insight into Very Large Provenance Datasets

Authors

  • Ales Krenek
  • Ludek Matyska
  • Jirí Sitera
  • Miroslav Ruda
  • Frantisek Dvorák
  • Jiri Filipovic
  • Zdenek Sustr
  • Zdenek Salvet
Abstract

Following the job-centric monitoring concept, the Job Provenance (JP) service organizes provenance records on a per-job basis. It is designed to manage a very large number of records, as was required in the EGEE project, where it was originally developed. This quantitative aspect is also a focus of the presented demonstration. We show JP's capability to retrieve data items of interest from a large dataset of full records of more than 1 million jobs, to perform nontrivial transformations on those data, and to organize the results so that repeated interactive queries are possible. The application area of the demo is derived from that of previous Provenance Challenges. Though the topic of the demo, a computational experiment, is arranged rather artificially, the demonstration still delivers its main message: JP supports nontrivial transformations and interactive queries on large datasets.

Similar resources

Provenir ontology: Towards a Framework for eScience Provenance Management

Satya S. Sahoo, Amit P. Sheth. Kno.e.sis Center, Computer Science and Engineering Department, Wright State University, Dayton, OH-45324, USA. {sahoo.2, amit.sheth}@wright.edu. Abstract: Provenance metadata describes the “lineage” or history of an entity and necessary information to verify the quality of data, validate experiment protocols, and associate trust value with scientific result...

Provenance Capture Disparities Highlighted through Datasets

Provenance information is inherently affected by the method of its capture. Different capture mechanisms create very different provenance graphs. In this work, we describe an academic use case that has corollaries in offices everywhere. We also describe two distinct possibilities for provenance capture methods within this domain. We generate three datasets using these two capture methods: the c...

Exploring Provenance in a Distributed Job Execution System

We examine provenance in the context of a distributed job execution system. It is crucial to capture provenance information during the execution of a job in a distributed environment because often this information is lost once the job has finished. In this paper we discuss the type of information that is available within a distributed job execution system, how to capture such information, and w...

Characterizing users' visual analytic activity for insight provenance

Insight provenance—a historical record of the process and rationale by which an insight is derived—is an essential requirement in many visual analytics applications. While work in this area has relied on either manually recorded provenance (e.g., user notes) or automatically recorded event-based insight provenance (e.g., clicks, drags, and key-presses), both approaches have fundamental limitati...

Provenance Algebra and Materialized View-based Provenance Management

Provenance, from the French word "provenir" meaning "to come from", describes the lineage of an entity. Provenance is critical information in eScience to accurately interpret scientific results. Though information provenance has been recognized as a hard problem in computing science (British Computing Society, 2004), many fundamental research issues in provenance have yet to be addressed. A com...

Publication date: 2008