Exposing Provenance Metadata Using Different RDF Models
Abstract
A standard model for exposing structured provenance metadata of scientific assertions on the Semantic Web would increase interoperability, discoverability, reliability, and reproducibility for scientific discourse and evidence-based knowledge discovery. Several Resource Description Framework (RDF) models have been proposed to track provenance. However, provenance metadata may be not only verbose but also significantly redundant. Therefore, an appropriate RDF provenance model should be efficient for publishing, querying, and reasoning over Linked Data. In the present work, we collected millions of pairwise relations between chemicals, genes, and diseases from multiple data sources, and demonstrated the extent of redundancy of provenance information in the life science domain. We also evaluated the suitability of several RDF provenance models for this crowdsourced data set, including the N-ary model, the Singleton Property model, and the Nanopublication model. We examined query performance against three commonly used large RDF stores: Virtuoso, Stardog, and Blazegraph. Our experiments demonstrate that query performance depends on both the RDF store and the RDF provenance model.
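To make the three models named in the abstract concrete, the following is a minimal sketch of how one pairwise assertion and its source might be encoded under each. It is not the encoding used in the paper: every `http://example.org/` identifier is hypothetical, the `singletonPropertyOf` term is a placeholder under that namespace, the N-ary pattern shown is only one common variant, and the nanopublication terms assume the `http://www.nanopub.org/nschema#` namespace.

```python
# Illustrative only: all example.org names are hypothetical placeholders.
EX = "http://example.org/"
PROV = "http://www.w3.org/ns/prov#"
NP = "http://www.nanopub.org/nschema#"  # assumed nanopublication schema namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SINGLETON_OF = EX + "singletonPropertyOf"  # placeholder for the model's linking property


def nary(s, p, o, source, i):
    """N-ary pattern: an intermediate node carries the object and the provenance."""
    node = f"{EX}relation{i}"
    return [
        (s, p, node),
        (node, EX + "value", o),
        (node, PROV + "wasDerivedFrom", source),
    ]


def singleton_property(s, p, o, source, i):
    """Singleton Property pattern: a unique property instance carries the provenance."""
    sp = f"{p}#{i}"
    return [
        (s, sp, o),
        (sp, SINGLETON_OF, p),
        (sp, PROV + "wasDerivedFrom", source),
    ]


def nanopublication(s, p, o, source, i):
    """Nanopublication pattern: quads in head, assertion, provenance, pubinfo graphs."""
    np_uri = f"{EX}np{i}"
    head, assertion = np_uri + "#head", np_uri + "#assertion"
    provenance, pubinfo = np_uri + "#provenance", np_uri + "#pubinfo"
    return [
        (np_uri, RDF_TYPE, NP + "Nanopublication", head),
        (np_uri, NP + "hasAssertion", assertion, head),
        (np_uri, NP + "hasProvenance", provenance, head),
        (np_uri, NP + "hasPublicationInfo", pubinfo, head),
        (s, p, o, assertion),
        (assertion, PROV + "wasDerivedFrom", source, provenance),
        (np_uri, PROV + "wasAttributedTo", EX + "publisher1", pubinfo),
    ]


if __name__ == "__main__":
    args = (EX + "chemicalA", EX + "associatedWith", EX + "diseaseB",
            EX + "sourceDatabase1", 1)
    for build in (nary, singleton_property, nanopublication):
        statements = build(*args)
        print(build.__name__, len(statements))
```

In this sketch the N-ary and Singleton Property encodings each emit three triples per sourced assertion, while the nanopublication emits seven quads, which hints at why the choice of model can affect storage size and query shape.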
Related resources
Evaluation of Metadata Representations in RDF stores
The maintenance and use of metadata such as provenance and time-related information is of increasing importance in the Semantic Web, especially for Big Data applications that work on heterogeneous data from multiple sources and which require high data quality. In an RDF dataset, it is possible to store metadata alongside the actual RDF data and several possible metadata representation models ha...
Scientific Workflow Provenance Metadata Management Using an RDBMS-based RDF Store
Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensibility, and reasoning advantages of Semantic Web technologies with the storage and querying power...
Representing Microarray Experiment Metadata Using Provenance Models
MAGE (MicroArray and Gene Expression) representations are primarily representations of workflow: a process was used to derive biomaterial A from biomaterial B. This makes them well suited to provenance models such as OPM (Open Provenance Model) and PML (Proof Markup Language). We demonstrate methods and tools, MAGE2OPM and MAGE2PML, to convert RDF representations ...
RDFProv: A relational RDF store for querying and managing scientific workflow provenance
Article history: received 12 October 2008; received in revised form 8 March 2010; accepted 11 March 2010; available online 23 March 2010. Provenance metadata has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. The provenance management problem concerns the efficiency and effectiveness of...
Automated Metadata Generation for Linked Data Generation and Publishing Workflows
Provenance and other metadata are essential for determining ownership and trust. Nevertheless, no systematic approaches have so far been introduced in the Linked Data publishing workflow to capture them. Defining such metadata has remained independent of RDF data generation and publishing. In most cases, metadata is manually defined by the data publishers (person-agents), rather than produced by the...