Integrating Ontology-based Metadata Enrichment into a CMS-based Research Infrastructure
Authors
Abstract
This abstract discusses research in progress that aims to create an ecosystem of entities connected to a research institution, such as its researchers and the resources they produce. In particular, we are investigating ways to enter metadata descriptions uniformly on the one hand, and to expose them in a variety of formats on the other. We aim to support current standards for metadata exchange, such as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH; Van de Sompel et al., 2004), as well as the Resource Description Framework (RDF), so that the descriptions can be interlinked with others available on the Linking Open Data (LOD) cloud. For the whole process to integrate smoothly into the existing research infrastructure, our approach relies on the open-source content management system Drupal, which is at the center of the current infrastructure for managing metadata. While version 7 of Drupal generally provides better support for Semantic Web formalisms than its predecessor, we base our current architecture on Drupal 6, both because many Drupal 7 modules are still in beta status and because the current research infrastructure builds on Drupal 6. As Corlosquet et al. (2009) have shown, Drupal offers a number of solutions for creating LOD-compliant resource descriptions, such as modules for generating RDF descriptions alongside XHTML webpages, and for implementing a so-called SPARQL endpoint, an RDF repository that can be queried directly using the standard Semantic Web query language SPARQL. This means that metadata descriptions which have been imported into the Drupal CMS can be exposed not only in the form of human-readable (X)HTML webpages, but also in the form of RDF.
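The idea of entering a description once and exposing it in several formats can be illustrated with a minimal Python sketch. It is not the Drupal modules' actual output: the record, its URI, and the function names `to_oai_dc` and `to_ntriples` are hypothetical, and only the Dublin Core element namespace and the `oai_dc` container namespace are taken from the respective standards.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by Dublin Core and by the OAI-PMH oai_dc metadata format.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Hypothetical record, entered once in a uniform internal form.
record = {
    "uri": "http://example.org/node/42",
    "title": "An example publication",
    "creator": "Jane Doe",
    "date": "2011",
}

FIELDS = ("title", "creator", "date")


def to_oai_dc(rec):
    """Serialize the record as an oai_dc XML fragment (unqualified Dublin Core)."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field in FIELDS:
        child = ET.SubElement(root, f"{{{DC_NS}}}{field}")
        child.text = rec[field]
    # The resource URI doubles as the Dublin Core identifier.
    ident = ET.SubElement(root, f"{{{DC_NS}}}identifier")
    ident.text = rec["uri"]
    return ET.tostring(root, encoding="unicode")


def to_ntriples(rec):
    """Serialize the same record as RDF in N-Triples syntax."""
    lines = []
    for field in FIELDS:
        # Escape backslashes and double quotes inside the literal.
        literal = rec[field].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{rec["uri"]}> <{DC_NS}{field}> "{literal}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_oai_dc(record))
    print(to_ntriples(record))
```

Both serializations are derived from the same dictionary, so a correction to the record shows up in the harvestable Dublin Core view and in the RDF view alike.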
However, while there is a very strong trend towards using RDF, and higher-level Semantic Web formalisms in general, for metadata annotation, many infrastructures do not support these technologies at the moment. For example, the Common Language Resources and Technology Infrastructure project (CLARIN; Varádi et al., 2008), which aims at developing an infrastructure capable of harvesting metadata represented in different metadata vocabularies, such as DCMI or IMDI (Broeder and Wittenburg, 2006), bases its harvesting technologies on OAI-PMH. Here, we can make use of Drupal's OAI-PMH Views plugin module, which implements a data provider supporting Dublin Core metadata that can be indexed by CLARIN and harvested using OAI-PMH. The above descriptions interoperate with ongoing efforts to formalize all resources connected to our research institution in terms of an OWL ontology, which is being developed on the basis of existing ontologies, such as the AKT Reference Ontology and the OWL-Time Ontology. To give an example of the benefits of such formalizations, consider that research publications are linked to their authors, who are in turn affiliated with specific departments and associated with projects,
Similar resources
Metadata Enrichment for Automatic Data Entry Based on Relational Data Models
The idea of automatic generation of data entry forms based on data relational models is a common and known idea that has been discussed day by day more than before according to the popularity of agile methods in software development accompanying development of programming tools. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...
Automatic Spatial Metadata Enrichment: Reducing Metadata Creation Burden through Spatial Folksonomies
Metadata plays a key role in facilitating access to up-to-date spatial information and contributes to the finding and delivering of high quality spatial information services to users. In particular, metadata is an important element in functioning and facilitating spatial data infrastructure (SDI) initiatives. With huge amount of spatial information being generated, a spatial application must be...
Ontology-based Metadata Dictionary for Integrating Heterogeneous Information Sources on the WWW
Semantic heterogeneity has always been one of the most important problems to overcome. A number of systems have been proposed to address this problem, ranging from mediator-based systems to description logic-based systems to content-descriptive metadata systems. In this paper, we propose an ontology-based metadata dictionary as a basis for solving semantic heterogeneity. First, the ontology-base...
Linguistic Watermark 3.0: An RDF Framework and a Software Library for Bridging Language and Ontologies in the Semantic Web
In this paper, we present a framework for representing heterogeneous linguistic resources and for integrating their content with Semantic Web ontologies. This work, which extends and improves previous research conducted by these same authors, articulates into two main results: first, a set of coordinated RDF vocabularies providing descriptors for representing linguistic resources and their soft...
Integrating Dublin Core Metadata for Cultural Heritage Collections Using Ontologies
Metadata interoperability is an active research area, especially for cultural heritage collections, which consist of heterogeneous objects described by a variety of metadata schemas. In this paper we propose an ontology-based metadata interoperability approach, which exploits, in an optimal way, the semantics of metadata schemas. In particular, we propose the use of CIDOC/CRM ontology as a medi...