What's up LOD Cloud? Observing The State of Linked Open Data Cloud Metadata

Authors

  • Ahmad Assaf
  • Aline Senart
  • Raphaël Troncy
Abstract

Linked Open Data (LOD) has emerged as one of the largest collections of interlinked datasets on the web. To benefit from this mine of data, one needs access to descriptive information about each dataset (its metadata). However, the heterogeneous nature of the data sources directly affects data quality, as these sources often contain inconsistent, misinterpreted and incomplete metadata. Given the significant variation in size, in the languages used and in the freshness of the data, finding useful datasets without prior knowledge is increasingly complicated. We have developed Roomba, a tool that validates, corrects and generates dataset metadata. In this paper, we present the results of running this tool on the parts of the LOD cloud accessible via the datahub.io API. The results demonstrate that the general state of the datasets needs more attention, as most of them suffer from poor-quality metadata and lack some informative metrics that are needed to facilitate dataset search. We also show that the automatic corrections made by Roomba increase the overall quality of the dataset metadata, and we highlight the need for manual effort to correct some important missing information.
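
As an illustration of the kind of metadata check the abstract describes, here is a minimal sketch. It is not Roomba's implementation: the field names (`title`, `license_id`, `author`, `resources`), the two helper functions and the sample record are all assumptions made for this example, loosely modeled on CKAN-style dataset records.

```python
# Hypothetical set of required fields; not taken from the paper.
REQUIRED_FIELDS = ("title", "license_id", "author", "resources")


def missing_fields(dataset: dict) -> list:
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not dataset.get(field)]


def metadata_completeness(dataset: dict) -> float:
    """Return the fraction of required fields that are present and non-empty."""
    return 1 - len(missing_fields(dataset)) / len(REQUIRED_FIELDS)


# Invented record for illustration: the license is empty and the author is absent.
record = {
    "title": "Example dataset",
    "license_id": "",
    "resources": [{"url": "http://example.org/data.rdf"}],
}

print(missing_fields(record))
print(metadata_completeness(record))
```

A report like this only flags gaps. Filling them in (for example, choosing the correct license) is the kind of correction that may still need manual effort, as the abstract notes.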

Similar resources

Roomba: Automatic Validation, Correction and Generation of Dataset Metadata

Data is being published by both the public and private sectors and covers a diverse set of domains ranging from life sciences to media or government data. An example is the Linked Open Data (LOD) cloud which is potentially a gold mine for organizations and individuals who are trying to leverage external data sources in order to produce more informed business decisions. Considering the significa...

Linking FRBR Entities to LOD through Semantic Matching

In this paper, we present an approach to automatically link FRBR works identified in metadata to the corresponding entity in Linked Open Data resources. The main contribution is a basis for semantic enrichment and verification of works identified in existing metadata. Through experiments, we demonstrate that FRBR works can be identified in the LOD cloud, which provides a solid ground for further work.

Adoption of the Linked Data Best Practices in Different Topical Domains

The central idea of Linked Data is that data publishers support applications in discovering and integrating data by complying with a set of best practices in the areas of linking, vocabulary usage, and metadata provision. In 2011, the State of the LOD Cloud report analyzed the adoption of these best practices by linked datasets within different topical domains. The report was based on information...

The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud

The Open Linguistics Working Group (OWLG) brings together researchers from various fields of linguistics, natural language processing, and information technology to present and discuss principles, case studies, and best practices for representing, publishing and linking linguistic data collections. A major outcome of our work is the Linguistic Linked Open Data (LLOD) cloud, an LOD (sub-)cloud o...

Creating and Publishing Metadata of Linked Data —Providing Shoes for the Cobbler’s Children

The number of open datasets available on the web is increasing rapidly with the rise of the Linked Open Data (LOD) cloud and various governmental efforts for releasing public data. However, the metadata available for the datasets is often minimal, heterogeneous, and distributed, which makes finding a suitable dataset for a given need problematic. To address the problem, we present a distributed ...

Publication date: 2015