Archiving and Maintaining Curated Databases

Author

  • Heiko Müller
Abstract

Curated databases represent a substantial amount of effort by a dedicated group of people to produce a definitive description of some subject area. The value of curated databases lies in the quality of the data, which has been manually collected, corrected, and annotated by human curators. Many curated databases are continuously modified, and new releases are published on the Web. Given that curated databases act as publications, archiving them becomes a necessity to enable retrieval of particular database versions. A system that archives evolving databases on the Web faces several challenges. First and foremost, the system needs to be able to efficiently maintain and query multiple snapshots of ever-growing databases. Second, the system needs to be flexible enough to account for changes to the database structure and to handle data of varying quality. Third, the system needs to be robust against local failures to allow reliable long-term preservation of archived information. Our archive management system XArch addresses the first challenge by providing the functionality to maintain, populate, and query archives of database snapshots in hierarchical format. This presentation gives an overview of our ongoing efforts to improve XArch with regard to (i) archiving evolving databases, (ii) supporting distributed archives, and (iii) using our archives and XArch as the basis of a system to create, maintain, and publish curated databases.

Similar resources

Probabilistic Representations for Integrating Unreliable Data Sources

Databases constructed automatically through web mining and information extraction often overlap with databases constructed and curated by hand. These two types of databases are complementary: automatic extraction provides increased scope, while curated databases provide increased accuracy. The uncertain nature of such integration tasks suggests that the final representation of the merged databa...

Using Links to prototype a Database Wiki

Both relational databases and wikis have strengths that make them attractive for use in collaborative applications. In the last decade, database-backed Web applications have been used extensively to develop valuable shared biological references called curated databases. Databases offer many advantages such as scalability, query optimization and concurrency control, but are not easy to use and l...

Curating the CIA World Factbook

The CIA World Factbook is a prime example of a curated database – a database that is constructed and maintained with a great deal of human effort in collecting, verifying, and annotating data. Preservation of old versions of the Factbook is important for verification of citations; it is also essential for anyone interested in the history of the data such as demographic change. Although the Fact...

Learning from Earthquakes: a Survey of Surveys

This paper presents a literature review of efforts to learn from earthquakes: collecting, archiving, and disseminating information. The emphasis is on primary sources, i.e., data-gathering instruments or investigations that include direct observation of earthquake effects. The study addresses seismology and geotechnical engineering; safety and damage to individual buildings; performance of larg...

University libraries - between service providers and research institutions

In the last years the process of generating, disseminating, and archiving new knowledge has changed fundamentally. Beside the increasing amount of new knowledge that needs to be processed, new paradigms for search, access, and exchange have evolved: digital information is discovered, interlinked with curated databases, commented upon, adapted, and shared in Web-based collaborative research infr...

Publication date: 2009