Uncertain Data Lineage
Abstract
Lineage, also called Boolean provenance, event expression, or why-provenance, is a form of provenance that records the origin of the answers to a query executed on a database. Lineage is expressed as a Boolean formula over variables assigned to the tuples in the database: joint usage of tuples (by the database join operation) is captured by Boolean conjunction (AND, ∧), and alternative usage (by projection or union) by Boolean disjunction (OR, ∨). Uncertain data is typically expressed in the form of a probabilistic database, which is a compact representation of a probability distribution over a set of deterministic database instances (called possible worlds). When an input query is evaluated on such a probabilistic database, the output is not a deterministic set of answer tuples but a probability distribution over the possible answers, induced by the possible worlds. The query evaluation problem on uncertain data aims to compute this output probability distribution efficiently. Lineages of the answers play a key role in understanding, expressing, and efficiently evaluating the probability distribution of query answers for uncertain data.
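As a rough illustration of how lineage connects to answer probabilities, the following minimal sketch computes the probability of one query answer by enumerating possible worlds. It assumes a tuple-independent probabilistic database, which the abstract does not specify; the tuple variables `r1`, `r2`, `s1`, their probabilities, and the helper names `holds` and `answer_probability` are all hypothetical and chosen only for this example.

```python
from itertools import product

# Assumption: tuple-independent model, where each tuple variable is
# present independently with the given (made-up) probability.
prob = {"r1": 0.5, "r2": 0.8, "s1": 0.5}

# Lineage of one answer in disjunctive normal form: the outer list is a
# disjunction (alternative usage), each inner set a conjunction (joint usage).
# Here: (r1 AND s1) OR (r2 AND s1).
lineage = [{"r1", "s1"}, {"r2", "s1"}]


def holds(lineage, world):
    """Return True if the DNF lineage is satisfied in a possible world,
    where the world is the set of tuple variables that are present."""
    return any(clause <= world for clause in lineage)


def answer_probability(lineage, prob):
    """Sum the probabilities of all possible worlds in which the lineage holds."""
    variables = sorted(prob)
    total = 0.0
    for bits in product([False, True], repeat=len(variables)):
        world = {v for v, present in zip(variables, bits) if present}
        # Probability of this world under tuple independence.
        weight = 1.0
        for v, present in zip(variables, bits):
            weight *= prob[v] if present else 1.0 - prob[v]
        if holds(lineage, world):
            total += weight
    return total


p = answer_probability(lineage, prob)
print(p)
```

The enumeration visits every one of the 2^n possible worlds, so it is only practical for tiny examples; it is meant to show what the query evaluation problem computes, not how to compute it efficiently.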
Similar resources
Towards Special-Purpose Indexes and Statistics for Uncertain Data
The Trio project at Stanford [35] for managing data, uncertainty, and lineage is developed on top of a conventional DBMS. Uncertain data with lineage is encoded in relational tables, and Trio queries are translated to SQL queries on the encoding. Such a layered approach reaps significant benefits in terms of architectural simplicity, and the ability to use an off-the-shelf query processing engi...
Widom Databases with Uncertainty and Lineage
This paper introduces ULDBs, an extension of relational databases with simple yet expressive constructs for representing and manipulating both lineage and uncertainty. Uncertain data and data lineage are two important areas of data management that have been considered extensively in isolation, however many applications require the features in tandem. Fundamentally, lineage enables simple and co...
Data Modifications and Versioning in Trio
This paper presents the first DBMS for uncertain data that incorporates data modifications and a simple versioning system. Our work is in the context of Trio, a project at Stanford for managing data uncertainty and lineage. We establish SQL-based language constructs for data modifications, and an extended data model ULDB that supports these modifications yielding versioned relations. We show th...
Managing Uncertain Data: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
The ubiquity of uncertain data in modern-day applications (such as information extraction, data integration, sensor and RFID networks, and scientific experiments) has resulted in a growing need for techniques to deal with such data. This thesis addresses challenges in managing uncertain data in a principled, usable, and scalable fashion. We identify and explore a fundamental tension between usa...
The Potential of Menstrual Blood-Derived Stem Cells in Differentiation to Epidermal Lineage: A Preliminary Report
BACKGROUND Menstrual blood-derived stem cells (MenSCs) are a novel source of stem cells that can be easily isolated non-invasively from female volunteered donor without ethical consideration. These mesenchymal-like stem cells have high rate of proliferation and possess multi lineage differentiation potency. This study was undertaken to isolate the MenSCs and assess their potential in differenti...