XClean in Action: A Demonstration of Declarative XML Data Cleaning
Abstract
We demonstrate XClean, a data cleaning system specifically geared towards cleaning XML data. XClean’s approach is based on a set of cleaning operators. Users may specify cleaning programs by combining operators using the declarative XClean/PL language, which is then compiled into XQuery. We plan to show XClean in action on several scenarios based on real-world data. A graphical user interface supports users in writing XClean/PL programs and guides them through the process to obtain the clean data.
Similar resources
XClean in Action (Demo)
We demonstrate XClean, a data cleaning system specifically geared towards cleaning XML data. XClean’s approach is based on a set of cleaning operators. Users may specify cleaning programs by combining operators using the declarative XClean/PL language, which is then compiled into XQuery. We plan to show XClean in action on several scenarios based on real-world data. A graphical user interface s...
Declarative XML Data Cleaning with XClean
Data cleaning is the process of correcting anomalies in a data source that may, for instance, be due to typographical errors or duplicate representations of an entity. It is a crucial task in customer relationship management, data mining, and data integration. With the growing amount of XML data, approaches to effectively and efficiently clean XML are needed, an issue not addressed by existing ...
Duplicate detection in XML data
Duplicate detection consists of detecting multiple representations of the same real-world object, for every object represented in a data source. Duplicate detection is relevant in data cleaning and data integration applications and has been studied extensively for relational data describing a single type of object in a single table. Our research focuses on iterative duplicate detection i...
ARKTOS: A Tool For Data Cleaning and Transformation in Data Warehouse Environments
Extraction-Transformation-Loading (ETL) and Data Cleaning tools are pieces of software responsible for the extraction of data from several sources, their cleaning, customization and insertion into a data warehouse. To deal with the complexity and efficiency of the transformation and cleaning tasks we have developed a tool, namely ARKTOS, capable of modeling and executing practical scenarios, by...
Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML
As we move to a big data world where unstructured data grows at high velocity, a data store is needed to hold this large amount of data. Relational databases, which have traditionally been used, cannot handle such volumes, so a move to non-relational data stores is needed. In the current study, we have proposed an extension of the Mongo...