Data quality concepts and techniques applied to taxonomic databases
Author: Eduardo Couto Dalcin
Abstract
The thesis investigates the application of data quality concepts and techniques to taxonomic databases in order to enhance the quality of information services and systems in taxonomy. Taxonomic data are arranged and introduced in Taxonomic Data Domains in order to establish a standard and a working framework to support the proposed Taxonomic Data Quality Dimensions, as a specialised application of conventional Data Quality Dimensions in the Taxonomic Data Quality Domains. The thesis discusses how to improve data quality in taxonomic databases, considering conventional Data Cleansing techniques and applying generic data content error patterns to taxonomic data. Techniques of taxonomic error detection are explored, with special attention to spelling errors in scientific names. The spelling error problem is examined through spelling error detection techniques and algorithms, which are described and analysed. In order to evaluate the applicability and efficiency of different spelling error detection algorithms, ...
Similar resources
A Survey of the Awareness and Use of Evidence-Based Medical Concepts and Databases among Specialty Residents of Shahid Beheshti University of Medical Sciences
Background and Aim: With the development of the Internet and databases and the increasing need to institutionalize evidence-based medicine, physicians' awareness and use of evidence-based medical databases and concepts are considered to be necessary. Therefore, the aim of this study is to evaluate the knowledge and use of evidence-based medical concepts and databases among residents of Shahid B...
Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains
In scientific and commercial fields associated with modern agriculture, the categorization of different rice types and the determination of their quality are very important. Various image processing algorithms have been applied in recent years to detect different agricultural products. In this paper, the problem of rice classification and quality detection is presented based on model learning concepts includ...
On the Use of Taxonomic Concepts in Support of Biodiversity Research and Taxonomy
Future biodiversity research will make increased use of distributed data networks, scientific workflows, and powerful mechanisms for resolving a broad spectrum of primary data. This paper outlines the anatomy of an ecological niche modeling workflow and concomitant needs for taxonomic resolution. Contemporary Linnaean names and synonymy relationships are shown to be too imprecise to support th...
Secondary Use of Laboratory Data: Potentialities and Limitations
Clinical databases covering all areas of medical care, including laboratory results, have been developed in recent years. The information produced by diagnostic laboratories has a great impact on the health care system, with various secondary uses. These uses sometimes involve publishing newly extracted information from laboratory reports, which has been widely applied in the scientifi...
A New Group Data Envelopment Analysis Method for Ranking Design Requirements in Quality Function Deployment
Data envelopment analysis (DEA) is an objective method for priority determination of decision making units (DMUs) with the same multiple inputs and outputs. DEA is an efficiency estimation technique, but it can be used for solving many management problems, such as the ranking of DMUs. Many researchers have found similarities between DEA and MCDM techniques. One of the earliest techniques in MCDM is...