Record Linkage: Current Practice and Future Directions

نویسندگان

  • Lifang Gu
  • Rohan Baxter
  • Deanne Vickers
  • Chris Rainsford
چکیده

Record linkage is the task of quickly and accurately identifying records corresponding to the same entity from one or more data sources. Record linkage is also known as data cleaning, entity reconciliation or identification and the merge/purge problem. This paper presents the “standard” probabilistic record linkage model and the associated algorithm. Recent work in information retrieval, federated database systems and data mining have proposed alternatives to key components of the standard algorithm. The impact of these alternatives on the standard approach are assessed. The key question is whether and how these new alternatives are better in terms of time, accuracy and degree of automation for a particular record linkage application.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Overview of Record Linkage and Current Research Directions

This paper provides background on record linkage methods that can be used in combining data from a variety of sources such as person lists business lists. It also gives some areas of current research.

متن کامل

Record Linkage and Tagging for the BYU Historic Journals Project (journals.byu.edu)

We describe techniques used to link resources to new FamilySearch PersonIDs in the BYU Historic Journals Project. We describe both manual (crowd-sourced) tagging, and automatic/semi-automatic record linkage. We also describe possible future research directions.

متن کامل

Extended abstract - Joint Statistical Meetings ( JSM ) , Montreal , August 2013 Overview and taxonomy of techniques for privacy - preserving record linkage

Record linkage is the process of identifying which records in two or more databases correspond to the same real-world entity. Three major challenges of this process are (1) achieving high linkage quality, (2) scalability to linking very large databases, and (3) protecting the privacy and confidentiality of personal identifying data that are used in the linkage process. This presentation provide...

متن کامل

Patient Engagement and its Evaluation Tools – Current Challenges and Future Directions; Comment on “Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review”

Considering the growing recognition of the importance of patient engagement in healthcare decisions, research and delivery systems, it is important to ensure high quality and efficient patient engagement evaluation tools. In this commentary, we will first highlight the definition and importance of patient engagement. Then we discuss the psychometric properties of the patient engagement evaluati...

متن کامل

Advanced record linkage methods and privacy aspects for population reconstruction

Recent times have seen an increased interest into techniques that allow the linking of records across databases. The main challenges of record linkage are (1) scalability to the increasingly large databases common today; (2) accurate and efficient classification of compared records into matches and non-matches in the presence of variations and errors in the data; and (3) privacy issues that occ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003