A Bayesian Approach to Graphical Record Linkage and Deduplication



Similar resources

A Bayesian Approach to Graphical Record Linkage and De-duplication

We propose an unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation involves the representation of the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible representation...
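The bipartite representation described in this abstract can be illustrated with a minimal Python sketch. Everything below is hypothetical toy data: the record keys, the individual labels, and the helper names `clusters` and `linked_pairs` are assumptions for illustration, not taken from the paper. In the actual model the record-to-individual assignment is latent and must be inferred; here it is fixed by hand to show how record-to-record links follow only indirectly from shared individuals.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy assignment: (file index, record index) -> latent individual.
# In the model this mapping is unknown; it is hard-coded here for illustration.
linkage = {
    (0, 0): "a",
    (0, 1): "b",
    (0, 2): "a",  # same individual as (0, 0): a duplicate within file 0
    (1, 0): "a",  # same individual again: a link across files
    (1, 1): "c",
}

def clusters(linkage):
    """Group records by the latent individual they are attached to."""
    groups = defaultdict(list)
    for record, individual in linkage.items():
        groups[individual].append(record)
    return dict(groups)

def linked_pairs(linkage):
    """Return record pairs that share a latent individual (indirect links)."""
    pairs = set()
    for records in clusters(linkage).values():
        for r1, r2 in combinations(sorted(records), 2):
            pairs.add((r1, r2))
    return pairs

pairs = linked_pairs(linkage)
# Pairs inside one file correspond to de-duplication,
# pairs spanning two files correspond to record linkage.
within_file = {p for p in pairs if p[0][0] == p[1][0]}
across_files = pairs - within_file
```

Because each record points to exactly one individual, the implied record-to-record links are automatically transitive, and the same structure covers both duplicates within a file and matches across files.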


A Bayesian Approach to Graphical Record Linkage A Simulation Study

We provide a simulation study based on the model in §2.1 and we simulate data from the NLTCS based on our model, with varying levels of distortion. The varying levels of distortion (0, 0.25%, 0.5%, 1%, 2%, 5%) associated with the simulated data are then run using our MCMC algorithm to assess how well we can match under “noisy data.” Figure 3 illustrates an approximate linear relationship with F...
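As a rough illustration of what simulating data at several distortion levels might look like, the sketch below corrupts categorical fields at a given rate. The excerpt does not say how distortion is applied, so the mechanism here (each field is independently replaced by a uniformly chosen category with probability `rate`) is an assumption, and the function `distort`, the toy records, and the field categories are all hypothetical.

```python
import random

def distort(records, rate, field_values, seed=0):
    """Return a copy of `records` in which each field is replaced, with
    probability `rate`, by a uniformly drawn category for that field.
    This mechanism is an assumption, not the paper's distortion model."""
    rng = random.Random(seed)
    noisy = []
    for rec in records:
        new = list(rec)
        for j in range(len(new)):
            if rng.random() < rate:
                new[j] = rng.choice(field_values[j])
        noisy.append(tuple(new))
    return noisy

# Hypothetical categorical records: (sex, birth year, state).
records = [("F", 1920, "PA"), ("M", 1931, "NC"), ("F", 1925, "PA")]
field_values = [["F", "M"], [1920, 1925, 1931], ["PA", "NC"]]

# Distortion levels listed in the excerpt, written as proportions.
levels = [0.0, 0.0025, 0.005, 0.01, 0.02, 0.05]
simulated = {rate: distort(records, rate, field_values) for rate in levels}
```

Each simulated data set could then be passed to a linkage procedure and scored against the known truth, which is the kind of comparison the excerpt describes.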


Implementing a Bayesian Approach to Record Linkage

The Census Coverage Measurement survey-based program estimated household population coverage of the 2010 Decennial Census. Calculating coverage estimates required linking survey person data to census enumerations. For record linkage research, we applied a Bayesian Latent Class Models approach to both 2010 coverage survey data and simulated household data. This paper presents our use of Base SAS...


SMERED: A Bayesian Approach to Graphical Record Linkage and De-duplication

We propose a novel unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation is to represent the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible new representation of th...


Probabilistic Deduplication, Record Linkage and Geocoding

Outline: Background and illustrative example; Record linkage; Applications, privacy and ethics; Our project and our tools; Data cleaning and standardisation; Probabilistic data standardisation and HMMs; Blocking / indexing; Record pair classification; Geocoding; Outlook. (Peter Christen, May 2005)



Journal

Journal title: Journal of the American Statistical Association

Year: 2016

ISSN: 0162-1459,1537-274X

DOI: 10.1080/01621459.2015.1105807