Working Paper ENGLISH ONLY UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE (UNECE) CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN
Abstract
IPUMS-International disseminates more than two hundred integrated, confidentialized census microdata samples to thousands of researchers worldwide at no cost. The number of samples is increasing at the rate of several dozen per year, as the process of integrating metadata and microdata is completed. Protecting the statistical confidentiality and privacy of individuals represented in the microdata is a sine qua non of the IPUMS project. For the 2010 round of censuses, even greater protections are required, while researchers are demanding ever higher precision and greater utility. This paper describes a tripartite collaborative experiment using a ten percent household sample of the 2011 census of Ireland to estimate risk, mask the data using controlled shuffling, and assess analytical utility by comparing the masked data against the unprotected source microdata. Controlled shuffling exploits hierarchically ordered coding schemes to protect privacy and enhance utility. With controlled shuffling, the lesson seems to be that more detail means less risk and greater utility. Overall, despite substantial perturbations of the masked dataset, we find that data utility is very high and information loss is slight, almost imperceptible even for fairly complex analytical problems.

Acknowledgement. The authors greatly appreciate the cooperation of the Central Statistics Office (CSO) of Ireland in providing a ten percent household sample of the 2011 census for this experiment. The authors alone are responsible for the contents of this paper. The dataset described herein is solely for experimentation and, as of this writing, the CSO has not approved its release to third parties.
Similar resources
Working Paper ENGLISH ONLY UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE (UNECE) CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN
WP.14 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
Working Paper ENGLISH ONLY UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE (UNECE) CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN
Theoretical methods and software are available for performing optimal complementary cell suppression (CCS) in tables. The released resulting suppression patterns comprise algebraic circuits which define alternative tables for the original table while controlling variation between original and alternative cell values. For an important class of statistical tables including two-way tables, these c...
WP. 9 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
The concept of differential privacy has recently received considerable attention in the literature. In this paper we evaluate the masking mechanism based on Laplace noise addition as a means of satisfying differential privacy. The results of this study indicate that the Laplace-based noise addition procedure does not satisfy the requirements of differential privacy.
Working Paper No. 30 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
In this paper we give an overview of various approaches to implementing statistical disclosure control for tabular data released through the Web. We consider three generic groups of statistical disclosure control methods: source data perturbation, output perturbation and query-set restriction. Considering different types of Web sites and implementation approaches we discuss the appropri...
Working Paper No. 2 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
In a statistical database, the query-answering system should leave unanswered sum-queries that could lead to the disclosure of confidential data. To this end, each sum-query and previously answered sum-queries should be audited. We give a general framework for controlling the amount of information released when sum-queries are answered, both from the viewpoint of the user and from the viewpoint...