Top-Coding and Public Use Microdata Samples from the U.S. Census Bureau

Authors

  • Nicole Crimi
  • William F. Eddy
Abstract

The U.S. Census Bureau regularly releases Public Use Microdata Samples (PUMS): data files containing de-identified subsets of the data provided by respondents to some of its surveys and to the Decennial Census itself. These files allow data users to perform “micro”-analyses rather than the “macro”-tabulations that the Bureau itself regularly performs. The data users range from non-government (say, university) researchers to government policymakers. Their micro-analyses typically depend on the joint distribution of two or more variables over individuals or households. As a very simple example, consider a linear regression relating the wages of individuals to their ages. We will use this example throughout the paper to illustrate the effects we are interested in. To protect the privacy of the data supplied by respondents, as required by Title 13 U.S.C., the Bureau uses a variety of methods to modify the data so that it is very difficult for data users to identify individual respondents. Although some form of privacy protection is required by law, most of these measures (top-coding in particular) have a detrimental effect on micro-analyses, because applying them changes the interdependence of two or more variables and, in many cases, renders the analyses moot.
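The wage-on-age example can be made concrete with a small simulation. The Python sketch below is only an illustration under stated assumptions: the data are synthetic, and the wage model, noise level, and top-code threshold are hypothetical values chosen for the demonstration, not Census Bureau parameters or the authors' own procedure. It treats top-coding as replacing every value above a threshold with the threshold itself, then compares the fitted regression slope before and after.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x in a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def top_code(values, threshold):
    """Replace every value above the threshold with the threshold."""
    return [min(v, threshold) for v in values]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic population (assumption): ages uniform on [20, 65].
ages = [rng.uniform(20, 65) for _ in range(5000)]

# Hypothetical wage model (assumption): wages rise by 1,000 per year
# of age, plus Gaussian noise with standard deviation 8,000.
wages = [10_000 + 1_000 * a + rng.gauss(0, 8_000) for a in ages]

threshold = 60_000  # hypothetical top-code value, not a real PUMS cutoff
coded_wages = top_code(wages, threshold)

slope_raw = ols_slope(ages, wages)
slope_coded = ols_slope(ages, coded_wages)
share_coded = sum(w > threshold for w in wages) / len(wages)

print(f"share of records top-coded: {share_coded:.1%}")
print(f"slope on original wages:    {slope_raw:.1f}")
print(f"slope on top-coded wages:   {slope_coded:.1f}")
```

In this setup the top-coded records are concentrated among older individuals, so capping their wages flattens the upper end of the wage-age relationship and pulls the fitted slope below the one estimated from the unmodified data.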

Similar resources

The Minnesota Population Center Data Integration Projects: Challenges of Harmonizing Census Microdata Across Time and Place

The Minnesota Population Center is developing three large historical census microdata series: the Integrated Public Use Microdata Series (IPUMS-USA), the International Integrated Microdata Series (IPUMS-International), and the North Atlantic Population Project (NAPP). Despite many similarities, each database presents particular challenges because of variations in source materials, organization ...

Frozen Film and FOSDIC Forms: Restoring the 1960 U.S. Census of Population and Housing.

In this article, the authors describe a collaboration of the Minnesota Population Center (MPC), the U.S. Census Bureau, and the National Archives and Records Administration to restore the lost data from the 1960 Census. The data survived on refrigerated microfilm in a cave in Lenexa, Kansas. The MPC is now converting the data to usable form. Once the restored data are processed, the authors int...

The Microdata Analysis System at the U.S. Census Bureau

The U.S. Census Bureau has the responsibility to release high quality data products while maintaining the confidentiality promised to all respondents under Title 13 of the U.S. Code. This paper describes a Microdata Analysis System (MAS) that is currently under development, which will allow users to receive certain statistical analyses of Census Bureau data, such as crosstabulations and regress...

Sampling with Synthesis: A New Approach for Releasing Public Use Census Microdata

Many statistical agencies disseminate samples of census microdata, i.e., data on individual records, to the public. Before releasing the microdata, agencies typically alter identifying or sensitive values to protect data subjects’ confidentiality, for example by coarsening, perturbing, or swapping data. These standard disclosure limitation techniques distort relationships and distributional fea...

When Excessive Perturbation Goes Wrong and Why IPUMS-International Relies Instead on Sampling, Suppression, Swapping, and Other Minimally Harmful Methods to Protect Privacy of Census Microdata

IPUMS-International disseminates population census microdata at no cost for 69 countries. Currently, a series of 212 samples totaling almost a half billion person records are available to researchers. Registration is required for researchers to gain access to the microdata. Statistics from Google Analytics show that IPUMS-International's lengthy, probing registration form is an effective deterr...

Publication date: 2014