Prediction of 2 × 2 tables of change from repeat cluster sampling of marginal counts

Author

  • Steen Magnussen
Abstract

Repeat cluster sampling of a binary (0,1) attribute at time 1 (Y1) and time 2 (Y2) in a finite population of discrete units is considered. All clusters contain m units, and a cluster provides only the marginal counts of ones and zeroes at the two time points. From these counts, we seek to predict a 2 × 2 table of the rates of no change (π11 = E[Y1Y2], π00 = E[(1 – Y1)(1 – Y2)]) and change (π10 = E[Y1(1 – Y2)], π01 = E[(1 – Y1)Y2]). Two predictors are proposed: the first is derived from the temporal correlation of marginal counts, and the second from the odds ratio of no change that maximizes a (pseudo-)likelihood of a noncentral hypergeometric distribution. The bias of the first is positive when there is a positive intracluster correlation of Y1, Y2, and Y1Y2, while the bias of the second is negative when the odds ratio of no change is >1. A proposed combined estimator worked well in three examples of change analysis with paired, classified Landsat images of forest cover type and cluster sampling with 3 × 3 arrays of 30 m × 30 m units (pixels). The 2 × 2 tables obtained from marginal counts were superior, in terms of mean absolute error, to estimates based on a direct unit-by-unit count when the time 2 image had a root mean square registration error of 0.5 pixel relative to the time 1 image. The proposed method is intended for settings where either a direct unit-by-unit estimation of the 2 × 2 table is compromised or the data (by design) consist of marginal counts from repeat cluster sampling.
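As a small illustration of why an odds ratio of no change is enough to complete the table, the sketch below recovers all four rates from the two marginal rates and an odds ratio. It is not the paper's estimator: it assumes the odds ratio is defined as ψ = π11·π00 / (π10·π01), it takes ψ as given rather than estimating it from cluster counts, and the function name `table_from_margins` and the arguments `p1`, `p2`, `psi` are hypothetical names chosen for this example.

```python
import math


def table_from_margins(p1, p2, psi):
    """Return the 2 x 2 table of rates implied by the marginal rates
    p1 = P(Y1 = 1), p2 = P(Y2 = 1) and an assumed odds ratio
    psi = pi11 * pi00 / (pi10 * pi01)."""
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0 and psi > 0.0):
        raise ValueError("need 0 < p1, p2 < 1 and psi > 0")

    if math.isclose(psi, 1.0):
        # psi = 1 means Y1 and Y2 are independent.
        pi11 = p1 * p2
    else:
        # pi11 solves (psi - 1) x^2 - s x + psi p1 p2 = 0;
        # the smaller root is the one that keeps every cell in [0, 1].
        s = 1.0 + (p1 + p2) * (psi - 1.0)
        disc = s * s - 4.0 * psi * (psi - 1.0) * p1 * p2
        pi11 = (s - math.sqrt(disc)) / (2.0 * (psi - 1.0))

    return {
        "pi11": pi11,                  # one at both times (no change)
        "pi10": p1 - pi11,             # one at time 1, zero at time 2
        "pi01": p2 - pi11,             # zero at time 1, one at time 2
        "pi00": 1.0 - p1 - p2 + pi11,  # zero at both times (no change)
    }
```

For example, `table_from_margins(0.5, 0.5, 9.0)` gives π11 = π00 = 0.375 and π10 = π01 = 0.125, so an odds ratio well above 1 concentrates the mass on the two no-change cells.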

Similar resources

Prior and Likelihood Choices in the Analysis of Ecological Data

A general statistical framework for ecological inference is presented, and a number of previously proposed approaches are described and critiqued within this framework. In particular, the assumptions that all approaches require to overcome the fundamental nonidentifiability problem of ecological inference are clarified. We describe a number of three-stage Bayesian hierarchical models that are f...

Estimating the number of zero-one multi-way tables via sequential importance sampling

In 2005, Chen et al. introduced a sequential importance sampling (SIS) procedure to analyze zero-one two-way tables with given fixed marginal sums (row and column sums) via the conditional Poisson (CP) distribution. They showed that compared with Monte Carlo Markov chain (MCMC)-based approaches, their importance sampling method is more efficient in terms of running time and also provides an eas...

Prediction in multilevel generalized linear models

We discuss prediction of random effects and of expected responses in multilevel generalized linear models. Prediction of random effects is useful for instance in small area estimation and disease mapping, effectiveness studies and model diagnostics. Prediction of expected responses is useful for planning, model interpretation and diagnostics. For prediction of random effects, we concentrate on ...

Sampling from Discrete Distributions:

At the 2001 FCSM Research Conference, Greene et al. introduced a problem in editing and imputation based on fire data. The editing problem consists of imputing values to cells in a 2x2 contingency table subject to extensive item and unit nonresponse. Mathematically, the nonresponse creates an incomplete 2-way table with partial counts for individual cells and marginal totals. Statistically, the...

arXiv:astro-ph/0103127v1 8 Mar 2001 CLUSTERING IN DEEP (SUBMILLIMETRE) SURVEYS

Hughes & Gaztañaga (2001, see article in these proceedings) have presented realistic simulations to address key issues confronting existing and forthcoming submm surveys. An important aspect illustrated by the simulations is the effect induced on the counts by the sampling variance of the large-scale galaxy clustering. We find factors of up to ∼ 2 − 4 variation (from the mean) in the extracted ...

Publication date: 2004