Modeling Nonnegative Data with Clumping at Zero: A Survey

Authors

  • Alan Agresti
  • Yongyi Min
Abstract:

Applications in which data take nonnegative values but have a substantial proportion of values at zero occur in many disciplines. The modeling of such “clumped-at-zero” or “zero-inflated” data is challenging. We survey models that have been proposed. We consider cases in which the response for the non-zero observations is continuous and in which it is discrete. For the continuous and then the discrete case, we review models for analyzing cross-sectional data. We then summarize extensions for repeated measurement analyses (e.g., in longitudinal studies), for which the literature is still sparse. We also mention applications in which more than one clump can occur and we suggest problems for future research.

Similar resources

Analysis of repeated measures data with clumping at zero.

Longitudinal or repeated measures data with clumping at zero occur in many applications in biometrics, including health policy research, epidemiology, nutrition, and meteorology. These data exhibit correlation because they are measured on the same subject over time or because subjects may be considered repeated measures within a larger unit such as a family. They present special challenges beca...

Modeling zero-inflated count data with glmmTMB

Ecological phenomena are often measured in the form of count data. These data can be analyzed using generalized linear mixed models (GLMMs) when observations are correlated in ways that require random effects. However, count data are often zero-inflated, containing more zeros than would be expected from the standard error distributions used in GLMMs, e.g., parasite counts may be exactly zero fo...

Modeling loss data by phase-type distribution

Insurers are always concerned about losses on the policies they cover and look for methods to model past loss data in order to make optimal decisions. This study introduces phase-type distributions for modeling loss data, including the relevant statistical inference and the use of the EM algorithm to estimate the distribution parameters. Finally, the possibility of using this distribution to model grouped data ...

A new approach to credibility premium for zero-inflated Poisson models for panel data

The main aim of this research is to obtain and compare credibility premiums in count models subject to non-reporting for longitudinal data. Predictive premiums are computed under squared-error and exponential loss functions and compared with each other. The desire to earn a bonus is one of the main reasons for not reporting accidents, and to benefit from a discount, people often refrain from reporting low-cost accidents. In this research ...

Multivariate Statistical Modeling with Survey Data

We describe an extension of the pseudo maximum likelihood (PML) estimation method developed by Skinner (1989) to multistage stratified cluster sampling designs, including finite population and unequal probability sampling. We conduct simulation studies to evaluate the performance of the proposed estimator. The estimator is also compared to the general estimating equation (GEE) method for linear r...

Journal title

volume 1

pages 7-33

publication date 2002-11
