Confidence intervals for negative binomial random variables of high dispersion.

Authors

  • David Shilane
  • Steven N Evans
  • Alan E Hubbard
Abstract

We consider the problem of constructing confidence intervals for the mean of a Negative Binomial random variable based upon sampled data. When the sample size is large, it is a common practice to rely upon a Normal distribution approximation to construct these intervals. However, we demonstrate that the sample mean of highly dispersed Negative Binomials exhibits a slow convergence in distribution to the Normal as a function of the sample size. As a result, standard techniques (such as the Normal approximation and bootstrap) will construct confidence intervals for the mean that are typically too narrow and significantly undercover at small sample sizes or high dispersions. To address this problem, we propose techniques based upon Bernstein's inequality or the Gamma and Chi Square distributions as alternatives to the standard methods. We investigate the impact of imposing a heuristic assumption of boundedness on the data as a means of improving the Bernstein method. Furthermore, we propose a ratio statistic relating the Negative Binomial's parameters that can be used to ascertain the applicability of the Chi Square method and to provide guidelines on evaluating the length of all proposed methods. We compare the proposed methods to the standard techniques in a variety of simulation experiments and consider data arising in the serial analysis of gene expression and traffic flow in a communications network.
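The undercoverage of the Normal approximation described above can be explored with a small simulation. The following is a minimal sketch, not the authors' code: it assumes the mean/dispersion parameterization (mean mu, dispersion k, variance mu + mu^2/k), draws Negative Binomial variates through a Gamma-Poisson mixture, and estimates how often the usual interval (sample mean plus or minus 1.96 standard errors) contains the true mean. The function names and the parameter values in the usage note are illustrative assumptions.

```python
import math
import random
import statistics


def sample_negative_binomial(mu, k, rng):
    """Draw one Negative Binomial variate with mean mu and dispersion k.

    Assumed parameterization: variance = mu + mu**2 / k, so a small k
    means high dispersion. Uses the Gamma-Poisson mixture representation.
    """
    # Gamma with shape k and scale mu / k has mean mu.
    lam = rng.gammavariate(k, mu / k)
    # Poisson(lam): count unit-rate exponential arrivals within [0, lam].
    count = 0
    t = rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def normal_ci(data, z=1.959963984540054):
    """Normal-approximation interval for the mean (z defaults to the 97.5% quantile)."""
    n = len(data)
    mean = statistics.fmean(data)
    std_err = statistics.stdev(data) / math.sqrt(n)
    return mean - z * std_err, mean + z * std_err


def coverage(mu, k, n, reps, seed=0):
    """Fraction of simulated samples whose Normal interval contains mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        data = [sample_negative_binomial(mu, k, rng) for _ in range(n)]
        lo, hi = normal_ci(data)
        if lo <= mu <= hi:
            hits += 1
    return hits / reps
```

For instance, comparing `coverage(mu=10, k=0.05, n=30, reps=500)` with `coverage(mu=10, k=100, n=30, reps=500)` contrasts a highly dispersed case with a nearly Poisson one at the same sample size; the first would be expected to fall well short of the nominal 95% level.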

Similar resources

Growth Estimators and Confidence Intervals for the Mean of Negative Binomial Random Variables with Unknown Dispersion

The negative binomial distribution becomes highly skewed under extreme dispersion. Even at moderately large sample sizes, the sample mean exhibits a heavy right tail. The standard normal approximation often does not provide adequate inferences about the data’s expected value in this setting. In previous work, we have examined alternative methods of generating confidence intervals for the expect...

Comparison of five introduced confidence intervals for the binomial proportion

Many confidence intervals have been introduced for the binomial proportion. In this paper, we compare five well-known intervals based on their exact confidence coefficient and average coverage probability.

Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases

BACKGROUND: The negative binomial distribution is commonly used throughout biology as a model for overdispersed count data, with attention focused on the negative binomial dispersion parameter, k. A substantial literature exists on the estimation of k, but most attention has focused on datasets that are not highly overdispersed (i.e., those with k ≥ 1), and the accuracy of confidence intervals ...

Binomial Distribution Sample Confidence Interval Estimation for Positive and Negative Likelihood Ratio Medical Key Parameters

Likelihood ratio medical key parameters calculated from the categorical results of diagnostic tests are usually reported together with their confidence intervals, computed using the normal approximation to the binomial distribution. The approximation creates known anomalies, especially for limit cases. In order to improve the quality of estimation, four new methods (called here RPAC, RPAC...

Journal:
  • The international journal of biostatistics

Volume 6, Issue 1

Publication date: 2010