Practical Confidence and Prediction Intervals
Abstract
We propose a new method to compute prediction intervals. Especially for small data sets, the width of a prediction interval depends not only on the variance of the target distribution, but also on the accuracy of our estimator of the mean of the target, i.e., on the width of the confidence interval. The confidence interval follows from the variation in an ensemble of neural networks, each of them trained and stopped on bootstrap replicates of the original data set. A second improvement is the use of the residuals on validation patterns instead of on training patterns for estimating the variance of the target distribution. As illustrated on a synthetic example, our method is better than existing methods with regard to extrapolation and interpolation in data regimes with a limited amount of data, and yields prediction intervals whose actual confidence levels are closer to the desired confidence levels.

1 STATISTICAL INTERVALS

In this paper we consider feedforward neural networks for regression tasks: estimating an underlying mathematical function between input and output variables based on a finite number of data points possibly corrupted by noise. We are given a set of $P_{\text{data}}$ pairs $\{\vec{x}^\mu, t^\mu\}$ which are assumed to be generated according to

$$t(\vec{x}) = f(\vec{x}) + \xi(\vec{x}) \,, \qquad (1)$$

where $\xi(\vec{x})$ denotes noise with zero mean. Straightforwardly trained on such a regression task, the output of a network $o(\vec{x})$ given a new input vector $\vec{x}$ can be interpreted as an estimate of the regression $f(\vec{x})$, i.e., of the mean of the target distribution given input $\vec{x}$. Sometimes this is all we are interested in: a reliable estimate of the regression $f(\vec{x})$. In many applications, however, it is important to quantify the accuracy of our statements. For regression problems we can distinguish two different aspects: the accuracy of our estimate of the true regression and the accuracy of our estimate with respect to the observed output. Confidence intervals deal with the first aspect, i.e., consider the distribution of the quantity $f(\vec{x}) - o(\vec{x})$; prediction intervals deal with the latter, i.e., treat the quantity $t(\vec{x}) - o(\vec{x})$. We see from

$$t(\vec{x}) - o(\vec{x}) = \left[ f(\vec{x}) - o(\vec{x}) \right] + \xi(\vec{x}) \,, \qquad (2)$$

that a prediction interval necessarily encloses the corresponding confidence interval.

In [7] a method somewhat similar to ours is introduced to estimate both the mean and the variance of the target probability distribution. It is based on the assumption that there is a sufficiently large data set, i.e., that there is no risk of overfitting and that the neural network finds the correct regression. In practical applications with limited data sets such assumptions are too strict. In this paper we propose a new method which estimates the inaccuracy of the estimator through bootstrap resampling and corrects for the tendency to overfit by considering the residuals on validation patterns rather than those on training patterns.

2 BOOTSTRAPPING AND EARLY STOPPING

Bootstrapping [3] is based on the idea that the available data set is nothing but a particular realization of some unknown probability distribution. Instead of sampling over the "true" probability distribution, which is obviously impossible, one defines an empirical distribution. With so-called naive bootstrapping the empirical distribution is a sum of delta peaks on the available data points, each with probability content $1/P_{\text{data}}$.
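To make the empirical distribution concrete, the short sketch below draws one naive bootstrap sample: since every data point carries probability content $1/P_{\text{data}}$, a bootstrap sample amounts to $P_{\text{data}}$ draws with replacement from the pattern indices. The index-based representation and the NumPy routines are our own illustrative choices, not part of the original method.

```python
import numpy as np

P_data = 10                      # number of available data points
rng = np.random.default_rng(42)

# Naive bootstrapping: the empirical distribution places a delta peak
# with probability content 1/P_data on every available data point, so a
# bootstrap sample is simply P_data draws with replacement from the
# pattern indices 0..P_data-1.
bootstrap_sample = rng.integers(0, P_data, size=P_data)

# Some patterns appear several times in the sample, others not at all.
indices, counts = np.unique(bootstrap_sample, return_counts=True)
print(dict(zip(indices.tolist(), counts.tolist())))
```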
A bootstrap sample is a collection of $P_{\text{data}}$ patterns drawn with replacement from this empirical probability distribution. This bootstrap sample is nothing but our training set, and all patterns that do not occur in the training set are by definition part of the validation set. For large $P_{\text{data}}$, the probability that a pattern becomes part of the validation set is $(1 - 1/P_{\text{data}})^{P_{\text{data}}} \approx 1/e \approx 0.37$. When training a neural network on a particular bootstrap sample, the weights are adjusted in order to minimize the error on the training data. Training is stopped when the error on the validation data starts to increase. This so-called early stopping procedure is a popular strategy to prevent overfitting in neural networks and can be viewed as an alternative to regularization techniques such as weight decay. In this context bootstrapping is just a procedure to generate subdivisions into training and validation sets, similar to k-fold cross-validation or subsampling. On each of the $n_{\text{run}}$ bootstrap replicates we train and stop a single neural network. The output of network $i$ on input vector $\vec{x}^\mu$ is written $o_i(\vec{x}^\mu) \equiv o_i^\mu$. As "the" estimate of our ensemble of networks for the regression $f(\vec{x})$ we take the average output

$$m(\vec{x}) \equiv \frac{1}{n_{\text{run}}} \sum_{i=1}^{n_{\text{run}}} o_i(\vec{x}) \,.$$

This is a so-called "bagged" estimator [2]. In [5] it is shown that a proper balancing of the network outputs can yield even better results.
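The following is a minimal sketch of the ensemble construction under these definitions. It assumes a generic `fit_with_early_stopping(X_train, t_train, X_val, t_val)` routine that returns an object with a `predict` method; that routine and all names below are illustrative placeholders we introduce here, not part of the original paper.

```python
import numpy as np

def train_bootstrap_ensemble(X, t, n_run, fit_with_early_stopping, rng):
    """Train and early-stop one network per bootstrap replicate.

    Each replicate (drawn with replacement) serves as the training set;
    the patterns that never occur in it form the validation set used to
    decide when to stop training.
    """
    P_data = len(X)
    networks = []
    for _ in range(n_run):
        train_idx = rng.integers(0, P_data, size=P_data)      # with replacement
        val_idx = np.setdiff1d(np.arange(P_data), train_idx)  # left-out patterns
        net = fit_with_early_stopping(X[train_idx], t[train_idx],
                                      X[val_idx], t[val_idx])
        networks.append(net)
    return networks

def bagged_mean(networks, X_new):
    """Ensemble estimate m(x) = (1/n_run) * sum_i o_i(x) of the regression."""
    return np.mean([net.predict(X_new) for net in networks], axis=0)
```

In this sketch the size of `val_idx` fluctuates from replicate to replicate around the fraction $1/e \approx 0.37$ of $P_{\text{data}}$ derived above.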