Calibrating Prediction Regions
Author
Abstract
Suppose the variable X to be predicted and the learning sample Y_n that was observed have a joint distribution, which depends on an unknown parameter θ. The parameter θ can be finite or infinite dimensional. A prediction region D_n for X is a random set, depending on Y_n, that contains X with prescribed probability α. This paper studies methods for simultaneously controlling the conditional coverage probability of D_n, given Y_n, and the overall (unconditional) coverage probability of D_n. The basic construction yields a prediction region D_n with the following properties in regular models: both the conditional and overall coverage probabilities of D_n converge to α as the size n of the learning sample increases. The convergence of the former is in probability. Moreover, the asymptotic distribution of the conditional coverage probability about α is typically normal, and the overall coverage probability tends to α at rate n^{-1}. Can one reduce the dispersion of the conditional coverage probability about α and increase the rate at which the overall coverage probability converges to α? Both issues are addressed. The paper establishes a lower bound for the asymptotic dispersion of the conditional coverage probability. The paper also shows how to calibrate D_n so as to make its overall coverage probability converge to α at the faster rate n^{-2}. To first order, this calibration adjustment does not affect the asymptotic distribution or dispersion of the conditional coverage probability. In general, a bootstrap Monte Carlo algorithm accomplishes the calibration of D_n. In special cases, analytical calibration is possible.

* This research was supported in part by NSF Grant DMS 87-01426. Part of the work was done while the author was a guest of Sonderforschungsbereich 123 at Universität Heidelberg. The author thanks G. Sawitzki and F. Seillier for helpful comments.
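The abstract does not spell out the bootstrap Monte Carlo calibration, so the following is only a minimal sketch of the general idea under assumptions that are not taken from the paper: the learning sample is i.i.d. normal, the uncalibrated region is the plug-in interval mean ± z·sd, and a parametric bootstrap is used. All function names are hypothetical. The sketch estimates the overall coverage of the plug-in interval as a function of its nominal level, then searches for the nominal level whose estimated overall coverage equals the target α.

```python
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()


def bootstrap_fits(sample, n_boot, rng):
    """Refit (mean, sd) on parametric bootstrap samples from the fitted normal model."""
    m, s = mean(sample), stdev(sample)
    n = len(sample)
    fits = []
    for _ in range(n_boot):
        resample = [rng.gauss(m, s) for _ in range(n)]
        fits.append((mean(resample), stdev(resample)))
    return fits


def estimated_coverage(m, s, fits, level):
    """Bootstrap estimate of the overall coverage of the plug-in interval.

    For each bootstrap fit, the conditional coverage under the fitted model
    N(m, s^2) is computed exactly; the average estimates overall coverage.
    """
    fitted = NormalDist(m, s)
    z = STD_NORMAL.inv_cdf((1.0 + level) / 2.0)
    total = 0.0
    for m_star, s_star in fits:
        lo, hi = m_star - z * s_star, m_star + z * s_star
        total += fitted.cdf(hi) - fitted.cdf(lo)
    return total / len(fits)


def calibrated_level(sample, alpha, n_boot=2000, seed=0):
    """Nominal level whose bootstrap-estimated overall coverage equals alpha."""
    rng = random.Random(seed)
    m, s = mean(sample), stdev(sample)
    fits = bootstrap_fits(sample, n_boot, rng)
    # Estimated coverage is increasing in the nominal level, so bisect on it.
    lo, hi = 1e-6, 1.0 - 1e-9
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if estimated_coverage(m, s, fits, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    data_rng = random.Random(1)
    learning_sample = [data_rng.gauss(10.0, 2.0) for _ in range(20)]
    beta = calibrated_level(learning_sample, alpha=0.9)
    z = STD_NORMAL.inv_cdf((1.0 + beta) / 2.0)
    m, s = mean(learning_sample), stdev(learning_sample)
    print("calibrated nominal level:", beta)
    print("calibrated interval:", (m - z * s, m + z * s))
```

In this toy model the plug-in interval undercovers for small n, so the calibrated nominal level comes out above α. The same bootstrap fits are reused for every candidate level, which keeps the estimated coverage monotone in the level and makes the bisection well behaved.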
Similar resources
Weather on Target
The long term goal of this project is to develop tools for the automated analysis and nowcasting of conventional and remotely-sensed meteorological data, primarily for use at regional forecast centers, and aboard ship in remote littoral regions in which the Navy operates. In contrast to predicting weather out to several days (the purpose of numerical weather prediction), the purpose here is to ...
Comparing Efficiency of Hill Slope Erosion Model (HEM) in Dry and Abandoned Land (Case Study: Khosbijan Research Center, Arak)
Soil erosion and sediment yield from watersheds constrain the sustainable use of land resources and are considered among the most critical environmental issues. Prediction of storm-wise soil erosion and sediment yield is very important, especially in arid and semiarid regions, because of the small number of events and the high intensity of rainfall. Evaluation of soil erosion by existing models is needed as an i...
Accounting for outcome and process measures in dynamic decision-making tasks through model calibration
Computational models of learning and the theories they represent are often validated by calibrating them to human data on decision outcomes. However, only a few models explain the process by which these decision outcomes are reached. We argue that models of learning should be able to reflect the process through which the decision outcomes are reached, and validating a model on the process is li...
A Strategy for Calibrating the HSPF Model
Angélica L. Gutiérrez-Magness, Doctor of Philosophy, 2005. Dissertation directed by Professor Richard H. McCuen, Department of Civil and Environmental Engineering. The development of Total Maximum Daily Loads (TMDLs) and environmental policies relies on the application of mathematical models, both empirical and deterministic. The Hydrol...
Diversity regularization in deep ensembles
Calibrating the confidence of supervised learning models is important in a variety of contexts where the certainty of predictions must be reliable. However, it has been reported that deep neural network models are often too poorly calibrated for complex tasks that require reliable uncertainty estimates in their predictions. In this work, we are proposing a strategy for training deep e...
Calibrating fixed- and mixed-effects taper equations
Accurate and affordable measurements of upper-stem diameters are now possible thanks to recent advances in laser technology. Measurement of the midpoint upper-stem diameter can be employed to improve the accuracy of diameter predictions along the tree bole. Felled-tree data from a loblolly pine (Pinus taeda L.) plantation was used to evaluate two approaches: (1) calibrating a segmented taper eq...