Sequential Testing for Early Stopping
Abstract
Online evaluation methods, such as A/B and interleaving experiments, are widely used for search engine evaluation. Since they rely on noisy implicit user feedback, running each experiment takes considerable time. Recently, the problem of reducing the duration of online experiments has received substantial attention from the research community. However, the possibility of using sequential statistical testing procedures to reduce the time required for evaluation experiments remains less studied. Such sequential testing procedures allow an experiment to stop early, once the collected data are sufficient to draw a conclusion. In this work, we study the usefulness of sequential testing procedures for both interleaving and A/B testing. We propose modified versions of the O’Brien & Fleming and MaxSPRT sequential tests that are applicable to testing in the interleaving scenario. Similarly, for A/B experiments, we assess the usefulness of the O’Brien & Fleming test, as well as that of our proposed MaxSPRT-based sequential testing procedure. In our experiments on datasets containing 115 interleaving and 41 A/B testing experiments, we observe that considerable reductions in the average experiment duration can be achieved by using our proposed tests. In particular, for A/B experiments, the average experiment duration can be reduced by up to 66% in comparison with a single-step test procedure, and by up to 44% in comparison with the O’Brien & Fleming test. Similarly, a marked relative reduction of 63% in the duration of the interleaving experiments can be achieved.
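As a rough illustration of how a sequential test can stop an experiment early, the sketch below implements a textbook O'Brien & Fleming-style stopping rule in Python. It is not the modified procedure proposed in the paper. The function names, the interim z-scores, and the constant 2.04 (a commonly tabulated final critical value for five equally spaced looks at a two-sided 0.05 level) are assumptions made for the example.

```python
import math

def obrien_fleming_bounds(num_looks, final_critical_value):
    """Critical |z| values for looks 1..K, using the shape c * sqrt(K / k).

    final_critical_value (c) is assumed to come from a published table
    for the chosen number of looks and significance level.
    """
    return [final_critical_value * math.sqrt(num_looks / k)
            for k in range(1, num_looks + 1)]

def first_stopping_look(z_scores, bounds):
    """Return the 1-based look at which |z| first reaches its bound, else None."""
    for look, (z, bound) in enumerate(zip(z_scores, bounds), start=1):
        if abs(z) >= bound:
            return look
    return None

# Hypothetical example: five planned looks, made-up interim z-scores.
bounds = obrien_fleming_bounds(num_looks=5, final_critical_value=2.04)
interim_z = [1.1, 2.0, 2.9, 3.1, 3.3]
stop_at = first_stopping_look(interim_z, bounds)
```

The boundaries are very strict at early looks and relax toward the final critical value, so an experiment stops early only when the interim evidence is strong.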
Similar Resources
Group-sequential analysis may allow for early trial termination: illustration by an intra-observer repeatability study
BACKGROUND: Group-sequential testing is widely used in pivotal therapeutic, but rarely in diagnostic research, although it may save studies, time, and costs. The purpose of this paper was to demonstrate a group-sequential analysis strategy in an intra-observer study on quantitative FDG-PET/CT measurements, illuminating the possibility of early trial termination which implicates significant poten...
Risk of Sequential Estimator of the Failure Rate of Exponential Distribution under Convex Boundary
In this paper, the exact distribution of the stopping variable, and the moment and risk of the sequential estimator of the failure rate of the exponential distribution under a convex boundary, are obtained. The corresponding Poisson process is used to derive the exact distribution of the stopping variable of the sequential estimator of the failure rate. In the end the exact values of mean and risk ...
Sequential Stopping Rules for Fixed-Sample Acceptance Tests
The occurrence of early failures in a fixed-sample acceptance test, where the sample observations are obtained sequentially, presents an interesting decision problem. It may be desirable to abandon the test at an early stage if the conditional probability of passing is small and the testing cost is high. This paper presents a stopping rule based on the maximum-likelihood estimate of total cos...
On Optimal Stopping Problems in Sequential Hypothesis Testing
After a brief survey of a variety of optimal stopping problems in sequential testing theory, we give a unified treatment of these problems by introducing a general class of loss functions and prior distributions. In the context of a one-parameter exponential family, this unified treatment leads to relatively simple sequential tests involving generalized likelihood ratio statistics or mixture li...
Sequential-Based Approach for Estimating the Stress-Strength Reliability Parameter for Exponential Distribution
In this paper, two-stage and purely sequential estimation procedures are considered to construct fixed-width confidence intervals for the reliability parameter under the stress-strength model when the stress and strength are independent exponential random variables with different scale parameters. The exact distribution of the stopping rule under the purely sequential procedure is approximated ...