Information Graphs for Epidemiological Applications of the Kullback-Leibler Divergence
Abstract
Dear Editor,

The topic addressed by this brief communication is the quantification of diagnostic information and, in particular, a method of illustrating graphically the quantification of diagnostic information for binary tests. To begin, consider the following outline procedure for the development of such a test. An appropriate indicator variable that will serve as a proxy for the actual variable of interest, often the disease status of a subject, is identified. Two mutually exclusive groups are formed: one of definitively diseased subjects, the other of definitively non-diseased subjects. The classification of subjects into diseased and non-diseased groups is made by a gold standard method, independent of the putative indicator variable. The value of this indicator variable is recorded for all subjects in both groups, leading to separate distributions of indicator scores for the diseased and non-diseased groups. The indicator variable is usually calibrated in such a way that, on average, diseased subjects tend to have larger indicator scores than non-diseased subjects. Typically, we find that the two distributions of indicator scores overlap and, in such cases, any choice of a particular threshold indicator score will result in imperfect discrimination between the diseased and non-diseased groups on the basis of the indicator variable.

Within this setting, we will consider two information theoretic analyses of diagnostic information [1, 2]. The Kullback-Leibler divergence, also referred to as the Kullback-Leibler distance and the relative entropy, is a measure of the distance between two distributions [3]. Lee has described an application of the Kullback-Leibler distance in characterising diagnostic tests [1]. In Lee’s application, the distributions in question describe the test outcomes of diseased and of non-diseased subjects. Lee described the Kullback-Leibler distance as “an abstract concept arising from statistics and information theory” [1]. Here, restricting our attention to binary diagnostic tests, we construct a diagrammatic interpretation of the Kullback-Leibler distance as used in Lee’s application. We refer to this as an ‘information graph’, following Benish [4]. Information graphs provide a visual basis for the evaluation and comparison of binary diagnostic tests. This communication is motivated by the hope that a diagrammatic interpretation of the Kullback-Leibler distance will make it seem less of an abstract concept, and so make its application more accessible to epidemiologists and diagnosticians.

Some analysis and notation are needed to describe the basis for our information graph and its correspondence with Lee’s analysis, although once that is done, the resulting graph is straightforward to construct. As far as possible we will use Lee’s original notation. No re-interpretation of Lee’s analysis is required; the objective is solely to provide a new diagrammatic format in which to present such analysis. For a binary diagnostic test, the test outcomes of diseased and of non-diseased subjects are Bernoulli distributed. Generically, the Bernoulli distribution is a discrete probability distribution with two possible outcomes, denoted $x \in \{0, 1\}$, in which $x = 1$ occurs with probability $\theta_1$ and $x = 0$ occurs with probability $\theta_2 = 1 - \theta_1$. We write:

$$\theta(x) = \theta_1^{\,x}\,\theta_2^{\,1-x}; \qquad x \in \{0, 1\}; \qquad \theta_1 + \theta_2 = 1; \qquad 0 < \theta_1 < 1.$$
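Written out for the two possible outcomes, this compact form states simply that

$$\theta(1) = \theta_1, \qquad \theta(0) = \theta_2 = 1 - \theta_1.$$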
Then the Kullback-Leibler distances between two Bernoulli distributions $f(x) = f_1^{\,x} f_2^{\,1-x}$ (describing the test outcomes of diseased subjects) and $g(x) = g_1^{\,x} g_2^{\,1-x}$ (describing the test outcomes of non-diseased subjects), as calculated by Lee, are:
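$$D(f\,\|\,g) = f_1 \log\frac{f_1}{g_1} + f_2 \log\frac{f_2}{g_2}, \qquad D(g\,\|\,f) = g_1 \log\frac{g_1}{f_1} + g_2 \log\frac{g_2}{f_2},$$

where $f_2 = 1 - f_1$, $g_2 = 1 - g_1$, and the base of the logarithm fixes the units of information (bits for base 2, nats for the natural logarithm). With the usual convention that $x = 1$ denotes a positive test result, $f_1$ is the test’s sensitivity and $g_1$ is its false positive rate, $1 - \text{specificity}$.

To make the calculation concrete, the following minimal Python sketch (not part of the original letter) computes both distances for a hypothetical binary test; the sensitivity and specificity values are purely illustrative assumptions.

```python
import math

def kl_bernoulli(p1, q1):
    """Kullback-Leibler distance D(p || q), in nats, between two Bernoulli
    distributions with success probabilities p1 and q1 (both strictly in (0, 1))."""
    p2, q2 = 1.0 - p1, 1.0 - q1
    return p1 * math.log(p1 / q1) + p2 * math.log(p2 / q2)

# Hypothetical test characteristics (illustrative values, not data from the letter):
sensitivity = 0.90   # f1 = P(positive result | diseased)
specificity = 0.80   # so g1 = 1 - specificity = P(positive result | non-diseased)

f1 = sensitivity
g1 = 1.0 - specificity

d_fg = kl_bernoulli(f1, g1)   # D(f || g): expectation under f of log(f / g)
d_gf = kl_bernoulli(g1, f1)   # D(g || f): expectation under g of log(g / f)

print(f"D(f||g) = {d_fg:.4f} nats")
print(f"D(g||f) = {d_gf:.4f} nats")
```

The two distances generally differ, since the Kullback-Leibler divergence is not symmetric; dividing the results by $\ln 2$ converts them from nats to bits.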