Generalized Statistical Tests for mRNA and Protein Subcellular Spatial Patterning against Complete Spatial Randomness
Abstract
We derive generalized estimators for a number of spatial statistics that have been used in the analysis of spatially resolved omics data, such as Ripley's K, H and L functions, the clustering index, and the degree of clustering, which allow these statistics to be calculated on data modelled by arbitrary random measures. Our estimators generalize those typically used to calculate these statistics on point process data, allowing them to be calculated on random measures that assign continuous values to spatial regions, for instance to model protein intensity. The clustering index (H∗) compares Ripley's H function calculated empirically to its distribution under complete spatial randomness (CSR), which leads us to consider CSR null hypotheses for random measures that are not point processes when generalizing this statistic. For this purpose, we consider restricted classes of completely random measures that can be simulated directly (Gamma processes and marked Poisson processes), as well as the general class of all CSR random measures, for which we derive an exact permutation-test-based H∗ estimator. We establish several properties of the estimators we propose, including bounds on the accuracy of our general Ripley K estimator, its relationship to a previous estimator for the cross-correlation measure, and the relationship of our generalized H∗ estimator to a number of previous statistics. We test the ability of our approach to identify spatial patterning on synthetic and biological data. With respect to the latter, we demonstrate our approach on mixed omics data, using Fluorescent In Situ Hybridization (FISH) and Immunofluorescence (IF) data to probe for mRNA and protein subcellular localization patterns, respectively, in polarizing mouse fibroblasts on micropatterned cells.
Using the generalized clustering index and degree of clustering statistics we propose, we observe correlated patterns of clustering over time for corresponding mRNAs and proteins, suggesting a deterministic effect of mRNA localization on protein localization for several pairs tested, including one case in which spatial patterning at the mRNA level has not been previously demonstrated.
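As a rough illustration of the idea described in the abstract, the sketch below computes a naive weighted version of Ripley's K on pixel data, converts it to H(r) = sqrt(K(r)/π) − r, and standardizes H against a permutation null in which pixel values are shuffled across locations. This is not the estimator derived in the paper: it assumes equal-area pixels, applies no edge correction, and uses a simple z-score in place of the paper's H∗ definition. The function names `weighted_ripley_k`, `ripley_h`, and `permutation_h_zscore` are hypothetical and chosen only for this example.

```python
import numpy as np


def weighted_ripley_k(coords, weights, radii, area):
    """Naive weighted Ripley K without edge correction (illustrative only).

    coords  : (n, 2) array of pixel or point locations
    weights : (n,) non-negative mass at each location (1.0 for plain points)
    radii   : distances r at which K(r) is evaluated
    area    : area of the observation window
    """
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)
    radii = np.asarray(radii, dtype=float)
    # Pairwise Euclidean distances between all locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Product of masses for each ordered pair, excluding self-pairs.
    pair_w = np.outer(weights, weights)
    np.fill_diagonal(pair_w, 0.0)
    total = weights.sum()
    return np.array([area * pair_w[dist <= r].sum() / total ** 2 for r in radii])


def ripley_h(k_values, radii):
    """H(r) = L(r) - r with L(r) = sqrt(K(r) / pi), for 2D data."""
    return np.sqrt(np.asarray(k_values) / np.pi) - np.asarray(radii, dtype=float)


def permutation_h_zscore(coords, weights, radii, area, n_perm=200, seed=0):
    """Standardize the observed H(r) against a permutation null.

    Assumption: under CSR the masses of equal-area pixels are exchangeable,
    so shuffling weights across locations samples from the null.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    observed = ripley_h(weighted_ripley_k(coords, weights, radii, area), radii)
    null = np.array([
        ripley_h(
            weighted_ripley_k(coords, rng.permutation(weights), radii, area),
            radii,
        )
        for _ in range(n_perm)
    ])
    sd = null.std(axis=0)
    sd = np.where(sd > 0, sd, np.nan)  # undefined where the null has no spread
    return (observed - null.mean(axis=0)) / sd


if __name__ == "__main__":
    # Toy 10x10 image with one bright 3x3 patch standing in for protein intensity.
    ii, jj = np.meshgrid(np.arange(10), np.arange(10), indexing="ij")
    coords = np.column_stack([ii.ravel() + 0.5, jj.ravel() + 0.5])
    image = np.ones((10, 10))
    image[:3, :3] = 50.0
    z = permutation_h_zscore(coords, image.ravel(), radii=[1.5, 3.0], area=100.0)
    print(z)  # large positive values indicate clustering relative to the null
```

With unit weights the first function reduces to a plain (uncorrected) point-pattern K, which is one way to see how a point process fits into the same framework as continuous intensities.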
Similar resources
TESTING FOR “RANDOMNESS” IN SPATIAL POINT PATTERNS, USING TEST STATISTICS BASED ON ONE-DIMENSIONAL INTER-EVENT DISTANCES
To test for “randomness” in spatial point patterns, we propose two test statistics that are obtained by “reducing” two-dimensional point patterns to one-dimensional ones. The exact and asymptotic distributions of these statistics are also derived.
Parameter Estimation in Spatial Generalized Linear Mixed Models with Skew Gaussian Random Effects using Laplace Approximation
Spatial generalized linear mixed models are commonly used for modelling non-Gaussian discrete spatial responses. We present an algorithm for parameter estimation in these models using a Laplace approximation of the likelihood function. In these models, the spatial correlation structure of the data is captured by random effects or latent variables. In most spatial analyses, it is assumed that rando...
Quantifying spatial structure in experimental observations and agent-based simulations using pair-correlation functions.
We define a pair-correlation function that can be used to characterize spatiotemporal patterning in experimental images and snapshots from discrete simulations. Unlike previous pair-correlation functions, the pair-correlation functions developed here depend on the location and size of objects. The pair-correlation function can be used to indicate complete spatial randomness, aggregation, or seg...
Structural analysis of sterol distributions in the plasma membrane of living cells.
Although plasma membrane (PM) cholesterol-rich and -poor domains have been isolated by subcellular fractionation, the real-time arrangement of cholesterol in such domains in living cells is still unclear. Therefore, dehydroergosterol (DHE), a naturally occurring fluorescent sterol, was incorporated into cultured L-cell fibroblasts. Two PM markers, the enhanced cyan fluorescent protein (ECFP-Mem...
Spatial count models on the number of unhealthy days in Tehran
Spatial count data are common in sciences such as environmental science, meteorology, geology and medicine. Spatial generalized linear models based on Poisson (Poisson-lognormal spatial model) and binomial (binomial-logitnormal spatial model) distributions are often used to analyze discrete count data in which spatial correlation is observed. The likelihood function of these models i...