s-CorrPlot: An Interactive Scatterplot for Exploring Correlation
Abstract
The degree of correlation between variables is used in many data analysis applications as a key measure of interdependence. However, the most common techniques for exploratory analysis of pairwise correlation in multivariate datasets, such as scatterplot matrices and clustered heatmaps, do not scale well to large datasets, either computationally or visually. We present a new visualization, called the s-CorrPlot, that is capable of encoding pairwise correlation between hundreds of thousands of variables. The s-CorrPlot encodes correlation between variables spatially, as points on a scatterplot, using the geometric structure underlying Pearson's correlation. Furthermore, we extend the s-CorrPlot with interactive techniques that enable animation of the scatterplot to new projections of the correlation space, as illustrated in the companion video in Supplemental Materials. We provide the s-CorrPlot as an open-source R package and validate its effectiveness through a variety of methods, including a case study with a biology collaborator.
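The geometric structure mentioned in the abstract can be illustrated with a minimal sketch. It relies on a standard fact: once each variable is mean-centered and scaled to unit length, the dot product of two variables equals their Pearson correlation, so every variable becomes a point on a unit hypersphere. The sketch then projects those points onto a 2D plane spanned by two chosen variables. The function names (`standardize`, `project`), the `primary`/`secondary` parameters, and the use of Gram-Schmidt to build the plane are assumptions made for illustration; they are not the API of the s-CorrPlot R package.

```python
import math

def standardize(values):
    """Mean-center a variable and scale it to unit Euclidean length."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        raise ValueError("a constant variable has no defined correlation")
    return [c / norm for c in centered]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def pearson(x, y):
    """Textbook Pearson correlation, used here only for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def project(variables, primary, secondary):
    """Project standardized variables onto the plane spanned by two variables.

    Assumption: the plane is built from `primary` and the part of
    `secondary` orthogonal to it (one Gram-Schmidt step).
    """
    p = standardize(primary)
    s = standardize(secondary)
    r = dot(s, p)  # correlation between the two chosen variables
    ortho = [si - r * pi for si, pi in zip(s, p)]
    norm = math.sqrt(dot(ortho, ortho))
    if norm == 0.0:
        raise ValueError("primary and secondary are perfectly correlated")
    q = [o / norm for o in ortho]
    points = []
    for v in variables:
        z = standardize(v)
        # x is exactly the correlation with `primary`; y is the coordinate
        # along the orthogonalized `secondary` direction.
        points.append((dot(z, p), dot(z, q)))
    return points
```

Under these assumptions, each point's x-coordinate is the variable's correlation with `primary`, and every point falls inside the unit disk because a projection cannot lengthen a unit vector. Choosing a different pair of variables yields a different projection of the same correlation space.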
Similar resources
s-CorrPlot: Encoding and Exploring Correlation
Figure 1: Visualizations of correlation for a dataset containing 22,000 variables. The left two images show the correlation coefficients using a heatmap, clustered with (a) average linkage, and (b) complete linkage. The visible patterns in the heatmap are highly dependent on the clustering algorithm. In (c), our novel s-CorrPlot spatially encodes correlation coefficients, highlighting very diff...
Interactive Methods for Exploring Particle Simulation Data
In this work, we visualize high-dimensional particle simulation data using a suite of scatterplot-based visualizations coupled with interactive selection tools. We use traditional 2D and 3D projection scatterplots as well as a novel oriented-disk rendering style to convey various information about the data. Interactive selection tools allow physicists to manually classify “interesting” sets of ...
Law Enforcement Resource Allocation (LERA) System
LERA, the information visualization system presented in this paper, is an interactive, scatterplot visualization system that was designed to support crime analysts in exploring the effects of various law enforcement administration programs and policies on crime rates. Several important features were incorporated in our system to meet this goal. In particular LERA provides the user with the abil...
Tasks to Tease Apart Scatterplot Design Decisions
Scatterplots are among the most common methods for exploring and presenting data, covering a wide range of tasks and designs. The variety of scatterplot designs has created a proliferation of potential design decisions to consider when constructing a scatterplot. However, there remain many unexamined assumptions in respect to the trade-offs between these decisions. He...
Quality Metrics Driven Approach to Visualize Multidimensional Data in Scatterplot Matrix
Extracting meaningful information out of vast amounts of high-dimensional data is challenging. Prior research studies have been trying to solve these problems through either automatic data analysis or interactive visualization approaches. Our grand goal is to derive representative and generalizable quality metrics and to apply these to amplify interesting patterns as well as to mute the uninter...