Towards Spectral Sparsification of Simplicial Complexes Based on Generalized Effective Resistance
Abstract
As a generalization of the use of graphs to describe pairwise interactions, simplicial complexes can be used to model higher-order interactions between three or more objects in complex systems. There has been a recent surge in activity for the development of data analysis methods applicable to simplicial complexes, including techniques based on computational topology, higher-order random processes, generalized Cheeger inequalities, isoperimetric inequalities, and spectral methods. In particular, spectral learning methods (e.g. label propagation and clustering) that directly operate on simplicial complexes represent a new direction emerging from the confluence of computational topology and machine learning. Similar to the challenges faced by massive graphs, computational methods that process simplicial complexes are severely limited by computational costs associated with massive datasets. To apply spectral methods in learning to massive datasets modeled as simplicial complexes, we work towards the sparsification of simplicial complexes based on preserving the spectrum of the associated Laplacian operators. We show that the theory of Spielman and Srivastava for the sparsification of graphs extends to simplicial complexes via the up Laplacian. In particular, we introduce a generalized effective resistance for simplices; provide an algorithm for sparsifying simplicial complexes at a fixed dimension; and give a specific version of the generalized Cheeger inequality for weighted simplicial complexes. Finally, we demonstrate via experiments the preservation of the up Laplacian during sparsification, as well as the utility of sparsification with respect to spectral clustering.

1 Motivation

Our work towards spectral sparsification of simplicial complexes is primarily motivated by learning based on simplicial complexes.
Simplicial complexes capture higher-order interactions in complex systems, and there has been much recent activity in developing spectral theory for higher-order Laplacians as well as learning algorithms that operate on these Laplacians. We are interested in understanding the behavior of spectral learning algorithms on compact representations that preserve the spectral structure of data.

Simplicial complexes in data analysis. Understanding massive systems with complex interactions and multi-scale dynamics is important in a variety of social, biological, and technological settings. One approach to understanding such systems is to represent them as graphs where vertices represent objects and (weighted) edges represent pairwise interactions between the objects. A large arsenal of methods has now been developed to analyze properties of graphs, which can then be combined with domain-specific knowledge to infer properties of the system being studied. These tools include graph partitioning and clustering [42, 56, 57], random processes on graphs [25], graph distances, various measures of graph connectivity [41], combinatorial graph invariants [16], and spectral graph theory [12]. While graphs have been used with great success to describe pairwise interactions between objects in datasets, they fail to capture higher-order interactions that occur between three or more objects. These structures in data can be described using simplicial complexes [27, 38]. There has recently been a surge in activity for the development of data analysis methods that focus on simplicial complexes, including methods based on computational topology [8, 20, 23, 27], higher-order random processes [5, 26], generalized Cheeger inequalities [28, 54], isoperimetric inequalities [45], high-dimensional expanders [17, 35, 44], and spectral methods [29].
In particular, topological data analysis methods using simplicial complexes as the underlying combinatorial structures have been successfully employed for applications as diverse as the discovery of a new subtype of breast cancer [40], describing high-contrast patches in images [34], time series analysis [46], multi-channel communication [47] and sensor networks [13], statistical ranking [31, 43], and visualization [58]. Simplicial complexes have also been used to generalize graphical models in machine learning, where faces of dimension two or higher represent higher-order conditional dependence relations between random variables [19].

Sparsification of simplicial complexes. For unstructured graphs representing massive datasets, the computational costs associated with naïve implementations of many graph-based tools are prohibitive. In this scenario, it is useful to approximate the original graph with one having fewer edges or vertices while preserving some properties of interest in some appropriate metric, a process known as graph sparsification. A variety of graph sparsification methods have been developed, allowing for efficient storage of, and computation on, the sparsified graphs [3, 50, 51]. Our work is inspired by and based on the seminal work by Spielman and Srivastava [50]. Similar to the challenges faced by massive graphs, computational methods that operate on simplicial complexes are severely limited by computational costs associated with massive datasets. There have been recent approaches from computational topology to construct sparse simplicial complexes that give good approximation results for computing persistent homology [6, 7, 9, 11, 14, 15, 33, 49, 55]. Persistent homology [21] turns the algebraic concept of homology into a multiscale notion. It typically operates on a sequence of simplicial complexes (referred to as a filtration), constructs a series of homology groups and measures their relevant scales in the filtration.
Common simplicial filtrations arise from Čech or (Vietoris-)Rips complexes, and most of the aforementioned techniques produce sparsified Rips complexes that give guaranteed approximations to the persistent homology of the unsparsified filtration. The sparsification processes involve either the removal or subsampling of vertices, or edge contractions from the sparse filtration. While these techniques focus on the preservation of homological features in the data (referred to as homological sparsification), to the best of our knowledge, there are no known results regarding the higher-order spectral properties of the sparsified simplicial complexes. Our approach differs significantly from these techniques in that it focuses on spectral sparsification, which preserves higher-order spectral properties of the simplicial complexes instead of homological ones.

Spectral sparsification and machine learning. A recurring theme in machine learning is graph-based learning, where the data assume an underlying graph structure and one would like to infer information about the nodes of the graph. Spectral methods for graph-based learning, such as spectral clustering (e.g. [1, 52]), are central to many seminal papers in the field; they typically have good theoretical guarantees and efficient solutions for problems ranging from image segmentation [36] to community detection [2]. Some important instances of semi-supervised graph-based learning algorithms are often referred to as label propagation methods [30, 60], where labels on the nodes are propagated along the edges of the graph. Inspired by graph-based learning, learning (indirectly or directly) based on simplicial complexes represents a new direction recently emerging from the confluence of computational topology and machine learning.
On one hand, topological features derived from simplicial complexes, used as input to machine learning algorithms, have been shown to increase the strength of prediction or classification compared to graph-theoretic features [4, 59]. On the other hand, we would like to develop learning algorithms that directly operate on simplicial complexes. For example, researchers have begun to develop mathematical intuition behind higher-dimensional notions of spectral clustering and label propagation [37, 54, 57]. Our work on spectral sparsification of simplicial complexes would play an important role in applying spectral learning algorithms to massive datasets modeled as simplicial complexes.

Overview. In this paper, we work towards developing computational methods for the spectral sparsification of simplicial complexes, in particular:
• We introduce a generalized effective resistance of simplices by extending the notion of effective resistance of edges (e.g. [10, 18, 22]); see Section 3.
• We show that the algorithm in [50] for sparsifying graphs can be generalized to the sparsification of simplicial complexes at a fixed dimension, and prove that the spectrum of the up Laplacian is preserved under sparsification in the sense that the spectrum of the up Laplacian for the sparsified simplicial complex can be bounded in terms of the spectrum of the up Laplacian for the original simplicial complex; see Theorem 3.1.
• We generalize the Cheeger constant of Gundert and Szedlák for unweighted simplicial complexes [28] to weighted simplicial complexes and verify that the Cheeger inequality involving the first non-trivial eigenvalue of the weighted up Laplacian holds in the sparsified setting; see Proposition 4.1.
• Our theoretical results are supported by numerical experiments in Section 5. These experiments illustrate the inequalities bounding the spectrum of the up Laplacian of the sparsified simplicial complex, proven in Theorem 3.1.
Further experiments demonstrate that simplicial complex clusters, obtained via a method extending spectral clustering to simplicial complexes, are preserved by spectral sparsification. This application exemplifies the utility of the spectral sparsification methods developed here.

We proceed by reviewing background results and introducing notation in Section 2, which gives a brief description of relevant algebraic concepts, effective resistance, and spectral sparsification of graphs. The theory and algorithm for sparsifying simplicial complexes are provided in Section 3. We state the implications of the algorithm for a generalized Cheeger cut for the simplicial complex in Section 4. We showcase some experimental results validating our algorithms in Section 5. We conclude with some open questions in Section 6.

2 Background

Simplicial complexes. A simplicial complex $K$ is a finite collection of simplices such that every face of a simplex of $K$ is in $K$ and the intersection of any two simplices of $K$ is a face of each of them [38]. The 0-, 1- and 2-simplices correspond to vertices, edges and triangles. An oriented simplex is a simplex with a chosen ordering of its vertices. Let $S_p(K)$ denote the collection of all oriented $p$-simplices of $K$ and $n_p = |S_p(K)|$. The $p$-skeleton of $K$ is denoted as $K^{(p)} := \bigcup_{0 \le i \le p} S_i(K)$. For the remainder of this paper, let $K$ be an oriented simplicial complex on a vertex set $[n] = \{1, 2, \ldots, n\}$. Let $\dim K$ denote the dimension of $K$. For a review of simplicial complexes, see [24, 27, 38].

Laplace operators on simplicial complexes. The $i$-th chain group $C_i(K) = C_i(K, \mathbb{R})$ of a complex $K$ with coefficients in $\mathbb{R}$ is a vector space over the field $\mathbb{R}$ with basis $S_i(K)$. The $i$-th cochain group $C^i(K) = C^i(K, \mathbb{R})$ is the dual of the chain group, defined by $C^i(K) := \mathrm{Hom}(C_i(K), \mathbb{R})$, where $\mathrm{Hom}(C_i, \mathbb{R})$ denotes all homomorphisms of $C_i$ into $\mathbb{R}$. The coboundary operator, $\delta_i : C^i(K) \to C^{i+1}(K)$, is defined as

$$(\delta_i f)([v_0, \ldots, v_{i+1}]) = \sum_{j=0}^{i+1} (-1)^j f([v_0, \ldots, \hat{v}_j, \ldots, v_{i+1}]),$$

where $\hat{v}_j$ denotes that the vertex $v_j$ has been omitted. It satisfies the property $\delta_i \delta_{i-1} = 0$, which implies that $\mathrm{im}(\delta_{i-1}) \subset \ker(\delta_i)$. The boundary operators, $\delta_i^*$, are the adjoints of the coboundary operators,

$$\cdots \; C^{i+1}(K) \; \underset{\delta_i^*}{\overset{\delta_i}{\leftrightarrows}} \; C^{i}(K) \; \underset{\delta_{i-1}^*}{\overset{\delta_{i-1}}{\leftrightarrows}} \; C^{i-1}(K) \; \cdots$$

satisfying $(\delta_i a, b)_{C^{i+1}} = (a, \delta_i^* b)_{C^i}$ for every $a \in C^i(K)$ and $b \in C^{i+1}(K)$, where $(\cdot, \cdot)_{C^i}$ denotes the scalar product on the cochain group.

Following [29], we define three combinatorial Laplace operators that operate on $C^i(K)$ (for the $i$-th dimension). Namely, the up Laplacian, $L_i^{\mathrm{up}}(K) = \delta_i^* \delta_i$, the down Laplacian, $L_i^{\mathrm{down}}(K) = \delta_{i-1} \delta_{i-1}^*$, and the Laplacian, $L_i(K) = L_i^{\mathrm{up}}(K) + L_i^{\mathrm{down}}(K)$. All three operators are self-adjoint, non-negative, compact and enjoy a collection of spectral properties, as detailed in [29]. We restrict our attention to the up Laplacians.
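As a concrete illustration of the operators just defined, the following minimal sketch builds the coboundary matrices of a small toy complex and forms the Laplacians from them. It is not the paper's implementation: the toy complex, the helper name `coboundary_matrix`, and the use of the standard (unweighted) inner product, under which the adjoint $\delta_i^*$ is simply the matrix transpose, are all assumptions made here for illustration.

```python
import numpy as np


def coboundary_matrix(lower, upper):
    """Matrix of delta_i : C^i -> C^{i+1} (hypothetical helper).

    `lower` lists the oriented i-simplices and `upper` the oriented
    (i+1)-simplices, each as a tuple of vertices in increasing order.
    Rows index (i+1)-simplices, columns index i-simplices.
    """
    index = {simplex: k for k, simplex in enumerate(lower)}
    D = np.zeros((len(upper), len(lower)))
    for row, simplex in enumerate(upper):
        for j in range(len(simplex)):
            # Face obtained by omitting the j-th vertex, with sign (-1)^j.
            face = simplex[:j] + simplex[j + 1:]
            D[row, index[face]] = (-1) ** j
    return D


# Assumed toy complex: a filled triangle {1,2,3} plus a dangling edge {3,4}.
vertices = [(1,), (2,), (3,), (4,)]
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
triangles = [(1, 2, 3)]

d0 = coboundary_matrix(vertices, edges)    # delta_0 : C^0 -> C^1
d1 = coboundary_matrix(edges, triangles)   # delta_1 : C^1 -> C^2

# With the standard inner product, the adjoint is the transpose.
L0_up = d0.T @ d0          # up Laplacian on vertices (the graph Laplacian)
L1_up = d1.T @ d1          # up Laplacian on edges
L1_down = d0 @ d0.T        # down Laplacian on edges
L1 = L1_up + L1_down       # full Laplacian on edges

# The composition of consecutive coboundary operators vanishes.
assert np.allclose(d1 @ d0, 0.0)
```

In dimension 0 the up Laplacian reduces to the usual graph Laplacian, so its diagonal holds the vertex degrees. A weighted complex would replace the transpose by an adjoint taken with respect to weighted inner products, which this sketch does not attempt.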
Journal: CoRR
Volume: abs/1708.08436
Publication date: 2017