A Conceptual Graph Based Approach to Ontology Similarity Measure
Authors: M. Croitoru et al.
Abstract
This paper presents a combinatorial, structure-based approach to the problem of finding a (di)similarity measure between two Conceptual Graphs. With a growing number of ontologies and an increasing need for quick, on-the-fly knowledge integration and querying, ontology similarity measures are essential for building the foundations of the Semantic Web. Conceptual Graphs benefit from a graph-based representation that can be exploited in versatile optimisation techniques. We propose a dissimilarity measure based on the content and the structure of two graphs. This dissimilarity measure is based on the clique number of the matching graph, a combinatorial structure which encodes the projection information of the two graphs.
1 Motivations and Rationale
In this paper we present a structural, Conceptual Graph [11] based approach to the problem of finding ontology (di)similarity measures. Conceptual Graphs are a visual, logic-based knowledge representation formalism. The ontological knowledge is represented in the support, which is a poset of concept and relation hierarchies. The factual knowledge is represented in a bipartite graph whose two partition classes contain concept nodes and relation nodes, respectively, labelled with elements of the support. We propose a (di)similarity measure between two Conceptual Graphs which considers the inherent structural properties of the two graphs. More precisely, by considering both relation adjacency in the bipartite graph and the relation hierarchy in the support, we devise a combinatorial structure, the matching graph, which can then be used to derive a (di)similarity value.
Finding a (di)similarity measure between two Conceptual Graphs is an important problem in an information era where more and more ontologies are employed for powerful applications (e.g. the Semantic Web [2]). Conceptual Graphs offer easy plug-in capabilities, making it easy to employ and extend existing ontologies.
For intelligent knowledge-based applications it is important to be able to compare the represented knowledge by providing a “meaningful” (di)similarity measure. The aim of this paper is to lay the theoretical foundations for such a (di)similarity measure between two Conceptual Graphs. Future work aims at translating RDF [12] / OWL DL [13] ontologies into Conceptual Graphs and showing how the structural properties of this (di)similarity measure add extra benefit.
U. Priss, S. Polovina, and R. Hill (Eds.): ICCS 2007, LNAI 4604, pp. 154–164, 2007. © Springer-Verlag Berlin Heidelberg 2007
Since a large benchmark of Conceptual Graphs is still under development, our work is theoretical and is assessed on the merits of its novel, reasoning-founded approach to a (di)similarity measure. Indeed, existing work on Conceptual Graph comparison (for example [8], [10], [9]) does not address the structural properties that arise from two neighboring relation nodes, nor the inherent combinatorial properties that projection induces on such structural features.
This paper is structured as follows. In Section 2 we formally introduce Conceptual Graphs and the projection operator. The aim of this section is to present rigorous definitions that allow later sections to present rigorously our approach to projection checking and, subsequently, to (di)similarity measures. Section 3 presents the Matching Graph, a combinatorial structure for projection checking. This structure exploits both the structural interdependencies between the relation nodes in the bipartite graph and the relation type hierarchy in the support. This, together with the fact that projection as a reasoning mechanism is itself aimed at knowledge comparison, is the motivation of this work. Indeed, our claim is that Conceptual Graph comparison should consider not only semantically sound transformations but also the graph structure of the factual knowledge.
We believe that the Matching Graph, by its definition, is an effective tool to address this claim. The section finishes by presenting further optimizations which are exploited to construct the Reduced Matching Graph. As mentioned before, we employ projection checking as the foundation of our approach, since focusing on reasoning tools for Conceptual Graphs implicitly addresses knowledge comparison problems. This claim will allow us to define a structurally rigorous (di)similarity measure in Section 4, based on the clique number of the Reduced Matching Graph of the two graphs to be compared. Section 5 concludes the paper.
2 Conceptual Graphs
Some of the definitions in this section follow the work of [3]. Background knowledge for Conceptual Graphs is encoded in a structure called the support, which consists of a concept type hierarchy, a relation type hierarchy, a set of individual markers that refer to specific concepts, and a generic marker, denoted by $*$, which refers to an unspecified concept.
Definition 1 (Support). A support is a 4-tuple $S = (T_C, T_R, I, *)$ where:
– $T_C$ is a finite, partially ordered set (poset) of concept types $(T_C, \leq)$ that defines a type hierarchy where, $\forall x, y \in T_C$, $x \leq y$ means that $x$ is a subtype of $y$. The top element of this hierarchy is the universal type $\top_C$.
– $T_R$ is a finite set of relation types partitioned into $k$ posets $(T_R^i, \leq)_{i=1,\dots,k}$ of relation types of arity $i$ ($1 \leq i \leq k$), where $k$ is the maximum arity of a relation type in $T_R$. Each relation type of arity $i$, namely $r \in T_R^i$, has an associated signature $\sigma(r) \in \underbrace{T_C \times \dots \times T_C}_{i \text{ times}}$, which specifies the maximum concept type of each of its arguments. This means that if we use $r(x_1, \dots, x_i)$, then $x_j$ is a concept with $\mathrm{type}(x_j) \leq \sigma(r)_j$ ($1 \leq j \leq i$). The partial orders on relation types of the same arity must be signature-compatible, i.e. $\forall r_1, r_2 \in T_R^i$, $r_1 \leq r_2 \Rightarrow \sigma(r_1) \leq \sigma(r_2)$.
– $I$ is a countable set of individual markers.
– $*$ is the generic marker that refers to an unspecified concept.
– The sets $T_C$, $T_R$, $I$ and $\{*\}$ are mutually disjoint.
– $I \cup \{*\}$ is partially ordered by $x \leq y$ if and only if $x = y$ or $y = *$.
A conceptual graph is a structure that depicts factual information about the background knowledge contained in its support. This information is presented in a visual manner as an ordered bipartite graph, whose nodes have been labelled with elements from the support.
Definition 2. A simple conceptual graph (SCG) is a 3-tuple $SG = [S, G, \lambda]$, where:
– $S = (T_C, T_R, I, *)$ is a support;
– $G = (V_C, V_R; E_G, l)$ is an ordered bipartite graph;
– $\lambda$ is a labelling of the nodes of $G$ with elements from the support $S$: $\forall r \in V_R$, $\lambda(r) \in T_R^{d_G(r)}$; $\forall c \in V_C$, $\lambda(c) \in T_C \times (I \cup \{*\})$, such that if $c = N_G^i(r)$, $\lambda(r) = t_r$ and $\lambda(c) = (t_c, ref_c)$, then $t_c \leq \sigma_i(r)$. Here $d_G(r)$ is the degree of $r$ in $G$ and $N_G^i(r)$ is its $i$-th neighbor.
Conceptual graphs represent knowledge at a syntactic level. Projection (subsumption), a labelled graph homomorphism, is the main tool for reasoning with SCGs. It preserves the order of the neighbors in the two graphs and compares the types and labels of the concept and relation nodes. Projection corresponds to deduction in the existential, conjunctive and positive fragment of first-order logic [4]. In the following, when the support is implicit, we will just use a pair $(G, \lambda_G)$ to denote a simple conceptual graph $SG$.
Definition 3 (Projection). If $SG = (G, \lambda_G)$ and $SH = (H, \lambda_H)$ are two simple conceptual graphs defined on the same support $S$, then a projection from $SG$ to $SH$ is a mapping $\pi : V_C(G) \cup V_R(G) \to V_C(H) \cup V_R(H)$ such that:
– $\pi(V_C(G)) \subseteq V_C(H)$ and $\pi(V_R(G)) \subseteq V_R(H)$;
– $\forall c \in V_C(G)$ and $\forall r \in V_R(G)$, if $c = N_G^i(r)$ then $\pi(c) = N_H^i(\pi(r))$;
– $\forall v \in V_C(G) \cup V_R(G)$, $\lambda_G(v) \geq \lambda_H(\pi(v))$.
The order on $\lambda$ in the above definition preserves the order on $T_C$ ($T_R$) and considers the elements of $I$ mutually incomparable (as previously defined). If there is a projection from $SG$ to $SH$ (that is, $\Pi_{G \to H} \neq \emptyset$), then $SG$ subsumes $SH$ (denoted as $SG \geq SH$).
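Definition 3 can be checked directly on small graphs. The following is a minimal brute-force sketch in Python, not the method developed in the paper. It assumes an SCG is stored as a dictionary with `concepts` (node id to a `(type, marker)` label) and `relations` (node id to a `(type, ordered neighbors)` pair), and that each type order is given as a transitively closed set of `(subtype, supertype)` pairs. The names `leq`, `label_geq`, `find_projection` and the toy types `Person`, `Student`, `r`, `s` are all hypothetical.

```python
from itertools import product

def leq(order, x, y):
    """x <= y in a partial order given as a set of (lower, upper) pairs.
    Assumption: the set is transitively closed; reflexivity is added here."""
    return x == y or (x, y) in order

def label_geq(g_label, h_label, concept_order):
    """lambda_G(c) >= lambda_H(c'): compare the types and then the markers."""
    (t_g, m_g), (t_h, m_h) = g_label, h_label
    marker_ok = m_h == m_g or m_g == "*"  # x <= y iff x = y or y = *
    return leq(concept_order, t_h, t_g) and marker_ok

def find_projection(G, H, concept_order, relation_order):
    """Return one projection from G to H as a dict, or None if none exists."""
    g_concepts, h_concepts = list(G["concepts"]), list(H["concepts"])
    # Try every mapping of the concept nodes of G to concept nodes of H.
    for images in product(h_concepts, repeat=len(g_concepts)):
        pi = dict(zip(g_concepts, images))
        if not all(label_geq(G["concepts"][c], H["concepts"][pi[c]], concept_order)
                   for c in g_concepts):
            continue
        rel_images = {}
        for r, (t_r, neighbors) in G["relations"].items():
            # The image of r must have a smaller-or-equal type and exactly
            # the images of r's neighbors, in the same order.
            wanted = tuple(pi[c] for c in neighbors)
            s = next((s for s, (t_s, ns) in H["relations"].items()
                      if leq(relation_order, t_s, t_r) and ns == wanted), None)
            if s is None:
                break
            rel_images[r] = s
        else:
            return {**pi, **rel_images}
    return None

# Hypothetical toy support and graphs (not taken from the paper).
concept_order = {("Student", "Person")}
relation_order = {("s", "r")}
G = {"concepts": {"c1": ("Person", "*"), "c2": ("Person", "*"), "c3": ("Person", "*")},
     "relations": {"x": ("r", ("c1", "c2")), "y": ("r", ("c2", "c3"))}}
H = {"concepts": {"d1": ("Student", "John"), "d2": ("Person", "*"), "d3": ("Person", "Mary")},
     "relations": {"a": ("s", ("d1", "d2")), "b": ("s", ("d2", "d3"))}}

projection = find_projection(G, H, concept_order, relation_order)
```

On these toy graphs a projection from G to H exists, so G subsumes H, while no projection exists in the opposite direction. The search enumerates every concept mapping and is therefore exponential in the number of concept nodes of G, which is why it is only suitable for illustration.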
The subsumption relation is a pre-order on the set of all Simple Conceptual Graphs defined on the same support. This is the starting point of our research, which led us to look at projection optimisation techniques for implicitly finding Conceptual Graph (di)similarity measures.
3 Matching Graph
Let us consider two simple conceptual graphs, $G$ and $H$, for which we want to test whether $G \geq H$ holds. For each relation node $r$ of the graph $G$, let us consider the set $Candidates_0(r)$. This is the set of all relation nodes of $H$ onto which $r$ can be individually projected. The only criterion for a node to belong to $Candidates_0(r)$ is type compatibility with $r$. However, the projection candidates for two relation nodes $r$ and $r'$ also have to be compatible with respect to their shared neighbor concept nodes. More precisely, if $s \in Candidates_0(r)$ then the pair $(r, s)$ is an individual projection. Two individual projections $(r, s)$ and $(r', s')$ are compatible if the common neighbors of $r$ and $r'$ in $G$ are preserved (in the right order) by $s$ and $s'$ in $H$. If we define a graph having the individual projections as nodes and edges given by the compatibility of those nodes, then a clique in this graph ensures the overall compatibility of its members. Therefore we translate the projection checking of the two SCGs into finding a clique whose cardinality equals the number of relation nodes of the first graph [5].
Definition 4. Let $SG = (G, \lambda_G)$ and $SH = (H, \lambda_H)$ be two SCGs with no isolated concept vertices defined on the same support $S$. The matching graph of $SG$ and $SH$ is defined as the graph $M_{G \to H} = (V, E)$ where:
– $V \subseteq V_R(G) \times V_R(H)$ is the set of all pairs $(r, s)$ such that $r \in V_R(G)$, $s \in V_R(H)$, $\lambda_G(r) \geq \lambda_H(s)$; $\forall i \in \{1, \dots, d_G(r)\}$, $\lambda_G(N_G^i(r)) \geq \lambda_H(N_H^i(s))$; and $\forall i, j \in \{1, \dots, d_G(r)\}$, if $N_G^i(r) = N_G^j(r)$ then $N_H^i(s) = N_H^j(s)$.
– $E$ is the set of all 2-sets $\{(r, s), (r', s')\}$, where $r \neq r'$, $(r, s), (r', s') \in V$ and $N_H^i(s) = N_H^j(s')$ for all $i \in \{1, \dots, d_G(r)\}$ and $j \in \{1, \dots, d_G(r')\}$ such that $N_G^i(r) = N_G^j(r')$.
A clique in a graph $F$ is a set of mutually adjacent vertices. The maximum cardinality of a clique in $F$ is denoted as $\omega(F)$. The theorem below shows that if $SG$ and $SH$ are two simple conceptual graphs without isolated concept vertices, then the problem of finding a projection from $SG$ to $SH$ is equivalent to finding a maximum cardinality clique in $M_{G \to H}$.
Theorem 1. Let $SG = (G, \lambda_G)$ and $SH = (H, \lambda_H)$ be two simple conceptual graphs without isolated concept vertices defined on the same support $S$, and let $M_{G \to H} = (V, E)$ be their matching graph. There is a projection from $SG$ to $SH$ if and only if $\omega(M_{G \to H}) = |V_R(G)|$.
For a proof of this theorem see [6].
Let us consider the two Simple Conceptual Graphs depicted in Figure 1 and let us assume that the relation node types $r$ and $s$ are comparable: $r > s$. Figure 2 depicts the matching graph associated with the graphs $G$ and $H$. The columns associated with the relation nodes $x$ and $y$ represent their Candidates sets. Each candidate is a node in the matching graph. An edge is drawn if the two nodes are compatible with respect to their neighbor concept nodes.
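The construction in Definition 4 and the test in Theorem 1 can be illustrated on a small example. The sketch below is a hedged Python illustration rather than the authors' implementation. It assumes the same hypothetical encoding throughout: an SCG is a dictionary with `concepts` (node id to `(type, marker)`) and `relations` (node id to `(type, ordered neighbors)`), and each type order is a transitively closed set of `(subtype, supertype)` pairs. The clique number is computed by plain enumeration, which is exponential and stands in for the optimisation techniques the paper refers to. The toy graphs are invented and are not those of Figure 1.

```python
from itertools import combinations

def leq(order, x, y):
    """x <= y in a partial order given as a transitively closed set of pairs."""
    return x == y or (x, y) in order

def label_geq(g_label, h_label, concept_order):
    """lambda_G(c) >= lambda_H(c') on concept labels (type, marker)."""
    (t_g, m_g), (t_h, m_h) = g_label, h_label
    return leq(concept_order, t_h, t_g) and (m_h == m_g or m_g == "*")

def matching_graph(G, H, concept_order, relation_order):
    """Build (V, E) of the matching graph M_{G->H} as in Definition 4."""
    vertices = []
    for r, (t_r, nr) in G["relations"].items():
        for s, (t_s, ns) in H["relations"].items():
            if len(nr) != len(ns) or not leq(relation_order, t_s, t_r):
                continue
            if not all(label_geq(G["concepts"][a], H["concepts"][b], concept_order)
                       for a, b in zip(nr, ns)):
                continue
            # Repeated neighbors of r must stay repeated among the neighbors of s.
            if all(ns[i] == ns[j] for i in range(len(nr))
                   for j in range(len(nr)) if nr[i] == nr[j]):
                vertices.append((r, s))
    edges = set()
    for (r, s), (r2, s2) in combinations(vertices, 2):
        if r == r2:
            continue
        nr, nr2 = G["relations"][r][1], G["relations"][r2][1]
        ns, ns2 = H["relations"][s][1], H["relations"][s2][1]
        # Concept nodes shared by r and r2 must be shared by s and s2 in the same positions.
        if all(ns[i] == ns2[j] for i in range(len(nr))
               for j in range(len(nr2)) if nr[i] == nr2[j]):
            edges.add(frozenset([(r, s), (r2, s2)]))
    return vertices, edges

def clique_number(vertices, edges):
    """Brute force: the largest k for which some k-subset is pairwise adjacent."""
    for k in range(len(vertices), 0, -1):
        for subset in combinations(vertices, k):
            if all(frozenset(pair) in edges for pair in combinations(subset, 2)):
                return k
    return 0

# Hypothetical toy support and graphs (not taken from the paper).
concept_order = {("Student", "Person")}
relation_order = {("s", "r")}
G = {"concepts": {"c1": ("Person", "*"), "c2": ("Person", "*"), "c3": ("Person", "*")},
     "relations": {"x": ("r", ("c1", "c2")), "y": ("r", ("c2", "c3"))}}
H = {"concepts": {"d1": ("Student", "John"), "d2": ("Person", "*"), "d3": ("Person", "Mary")},
     "relations": {"a": ("s", ("d1", "d2")), "b": ("s", ("d2", "d3"))}}

V, E = matching_graph(G, H, concept_order, relation_order)
has_projection = clique_number(V, E) == len(G["relations"])
```

For these toy graphs the matching graph has four vertices and a single edge, between `("x", "a")` and `("y", "b")`. Its clique number is 2, which equals the number of relation nodes of G, so by Theorem 1 a projection from G to H exists.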
Similar resources
Document Clustering Based on Ontology and a Fuzzy Approach
Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...
Similarity From Conceptual Relations
The main focus of this paper is how to measure similarity in a content-based information retrieval environment. In the first part we define the information base, which is a generative framework where an ontology in combination with a concept language defines a set of well-formed concepts. Well-formed concepts is assumed to be the basis for an indexing of the information base in the sense that t...
Exploiting conceptual spaces for ontology integration
The widespread use of ontologies raises the need to integrate distinct conceptualisations. Whereas the symbolic approach of established representation standards – based on first-order logic (FOL) and syllogistic reasoning – does not implicitly represent semantic similarities, ontology mapping addresses this problem by aiming at establishing formal relations between a set of knowledge entities w...
An Executive Approach Based On the Production of Fuzzy Ontology Using the Semantic Web Rule Language Method (SWRL)
Today, the need to deal with ambiguous information in semantic web languages is increasing. Ontology is an important part of the W3C standards for the semantic web, used to define a conceptual standard vocabulary for the exchange of data between systems, the provision of reusable databases, and the facilitation of collaboration across multiple systems. However, classical ontology is not enough ...
Conceptual Graph Matching for Semantic Search
Semantic search becomes a research hotspot. The combined use of linguistic ontologies and structured semantic matching is one of the promising ways to improve both recall and precision. In this paper, we propose an approach for semantic search by matching conceptual graphs. The detailed definitions of semantic similarities between concepts, relations and conceptual graphs are given. According t...