Multilingual Knowledge Graph Embeddings for Cross-lingual Knowledge Alignment

نویسندگان

  • Muhao Chen
  • Yingtao Tian
  • Mohan Yang
  • Carlo Zaniolo
چکیده

Many recent works have demonstrated the benefits of knowledge graph embeddings in completing monolingual knowledge graphs. Inasmuch as related knowledge bases are built in several different languages, achieving cross-lingual knowledge alignment will help people in constructing a coherent knowledge base, and assist machines in dealing with different expressions of entity relationships across diverse human languages. Unfortunately, achieving this highly desirable crosslingual alignment by human labor is very costly and error-prone. Thus, we propose MTransE, a translation-based model for multilingual knowledge graph embeddings, to provide a simple and automated solution. By encoding entities and relations of each language in a separated embedding space, MTransE provides transitions for each embedding vector to its cross-lingual counterparts in other spaces, while preserving the functionalities of monolingual embeddings. We deploy three different techniques to represent cross-lingual transitions, namely axis calibration, translation vectors, and linear transformations, and derive five variants for MTransE using different loss functions. Our models can be trained on partially aligned graphs, where just a small portion of triples are aligned with their cross-lingual counterparts. The experiments on cross-lingual entity matching and triple-wise alignment verification show promising results, with some variants consistently outperforming others on different tasks. We also explore how MTransE preserves the key properties of its monolingual counterpart TransE.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ConceptNet at SemEval-2017 Task 2: Extending Word Embeddings with Multilingual Relational Knowledge

This paper describes Luminoso’s participation in SemEval 2017 Task 2, “Multilingual and Cross-lingual Semantic Word Similarity”, with a system based on ConceptNet. ConceptNet is an open, multilingual knowledge graph that focuses on general knowledge that relates the meanings of words and phrases. Our submission to SemEval was an update of previous work that builds high-quality, multilingual wor...

متن کامل

Improving Neural Knowledge Base Completion with Cross-Lingual Projections

In this paper we present a cross-lingual extension of a neural tensor network model for knowledge base completion. We exploit multilingual synsets from BabelNet to translate English triples to other languages and then augment the reference knowledge base with cross-lingual triples. We project monolingual embeddings of different languages to a shared multilingual space and use them for network i...

متن کامل

Bilingual emb e ddings with random walks over multilingual wordnets

Bilingual word embeddings represent words of two languages in the same space, and allow to transfer knowledge from one language to the other without machine translation. The main approach is to train monolingual embeddings first and then map them using bilingual dictionaries. In this work, we present a novel method to learn bilingual embeddings based on multilingual knowledge bases (KB) such as...

متن کامل

Learning Knowledge Graph Embeddings for Natural Language Processing

Knowledge graph embeddings provide powerful latent semantic representation for the structured knowledge in knowledge graphs, which have been introduced recently. Being different from the already widely-used word embeddings that are conceived from plain text, knowledge graph embeddings enable direct explicit relational inferences among entities via simple calculation of embedding vectors. In par...

متن کامل

Cross-Lingual Word Representations via Spectral Graph Embeddings

Cross-lingual word embeddings are used for cross-lingual information retrieval or domain adaptations. In this paper, we extend Eigenwords, spectral monolingual word embeddings based on canonical correlation analysis (CCA), to crosslingual settings with sentence-alignment. For incorporating cross-lingual information, CCA is replaced with its generalization based on the spectral graph embeddings....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017