Towards Knowledge-enriched Cross-Lingual Answer Validation
Authors
Abstract
Our baseline approach from 2012 includes three language-independent methods for the task of answer validation. All methods are based on a scoring mechanism that reflects the degree of similarity between the question-answer pairs and the supporting text. We evaluate the proposed methods using various string similarity metrics, such as exact matching, Levenshtein, Jaro, and Jaro-Winkler. In addition to this baseline approach, we take advantage of the multilingual QA4MRE dataset and devise an ensemble method that chooses the answer indicated as correct by the largest number of analyses of the individual translations. Finally, we present a language-augmented method that enriches the questions and answers with paraphrases obtained by means of machine translation. We show that all of the described approaches achieve a significant improvement over the random baseline, and that both majority voting and language augmentation lead to superior accuracy compared with the original method. However, the addition of some knowledge-based components in 2013, together with the complexity of the datasets, led to a decrease in overall accuracy for Bulgarian.
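The scoring and voting ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the similarity score is aggregated, so the token-level averaging, the whitespace tokenization, the tie-breaking rule, and all function names below (`score`, `pick_answer`, `majority_vote`) are assumptions made for illustration. Only normalized Levenshtein similarity is shown; Jaro or Jaro-Winkler could be swapped in for `token_similarity`.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def token_similarity(x: str, y: str) -> float:
    """Levenshtein distance normalized to a similarity in [0, 1]."""
    if not x and not y:
        return 1.0
    return 1.0 - levenshtein(x, y) / max(len(x), len(y))


def score(question: str, answer: str, support: str) -> float:
    """Assumed aggregation: average, over question+answer tokens,
    of the best similarity to any token of the supporting text."""
    support_tokens = support.lower().split()
    query_tokens = (question + " " + answer).lower().split()
    if not support_tokens or not query_tokens:
        return 0.0
    total = sum(max(token_similarity(q, s) for s in support_tokens)
                for q in query_tokens)
    return total / len(query_tokens)


def pick_answer(question: str, answers: list[str], support: str) -> int:
    """Index of the candidate answer with the highest score."""
    scores = [score(question, a, support) for a in answers]
    return max(range(len(answers)), key=scores.__getitem__)


def majority_vote(choices: list[int]) -> int:
    """Answer index chosen by the most per-language analyses.
    Ties go to the lowest index (an assumption, not from the paper)."""
    counts = Counter(choices)
    best = max(counts.values())
    return min(idx for idx, c in counts.items() if c == best)
```

In this sketch, `pick_answer` would be run once per translation of the same question, and the resulting indices passed to `majority_vote` to obtain the ensemble decision.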
Similar resources
Cross-Lingual Question Answering Using Common Semantic Space
With the advent of the Big Data concept, much attention has been paid to structuring this data and giving it semantics. Knowledge bases like DBPedia play an important role in achieving this goal. Question answering systems are a common approach to addressing the expressivity and usability of information extraction from knowledge bases. Recent research has focused only on monolingual QA systems while cross-li...
Boosting Cross-Lingual Knowledge Linking via Concept Annotation
Automatically discovering cross-lingual links (CLs) between wikis can largely enrich the cross-lingual knowledge and facilitate knowledge sharing across different languages. In most existing approaches for cross-lingual knowledge linking, the seed CLs and the inner link structures are two important factors for finding new CLs. When there are insufficient seed CLs and inner links, discovering ne...
Cross-Lingual Knowledge Validation Based Taxonomy Derivation from Heterogeneous Online Wikis
Creating knowledge bases based on the crowd-sourced wikis, like Wikipedia, has attracted significant research interest in the field of intelligent Web. However, the derived taxonomies usually contain many mistakenly imported taxonomic relations due to the difference between the user-generated subsumption relations and the semantic taxonomic relations. Current approaches to solving the problem s...
An evaluation framework for cross-lingual link discovery
Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case wit...
Analysis and Refinement of Cross-Lingual Entity Linking
In this paper we propose two novel approaches to enhance cross-lingual entity linking (CLEL). One is based on cross-lingual information networks, aligned based on monolingual information extraction, and the other uses topic modeling to ensure global consistency. We enhance a strong baseline system derived from a combination of state-of-the-art machine translation and monolingual entity linking ...