English-to-Traditional Chinese Cross-lingual Link Discovery in Articles with Wikipedia Corpus
Abstract
In this paper, we design a processing flow that produces linked data in articles, providing additional information on anchor-based terms and related terms in a different language (English to Chinese). Wikipedia has become a very important corpus and knowledge bank. Although Wikipedia does not describe itself as a dictionary or encyclopedia, it has high potential value for applications and data mining research. Link discovery is a useful IR application that builds on data mining and NLP algorithms and has been used in several fields. Our experimental results show that this method improves the results.
Similar resources
Cross-lingual Link Discovery Based on CRF Model for NTCIR-10 CrossLink
This paper describes our participation in the Chinese-to-English (C2E) subtask of the NTCIR-10 Cross-lingual Link Discovery Task. The task focuses on creating suitable links between terms in Chinese, Japanese, or Korean Wikipedia articles and English Wikipedia articles. We propose a method for the Chinese-to-English subtask. The proposed method has two stages. We divide this task into “...
Cross-Lingual Link Discovery between Chinese and English Wiki Knowledge Bases
Wikipedia is an online multilingual encyclopedia that contains a very large number of articles covering most written languages. However, one critical issue for Wikipedia is that the pages in different languages are rarely linked except for the cross-lingual link between pages about the same subject. This could pose serious difficulties to humans and machines who try to seek information from dif...
Cross-Lingual Knowledge Discovery: Chinese-to-English Article Linking in Wikipedia
In this paper we examine automated Chinese-to-English link discovery in Wikipedia and the effects of Chinese segmentation and Chinese-to-English translation on hyperlink recommendation. Our experimental results show that the implemented link discovery framework can effectively recommend Chinese-to-English cross-lingual links. The techniques described here can assist bilingual users where a ...
Automated Cross-lingual Link Discovery in Wikipedia
At NTCIR-9, we participated in the cross-lingual link discovery (Crosslink) task. In this paper we describe our approaches to discovering Chinese, Japanese, and Korean (CJK) cross-lingual links for English documents in Wikipedia. Our experimental results show that a link mining approach that mines the existing link structure for anchor probabilities and relies on the “translation” using cross-l...
NTHU at NTCIR-10 CrossLink-2: An Approach toward Semantic Features
This paper describes the approaches of NTHU in the NTCIR-10 Cross-Lingual Link Discovery task, also named CrossLink-2. In this task, we aim to discover valuable anchors in Chinese, Japanese or Korean (CJK) articles and to link these anchors to related English Wikipedia pages. To achieve the objective, we do not only depend on Wikipedia’s distinguishing features (e.g. anchor links information an...