Search results for: wikipedia mining

Number of results: 92181

2017
Abbas Ghaddar Philippe Langlais

We revisit the idea of mining Wikipedia in order to generate named-entity annotations. We propose a new methodology that we applied to the English Wikipedia to build WiNER, a large, high-quality annotated corpus. We evaluate its usefulness on 6 NER tasks, comparing 4 popular state-of-the-art approaches. We show that LSTM-CRF is the approach that benefits the most from our corpus. We report imp...

2010
Minh Nghiem Quoc Keisuke Yokoi Yuichiroh Matsubayashi Akiko Aizawa

In this paper, we address the problem of discovering coreference relations between formulas and the surrounding text. The task is different from traditional coreference resolution because of the unique structure of the formulas. In this paper, we present an approach, which we call ‘CDF (Concept Description Formula)’, for mining coreference relations between formulas and the concepts that refer ...

2006
David Ahn Steven Schockaert Martine de Cock Etienne Kerre

We pursue two strategies for offline data collection for a temporal question answering system that uses both quantitative methods and fuzzy methods to reason about time and events. The first strategy extracts event descriptions from the structured year entries in the online encyclopedia Wikipedia, yielding clean quantitative temporal information about a range of events. The second strategy mine...

2018
Samer El Zant Katia Jaffres-Runser Dima Shepelyansky

Interactions between countries originate from diverse aspects such as geographic proximity, trade, socio-cultural habits, language, religions, etc. Geopolitics studies the influence of a country’s geographic space on its political power and its relationships with other countries. This work reveals the potential of Wikipedia mining for geopolitical study. Actually, Wikipedia offers solid knowled...

2012
Maria Teresa Pazienza

The collection of the specialized vocabulary of a particular domain (terminology) is an important initial step of creating formalized domain knowledge representations (ontologies). Terminology Extraction (TE) aims at automating this process by collecting the relevant domain vocabulary from existing lexical resources or collections of domain texts. In this chapter, the authors address the extrac...

2008
Kotaro Nakayama Takahiro Hara Shojiro Nishio

Wikipedia has become a huge phenomenon on the WWW. As a corpus for knowledge extraction, it has various impressive characteristics such as a huge number of articles, live updates, a dense link structure, brief link texts and URL identification for concepts. In our previous work, we proposed link structure mining algorithms to extract a huge-scale and accurate association thesaurus from Wikipedi...

2009
Yulan Yan Yutaka Matsuo Mitsuru Ishizuka

Linguistic-based methods and web mining-based methods are the two leading types of methods for the semantic relation extraction task. By integrating linguistic analysis with frequent Web information, this paper presents an unsupervised relation extraction approach for discovering and enhancing relations in which a specified concept participates. We focus on concepts described in Wikipedia articles. By...

2015
May Sabai Han

Information retrieval is used to find the subset of relevant documents within a set of documents. Determining semantic similarity between two terms is a crucial problem in Web Mining for such applications as information retrieval systems and recommender systems. Semantic similarity refers to how alike two terms are in meaning or semantic content. Recently many te...

2015
Yu Zhao Zhiyuan Liu Maosong Sun

Incorporating multiple types of relational information from heterogeneous networks has proved effective in data mining. Although Wikipedia is one of the most famous heterogeneous networks, previous work on semantic analysis of Wikipedia is mostly limited to a single type of relation. In this paper, we aim at incorporating multiple types of relations to measure the semantic relatedness betw...

2014
Paul Laufer Claudia Wagner Fabian Flock Markus Strohmaier

For many people, Wikipedia represents one of the primary sources of knowledge about foreign cultures. Yet, different Wikipedia language editions offer different descriptions of cultural practices. Unveiling diverging representations of cultures provides an important insight, since they may foster the formation of cross-cultural stereotypes, misunderstandings and potentially even conflict. In th...
