Search results for: wikipedia mining

Number of results: 92181

2006
Tien Tran Richi Nayak

This paper reports the results and experiments performed on the INEX 2006 Document Mining Challenge Corpus with the PCXSS clustering method. The PCXSS method is a progressive clustering method that computes the similarity between a new XML document and existing clusters by considering the structures within documents. We conducted the clustering task on the INEX and Wikipedia data sets.

2008

Wikipedia has been applied as a background knowledge base to various text mining problems, including document categorization, topic indexing and information extraction. However, very few attempts have been made to utilize it for document clustering. In this paper we propose to exploit Wikipedia and the semantic knowledge therein to facilitate clustering, enabling the automatic grouping of docum...

2014
Miao Fan Qiang Zhou Thomas Fang Zheng

This paper focuses on an emerging research topic: mining microbloggers’ personalized interest tags from the microblogs they have posted. It is based on the intuition that microblogs indicate the daily interests and concerns of microbloggers. Previous studies regarded the microblogs posted by one microblogger as a whole document and adopted traditional keyword extraction approaches to select high...

2016
Anna Lisa Gentile Sabrina Kirstein Heiko Paulheim Christian Bizer

Analysts are increasingly confronted with the situation that data which they need for a data mining project exists somewhere on the Web or in an organization’s intranet but they are unable to find it. The data mining tools that are currently available on the market offer a wide range of powerful data mining methods but hardly support analysts in searching for suitable data as well as in integra...

2010
Yulan Yan

With the advent of the Web and the explosion of available textual data, interest in techniques that let machines understand unstructured text has been growing. Recent attention to mapping textual content into a structured knowledge base by automatically harvesting semantic relations from unstructured text has encouraged Data Mining and Natural Language Processing researchers to develop algorithm...

2009
Anja Pilz Gerhard Paass

One major problem in text mining and semantic retrieval is that detected entity mentions have to be assigned to the true underlying entity. The ambiguity of a name results from both the polysemy and synonymy problem, as the name of a unique entity may be written in variant ways and different unique entities may have the same name. The term “bush” for instance may refer to a woody plant, a mecha...

Journal: CoRR 2016
Klaus M. Frahm Samer El Zant Katia Jaffrès-Runser Dima Shepelyansky

Geopolitics focuses on political power in relation to geographic space. Interactions among world countries have been widely studied at various scales, observing economic exchanges, world history or international politics among others. This work exhibits the potential of Wikipedia mining for such studies. Indeed, Wikipedia stores valuable fine-grained dependencies among countries by linking webpa...

Journal: J. UCS 2012
Jong Wook Kim Ashwin Kashyap Sandilya Bhamidipati

Proper representation of the meaning of texts is crucial for enhancing many data mining and information retrieval tasks, including clustering, computing semantic relatedness between texts, and searching. Representing texts in the concept space derived from Wikipedia has received growing attention recently. This concept-based representation is capable of extracting semantic relatedness betwee...

Journal: J. Inf. Sci. Eng. 2013
Wen-Teng Yang Hung-Yu Kao

Identifying the semantic relatedness of two words is an important task for information retrieval, natural language processing, and text mining. However, due to the diversity of meanings of a word, the semantic relatedness of two words is still hard to evaluate precisely with limited corpora. Wikipedia is now a huge wiki-based encyclopedia on the internet that has become a...
