Search results for: wikipedia mining
Number of results: 92181
This paper proposes a statistical methodology for mining Wikipedia to discover characteristics associated with life outcomes. The methodology is demonstrated using first names and childhood environment. By comparing over 35,000 Wikipedia biographies against spatially and temporally matched census data, we show that individuals with rare names are twice as likely to appear in Wikipedia (RR 2.43...
This paper describes the generation of temporally anchored infobox attribute data from the Wikipedia history of revisions. By mining (attribute, value) pairs from the revision history of the English Wikipedia we are able to collect a comprehensive knowledge base that contains data on how attributes change over time. When dealing with the Wikipedia edit history, vandalic and erroneous edits are ...
Wikipedia categories are a useful source of knowledge that is usually expressed in a noun-phrase that contains information about concepts of entities or relations among entities. In DBpedia KBs, they categorize their entities into Wikipedia categories using RDF triples. The RDF triples represent only categories of entities, but not concepts of entities or relations among entities despite the fa...
We consider the problem of visualizing and exploring a dataset about research publications from the fields of Learning Analytics (LA) and Educational Data Mining (EDM). Our approach is based on semantic annotation that associates publications from the dataset with Wikipedia topics. We present a visualization and exploration tool, called Paperista (www.uzrok.com/paperista), which presents these ...
Wikipedia-centric Knowledge Bases (KBs) such as YAGO and DBpedia store the hyperlinks between articles in Wikipedia using wikilink relations. While wikilinks are signals of semantic connection between entities, the meaning of such connection is most of the times unknown to KBs, e.g., for 89% of wikilinks in DBpedia no other relation between the entities is known. The task of discovering the exa...
There are several semantic sources that can be found in the Web that are either explicit, e.g. Wikipedia, or implicit, e.g. derived from Web usage data. Most of them are related to user generated content (UGC) or what is called today the Web 2.0. In this talk we show several applications of mining the wisdom of crowds behind UGC to improve search. We will show live demos to find relations in the ...
Whether knowingly or otherwise, Wikipedia contributors reveal their interests and expertise through their contribution patterns. An analysis of Wikipedia edit histories shows that it is often possible to associate contributors with relatively small geographic regions, usually corresponding to where they were born or where they presently live. For many contributors, the geographic coordinates of...
This paper describes a multi-lingual concept network obtained automatically by mining for concepts and relations and exploiting a variety of sources of knowledge from Wikipedia. Concepts and their lexicalizations are extracted from Wikipedia pages. Relations are extracted from the category and page network, infoboxes and the body of the articles. The network consists of a central, language inde...
A well-recognized limitation of research on supervised sentence compression is the dearth of available training data. We propose a new and bountiful resource for such training data, which we obtain by mining the revision history of Wikipedia for sentence compressions and expansions. Using only a fraction of the available Wikipedia data, we have collected a training corpus of over 380,000 senten...
In this paper, we describe our approach to the Wikipedia Participation Challenge which aims to predict the number of edits a Wikipedia editor will make in the next 5 months. The best submission from our team, “zeditor”, achieved 41.7% improvement over WMF’s baseline predictive model and the final rank of 3rd place among 96 teams. An interesting characteristic of our approach is that only tempor...