wikipedia mining

Search results for: wikipedia mining

Number of results: 92181

THUIR at NTCIR-9 INTENT Task

2011

Yufei Xue Fei Chen Tong Zhu Chao Wang Zhichao Li Yiqun Liu Min Zhang Yijiang Jin Shaoping Ma

This is the first year that the IR group of Tsinghua University (THUIR) has participated in NTCIR. We registered for the INTENT task and focus on the Chinese topics of the subtopic mining and document ranking subtasks. In our experiments, we try to mine subtopics from different resources, namely query recommendation, Wikipedia, and the query-URL bipartite graph constructed from clickthrough data. We also develop...


Wikipedia at school and university – future teachers about Wikipedia

Journal: Osvitolohiya 2015


A Generalized Flow-Based Method for Research on Acted Relationships in Wikipedia

2014

Prudhvi Raj

We focus on measuring relationships between pairs of objects in Wikipedia, whose pages can be regarded as individual objects. Two kinds of relationships exist between a pair of objects in Wikipedia: an explicit relationship is represented by a single link between the two objects' pages, and an imp...


Mapping Dependency Relationships into Semantic Frame Relationships

2013

N. H. N. D. de Silva C. S. N. J. Fernando M. K. D. T. Maldeniya D. N. C. Wijeratne A. S. Perera B. Goertzel

We describe the refactoring process of the RelEx2Frame component of the OpenCog AGI Framework, a method for expanding concept variables used in RelEx, and the automatic generation of a common-sense knowledge base, specifically with relation to concept relationships. The well-known Drools rule engine is used instead of hand-coded rules; an asynchronous concurrent architecture and an indexing mechanism are ...


Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach

2007

Sangeetha Kutty Tien Tran Richi Nayak Yuefeng Li

This paper presents the experimental study conducted over the INEX 2007 Document Mining Challenge corpus employing a frequent subtree-based incremental clustering approach. Using the structural information of the XML documents, the closed frequent subtrees are generated. A matrix is then developed representing the closed frequent subtree distribution in documents. This matrix is used to progres...


Unsupervised Feature Generation using Knowledge Repositories for Effective Text Categorization

2010

R. Rajendra Prasath Sudeshna Sarkar

We propose an unsupervised feature generation algorithm using the repositories of human knowledge for effective text categorization. Conventional bag of words (BOW) depends on the presence / absence of keywords to classify the documents. To understand the actual context behind these keywords, we use knowledge concepts / hyperlinks from external knowledge sources through content and structure mi...


Learning Fine-grained Relations from Chinese User Generated Categories

2017

Chengyu Wang Yan Fan Xiaofeng He Aoying Zhou

User generated categories (UGCs) are short texts that reflect how people describe and organize entities, expressing rich semantic relations implicitly. While most methods on UGC relation extraction are based on pattern matching in English circumstances, learning relations from Chinese UGCs poses different challenges due to the flexibility of expressions. In this paper, we present a weakly super...


Extraction des connaissances à partir du Web pour la recherche des images géoréférencées

2009

Houda Bouamor

Geolocalized databases are becoming necessary in a wide variety of application domains. Creating such databases manually is expensive, which has stimulated interest in automating their construction by mining geographic information from the Web. In this article, we present and evaluate a new automated approach for creating a geographical database. Our technique uses Wik...


The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction

2014

Roman Grundkiewicz Marcin Junczys-Dowmunt

This paper introduces the freely available WikEd Error Corpus. We describe the data mining process from Wikipedia revision histories, corpus content and format. The corpus consists of more than 12 million sentences with a total of 14 million edits of various types. As one possible application, we show that WikEd can be successfully adapted to improve a strong baseline in a task of grammatical e...


A Survey on Discovery of Part-Whole Relations with Knowledge-Base

2014

Claudia Reynolds G. Naveen Sundar

Mining data from large plain-text collections enhances the retrieval of information from resources. This paper describes the automatic discovery of part-whole patterns from texts using a knowledge base. The parts are found by learning semantic constraints and linking documents to the knowledge base. Knowledge discovery in text is a potential method that automatically extracts the concepts and conc...

