Search results for: wikipedia mining
Number of results: 92181
This is the first year that the IR group of Tsinghua University (THUIR) participates in NTCIR. We register for the INTENT task and focus on the Chinese topics of the subtopic mining and document ranking subtasks. In our experiments, we try to mine subtopics from different resources, namely query recommendation, Wikipedia, and the query-URL bipartite graph constructed from clickthrough data. We also develop...
We focus on measuring relationships between pairs of objects in Wikipedia, whose pages can be regarded as individual objects. Two kinds of relationships exist between a pair of objects in Wikipedia: an explicit relationship is represented by a single link between the two pages for the objects, and an imp...
We describe the refactoring process of the RelEx2Frame component of the OpenCog AGI Framework, a method for expanding concept variables used in RelEx, and the automatic generation of a common sense knowledge base specifically with relation to concept relationships. The well-known Drools rule engine is used instead of hand-coded rules; an asynchronous concurrent architecture and an indexing mechanism are ...
This paper presents the experimental study conducted over the INEX 2007 Document Mining Challenge corpus employing a frequent subtree-based incremental clustering approach. Using the structural information of the XML documents, the closed frequent subtrees are generated. A matrix is then developed representing the closed frequent subtree distribution in documents. This matrix is used to progres...
We propose an unsupervised feature generation algorithm using the repositories of human knowledge for effective text categorization. Conventional bag of words (BOW) depends on the presence / absence of keywords to classify the documents. To understand the actual context behind these keywords, we use knowledge concepts / hyperlinks from external knowledge sources through content and structure mi...
User-generated categories (UGCs) are short texts that reflect how people describe and organize entities, implicitly expressing rich semantic relations. While most methods for UGC relation extraction are based on pattern matching in English, learning relations from Chinese UGCs poses different challenges due to the flexibility of expressions. In this paper, we present a weakly super...
Geolocalized databases are becoming necessary in a wide variety of application domains. The manual creation of such databases is an expensive operation, which has stimulated interest in automating their construction by mining geographic information from the Web. In this article, we present and evaluate a new automated approach for creating a geographical database. Our technique uses Wik...
This paper introduces the freely available WikEd Error Corpus. We describe the data mining process from Wikipedia revision histories, corpus content and format. The corpus consists of more than 12 million sentences with a total of 14 million edits of various types. As one possible application, we show that WikEd can be successfully adapted to improve a strong baseline in a task of grammatical e...
Mining data from large plain-text collections enhances the retrieval of information from resources. This paper describes the automatic discovery of part-whole patterns from texts using knowledge. The parts are found by learning semantic constraints and linking documents to the knowledge base. Knowledge discovery in text is a potential method, which automatically extracts the concepts and conc...