Information filtering based on wiki index database

Authors

  • Alexander V. Smirnov
  • Andrew Krizhanovsky
Abstract

In this paper we present a profile-based approach to information filtering that analyzes the content of text documents. A Wikipedia index database is created and used to automatically generate the user profile from the user’s document collection. Problem-oriented Wikipedia subcorpora are then created for each topic of user interest, using knowledge extracted from the user profile. The index databases of these subcorpora are applied to filtering an information flow (e.g., mail, news), so that the analyzed texts are classified into the topics explicitly presented in the user profile. The paper concentrates on the indexing part of the approach. The architecture of an application implementing the Wikipedia indexing is described. The indexing method is evaluated using the Russian and Simple English Wikipedia.

Similar resources

Index wiki database: design and experiments

With the fantastic growth of Internet usage, information search in documents of a special type called a “wiki page”, which is written using a simple markup language, has become an important problem. This paper describes the software architectural model for indexing wiki texts in three languages (Russian, English, and German) and the interaction between the software components (GATE, Lemmatizer, a...

Towards an Inquiry-Based Language Learning: Can a Wiki Help?

Wiki use may help EFL instructors to create an effective learning environment for inquiry-based language teaching and learning. The purpose of this study was to investigate the effects of wikis on the EFL learners’ IBL process. Forty-nine EFL students participated in the study while they conducted research projects in English. The Non-wiki group (n = 25) received traditional inquiry instr...

Automatic Population and Updating of a Semantic Wiki-based Configuration Management Database

This paper describes our work on designing and implementing a component for automatically integrating and updating information about configuration items into a Semantic Wiki-based configuration management database. The presented solution uses technology for information gathering which is built-in or available for most current mainstream operating systems. By using Semantic Wiki technology, e.g....

A New Similarity Measure Based on Item Proximity and Closeness for Collaborative Filtering Recommendation

Recommender systems utilize information retrieval and machine learning techniques for filtering information and can predict whether a user would like an unseen item. User similarity measurement plays an important role in collaborative filtering based recommender systems. In order to improve accuracy of traditional user based collaborative filtering techniques under new user cold-start problem a...

Evaluating the status of agricultural articles of Iranian researchers at the Scopus citation database based on the Hirsch index

Background and aim: Today, the use of scientometric methods to evaluate the scientific outputs of researchers in various fields has been highly regarded and the Hirsch index (h-index) is one of the most important scientometric indices due to the simultaneous measurement of quantity and quality of scientific outputs. Therefore, the aim of this study was to evaluate the status of Iranian agricult...

Journal:
  • CoRR

Volume abs/0804.2354

Publication date 2008