Continuous Multi-Source Information Gathering and Classifi–

Author

  • R. Steinberger
Abstract

This paper describes a fully functional prototype of a multi-component system that allows users to retrieve, store and search documents from a variety of publicly available information sources, in a variety of languages, on any subject domain of interest to them. The system integrates a crawler, which selects and downloads potentially relevant documents; a cleaning tool, which removes irrelevant content such as advertisements from the retrieved documents; a document filter; a document classifier; and several other tools that extract meta-information such as titles and keywords from the texts. Unlike ad-hoc search engines, the system satisfies users' long-term information needs, since it continuously collects documents they may be interested in. The system was evaluated by comparing it with a traditional manual press clipping service, and it gave good results.
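
The chain of components named in the abstract (cleaning, filtering, classification, meta-information extraction) can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions, not the system described in the paper: the crawler is left out and documents are assumed to be already downloaded, the "[AD]" marker for advertisement lines is invented for illustration, and the keyword-overlap filter and classifier are hypothetical stand-ins for whatever methods the prototype actually uses. All names (Document, clean, is_relevant, classify, extract_meta, run_pipeline) are illustrative.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    raw: str                      # text as downloaded by the (omitted) crawler
    text: str = ""                # text after cleaning
    category: str = ""
    meta: dict = field(default_factory=dict)


def clean(doc: Document) -> Document:
    # Assumption: advertisement lines carry a hypothetical "[AD]" prefix.
    lines = [ln.strip() for ln in doc.raw.splitlines()]
    doc.text = "\n".join(ln for ln in lines if ln and not ln.startswith("[AD]"))
    return doc


def is_relevant(doc: Document, profile: set) -> bool:
    # Stand-in filter: keep the document if it shares a word with the user profile.
    words = set(re.findall(r"\w+", doc.text.lower()))
    return bool(words & profile)


def classify(doc: Document, categories: dict) -> Document:
    # Stand-in classifier: pick the category whose term set matches most words.
    words = re.findall(r"\w+", doc.text.lower())
    scores = {name: sum(w in terms for w in words) for name, terms in categories.items()}
    best = max(scores, key=scores.get, default="")
    doc.category = best if best and scores[best] > 0 else "unclassified"
    return doc


def extract_meta(doc: Document) -> Document:
    # Stand-in meta-information extraction: treat the first cleaned line as the title.
    doc.meta["title"] = doc.text.splitlines()[0] if doc.text else ""
    return doc


def run_pipeline(fetched: list, profile: set, categories: dict) -> list:
    stored = []
    for doc in fetched:
        doc = clean(doc)
        if not is_relevant(doc, profile):
            continue  # filtered out, never stored
        stored.append(extract_meta(classify(doc, categories)))
    return stored
```

In a continuously running setting, run_pipeline would be called on each new batch of downloaded documents, and the returned list would be appended to a searchable store.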


Similar resources

Continuous Foraging and Information Gathering in a Multi-Agent Team

We are interested in continuous foraging with multi-agent teams, where resources are replenished over time, and the goal is to maximize the rate of foraging. Existing algorithms for continuous foraging and area sweeping typically consider homogeneous agents. We are interested in heterogeneous teams, where agents have radically different capabilities. In particular, we consider two types of agen...


Deriving the Exact Cost Function for a Two-Level Inventory System with Information Sharing

In this paper we consider a two-level inventory system with one warehouse and one retailer with information exchange. Transportation times are constant and retailer faces independent Poisson demand. The retailer applies continuous review (R,Q)-policy. The supplier starts with m initial batches (of size Q), and places an order to an outside source immediately after the retailer’s inventory posit...


Research of Blind Signals Separation with Genetic Algorithm and Particle Swarm Optimization Based on Mutual Information

Blind source separation technique separates mixed signals blindly without any information on the mixing system. In this paper, we have used two evolutionary algorithms, namely, genetic algorithm and particle swarm optimization for blind source separation. In these techniques a novel fitness function that is based on the mutual information and high order statistics is proposed. In order to evalu...


Case-Based Reasoning in Support of Intelligence Analysis

Open source intelligence analysts routinely use the web as a source of information related to their specific taskings. Effective information gathering on the web, despite the progress of conventional search engines, is a complex activity requiring some planning, text processing, and interpretation of extracted data to find information relevant to a major intelligence task or subtask (Knoblock, ...


Multi-Agent Continuous Transportation with Online Balanced Partitioning

We introduce the concept of continuous transportation task to the context of multi-agent systems. A continuous transportation task is one in which a multi-agent team visits a number of fixed locations, picks up objects, and delivers them to a transportation hub. The goal is to maximize the rate of transportation while the objects are replenished over time. In this extended abstract, we present...




Publication date: 2002