Continuous Multi-Source Information Gathering and Classifi–
Abstract
This paper describes a fully functional prototype of a multi-component system that lets users retrieve, store, and search documents from a variety of publicly available information sources, in a variety of languages, and on any subject domain of interest. The system integrates a crawler, which selects and downloads potentially relevant documents; a cleaning tool, which removes irrelevant content such as advertisements from the retrieved documents; a document filter; a document classifier; and several other tools that extract meta-information such as titles and keywords from the texts. Unlike ad hoc search engines, the system serves users' long-term information needs because it continuously collects documents they may be interested in. The system was evaluated against a traditional manual press clipping service and gave good results.
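The component chain named in the abstract (crawl, clean, filter, classify, extract meta-information) could be wired together roughly as in the sketch below. This is only an illustration under stated assumptions, not the paper's implementation: the `Document` type, every function name, the `[AD]` line marker, and the keyword-overlap heuristics are all hypothetical, and the crawler is represented simply by a list of already fetched documents.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical record for one downloaded document.
    url: str
    text: str
    meta: dict = field(default_factory=dict)


def clean(doc):
    # Assumed cleaning rule: drop lines marked as advertisements with "[AD]".
    kept = [ln for ln in doc.text.splitlines() if not ln.startswith("[AD]")]
    return Document(doc.url, "\n".join(kept), dict(doc.meta))


def is_relevant(doc, profile):
    # Assumed filter: keep a document if it mentions any term in the user's profile.
    words = set(re.findall(r"\w+", doc.text.lower()))
    return bool(words & profile)


def classify(doc, categories):
    # Assumed classifier: choose the category with the most keyword hits.
    words = re.findall(r"\w+", doc.text.lower())
    scores = {name: sum(w in kws for w in words) for name, kws in categories.items()}
    return max(scores, key=scores.get)


def extract_title(doc):
    # Assumed metadata extractor: the first non-empty line is taken as the title.
    for ln in doc.text.splitlines():
        if ln.strip():
            return ln.strip()
    return ""


def process(fetched, profile, categories):
    """Run fetched documents through clean -> filter -> classify -> metadata."""
    store = []
    for doc in fetched:
        doc = clean(doc)
        if not is_relevant(doc, profile):
            continue  # the filter discards documents outside the user's interests
        doc.meta["category"] = classify(doc, categories)
        doc.meta["title"] = extract_title(doc)
        store.append(doc)
    return store
```

Running `process` repeatedly over each new batch of fetched documents would approximate the continuous collection the abstract describes; a real system would replace each heuristic with a trained or rule-based component.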
Similar resources
Continuous Foraging and Information Gathering in a Multi-Agent Team
We are interested in continuous foraging with multi-agent teams, where resources are replenished over time, and the goal is to maximize the rate of foraging. Existing algorithms for continuous foraging and area sweeping typically consider homogeneous agents. We are interested in heterogeneous teams, where agents have radically different capabilities. In particular, we consider two types of agen...
Deriving the Exact Cost Function for a Two-Level Inventory System with Information Sharing
In this paper we consider a two-level inventory system with one warehouse and one retailer with information exchange. Transportation times are constant and retailer faces independent Poisson demand. The retailer applies continuous review (R,Q)-policy. The supplier starts with m initial batches (of size Q), and places an order to an outside source immediately after the retailer’s inventory posit...
Research of Blind Signals Separation with Genetic Algorithm and Particle Swarm Optimization Based on Mutual Information
Blind source separation technique separates mixed signals blindly without any information on the mixing system. In this paper, we have used two evolutionary algorithms, namely, genetic algorithm and particle swarm optimization for blind source separation. In these techniques a novel fitness function that is based on the mutual information and high order statistics is proposed. In order to evalu...
Case-Based Reasoning in Support of Intelligence Analysis
Open source intelligence analysts routinely use the web as a source of information related to their specific taskings. Effective information gathering on the web, despite the progress of conventional search engines, is a complex activity requiring some planning, text processing, and interpretation of extracted data to find information relevant to a major intelligence task or subtask (Knoblock, ...
Multi-Agent Continuous Transportation with Online Balanced Partitioning
We introduce the concept of continuous transportation task to the context of multi-agent systems. A continuous transportation task is one in which a multi-agent team visits a number of fixed locations, picks up objects, and delivers them to a transportation hub. The goal is to maximize the rate of transportation while the objects are replenished over time. In this extended abstract, we present...