Supporting Approximate Similarity Queries with Quality Guarantees in P2P Systems
Abstract
In this paper we study how to support similarity queries in peer-to-peer (P2P) systems. Such queries ask for the most relevant objects in a P2P network, where the relevance is based on a predefined similarity function; the user is interested in obtaining objects with the highest relevance. Retrieving all objects and computing the exact answer over a large-scale network is impractical. We propose a novel approximate answering framework which computes an answer by visiting only a subset of network peers. Users are presented with progressively refined answers consisting of the best objects seen so far, together with continuously improving quality guarantees providing feedback about the progress of the search. We develop statistical techniques to determine quality guarantees in this framework. We propose mechanisms to incorporate quality estimators into the search process. Our work makes it possible to implement similarity search as a new method of accessing data from a P2P network, and shows how this can be achieved efficiently.
Similar resources
Supporting Filename Partial Matches in Structured Peer-to-Peer Overlay
In recent years, research issues associated with peer-to-peer (P2P) systems have been discussed widely. To resolve the file-availability problem and improve the workload, a method called the Distributed Hash Table (DHT) has been proposed. However, DHT-based systems in structured architectures cannot support efficient queries, such as a similarity query, range query, and partial-match query, due...
Flexible Information Discovery with Guarantees in Decentralized Distributed Systems
Dissertation by Cristina Simona Schmidt. Dissertation Director: Professor Manish Parashar. Recent years have seen increasing interest in Peer-to-Peer (P2P) information sharing environments. The P2P computing paradigm enables entities at the edges of the network to directly interact as equals (or peers) and ...
Efficient Processing of XPath Queries with Structured Overlay Networks
Non-trivial search predicates beyond mere equality are a current focus of P2P research. Structured queries, an important type of non-trivial search, have so far been studied extensively, mainly for unstructured P2P systems. As unstructured P2P systems do not use indexing, structured queries are very easy to implement since they can be treated like any other type of query. However, ...
Processing Skyline Queries in P2P Systems
Efficient query processing in P2P systems poses a variety of challenges mainly resulting from the strict decentralization and limited knowledge. Particularly with regard to queries involving ranking, top-N or skylines, existing approaches for centralized systems cannot be applied easily to P2P environments. In this paper, we focus on the problem of efficiently processing skyline queries in larg...
An Efficient Architecture for Information Retrieval in P2P Context Using Hypergraph
Peer-to-peer (P2P) data-sharing systems now generate a significant portion of Internet traffic. P2P systems have emerged as an accepted way to share enormous volumes of data. The need for widely distributed information systems supporting virtual organizations has given rise to a new category of P2P systems called schema-based. In such systems each peer is a database management system in itself, e...