Query Replication in Distributed Information Systems with Autonomous Participants
Authors
Abstract
We consider Distributed Information Systems with Autonomous Participants (DISAP), i.e., systems in which participants (consumers and providers) may have special interests towards queries and other participants. Recent applications of DISAP have emerged on the Internet to share data, services, or computing resources at an unprecedented scale (e.g., SETI@home). With autonomous participants, the only way to prevent a participant from voluntarily leaving the system is to satisfy its interests when allocating queries. However, participants’ satisfaction may also be badly affected by other participants’ failures or behavior. In this context, replicating queries is useful for addressing two different problems: tolerating providers’ failures and dealing with Byzantine providers. In this paper, we make the following main contributions. First, we formalize the query allocation problem over faulty participants in the context of DISAP. Second, we define participants’ satisfaction and a notion of global satisfaction, which considers both participants’ satisfaction and their probability of failure. Third, we propose a query replication algorithm, SbQR, which deals with participants’ failures by deciding on-line whether a query should be replicated and at which rate. Fourth, we propose another query replication algorithm, SbQR+, which generalizes SbQR with the goal of prioritizing critical queries. Finally, we implemented both algorithms and compared them to a popular baseline algorithm. The results demonstrate that our algorithms significantly outperform the baseline algorithm in terms of both performance and satisfaction. In particular, SbQR+ is excellent at choosing the queries that must be replicated to guarantee both participants’ satisfaction and good system performance.
Key-words: Distributed information systems, participants’ intentions, autonomous participants, participants’ satisfaction, probability of participants’ failure, query replication
inria-00383321, version 1 - 12 May 2009
Réplication de Requêtes dans les Systèmes d’Information Distribués avec des Participants Autonomes
Résumé (French abstract, in English translation): We consider distributed information systems whose participants are autonomous (DISAP), i.e., the participants (consumers and providers) may have particular interests in queries and in other participants. Recent applications of DISAP have emerged on the Internet with the aim of sharing data, services, or resources at a very large scale (e.g., SETI@home). With autonomous participants, the only way to keep a participant from leaving the system out of dissatisfaction is to satisfy its interests when allocating queries. However, participants’ satisfaction can be affected by the behavior or the failures of other participants. In this context, query replication is useful for addressing two different problems: tolerating participants’ failures and dealing with malicious (i.e., Byzantine) participants. In this article, we make the following contributions. First, we formalize the problem of allocating queries to failure-prone participants in the context of DISAP. Second, we define participants’ satisfaction in DISAP, together with a notion of global satisfaction that takes into account participants’ satisfaction and their probability of failure. Third, we propose SbQR, an algorithm that tolerates participants’ failures by deciding on the fly whether a query should be replicated and how many times. We also propose SbQR+, a query replication algorithm that generalizes SbQR with the aim of favoring queries that are critical for consumers. Finally, we implement both algorithms and compare them with the most widely used baseline algorithm. The results show that our algorithms perform much better in terms of both system performance and participants’ satisfaction.
Mots-clés (keywords): Distributed information systems, autonomous participants, participants’ intentions, participants’ satisfaction, probability of participants’ failure, query replication.
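The abstract says that replication is used to tolerate providers' failures and that the replication rate is decided on-line. As a rough illustration of why a rate matters, the sketch below computes the smallest number of replicas for which at least one provider is expected to answer with a given probability. This is not SbQR: the paper's algorithms also weigh participants' satisfaction, which is left out here. The function name `min_replicas`, its parameters, and the assumption that providers fail independently with the same probability are all hypothetical choices made for this example.

```python
def min_replicas(failure_prob: float, target_success: float) -> int:
    """Smallest k such that 1 - failure_prob**k >= target_success.

    Assumption (not from the paper): every provider fails independently
    with the same probability `failure_prob`.
    """
    if not 0.0 <= failure_prob < 1.0:
        raise ValueError("failure_prob must be in [0, 1)")
    if not 0.0 <= target_success < 1.0:
        raise ValueError("target_success must be in [0, 1)")

    k = 1
    # Probability that all k replicas fail is failure_prob**k, so the
    # probability that at least one succeeds is 1 - failure_prob**k.
    while 1.0 - failure_prob ** k < target_success:
        k += 1
    return k


if __name__ == "__main__":
    # With providers that fail half of the time, four replicas are
    # needed to reach a 90% chance of at least one answer.
    print(min_replicas(0.5, 0.9))  # 4
```

Under these assumptions, a query sent to reliable providers needs no extra copies, while one sent to unreliable providers needs several, which is the kind of per-query decision the abstract attributes to an on-line replication algorithm.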
Similar resources
A Satisfaction Balanced Query Allocation Process for Distributed Information Systems
We consider a distributed information system that allows autonomous consumers to query autonomous providers. We focus on the problem of query allocation from a new point of view, by considering consumers and providers’ satisfaction in addition to query load. We define satisfaction as a long-run notion based on the consumers and providers’ intentions. Intuitively, a participant should obtain goo...
SQLB: A Query Allocation Framework for Autonomous Consumers and Providers
In large-scale distributed information systems, where participants are autonomous and have special interests for some queries, query allocation is a challenge. Much work in this context has focused on distributing queries among providers in a way that maximizes overall performance (typically throughput and response time). However, preserving the participants’ interests is also important. In thi...
Data Sharing in DHT Based P2P Systems
The evolution of peer-to-peer (P2P) systems triggered the building of large scale distributed applications. The main application domain is data sharing across a very large number of highly autonomous participants. Building such data sharing systems is particularly challenging because of the “extreme” characteristics of P2P infrastructures: massive distribution, high churn rate, no global contro...
Semantic Loss in Query Reformulation in Dynamic Distributed Environments
Dynamic environments are decentralized systems that provide users with querying capabilities over a set of heterogeneous, distributed and autonomous data sources. Data Integration Systems, Peer Data Management Systems (PDMS) and Dataspaces are examples of such systems. They are composed of data sources (peers) that belong to a specific domain and are linked to each other by mappings (correspon...
Conjunctive Query Answering in Distributed Ontology Systems for Ontologies with Large OWL ABoxes
We present a query processing procedure for conjunctive queries in distributed ontology systems where a large ontology is divided into ontology fragments that are later distributed over a set of autonomous nodes. We focus on ontologies with large ABoxes. The query processing procedure determines and retrieves the facts that are relevant to answering a given query from other nodes, then construc...