Random Sampling from Pseudo-Ranked B+ Trees

Author

  • Gennady Antoshenkov
Abstract

In the past, two basic approaches for sampling from B+ trees have been suggested: sampling from ranked trees and acceptance/rejection sampling from non-ranked trees. The first approach requires the entire root-to-leaf path to be updated with each insertion and deletion. The second has no update overhead but incurs a high rejection rate for the compressed-key B+ trees commonly used in practice. In this paper we introduce a new sampling method based on pseudo-ranked B+ trees, which are B+ trees supplemented with information loosely describing the estimated rank limits. This new method exhibits a very small rejection rate while adding only a marginal tree update overhead. We also present comparative efficiency measurements of the different methods, run on production databases and on several prototype workload simulations.
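To make the two earlier approaches named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the `Node` class, the `count` field, the `max_fanout` parameter, and the demo tree are all hypothetical names chosen for this example, and the pseudo-ranked method itself is not reproduced because the abstract does not give its details.

```python
import random
from collections import Counter


class Node:
    """Toy B+ tree node (hypothetical layout): internal nodes hold children,
    leaves hold records, and all leaves sit at the same depth."""

    def __init__(self, children=None, records=None):
        self.children = children or []
        self.records = records or []
        # Exact rank information: number of records in this subtree.
        # A real ranked tree must maintain this on every insert and delete.
        self.count = len(self.records) + sum(c.count for c in self.children)


def sample_ranked(root, rng):
    """Ranked sampling: draw a rank uniformly, then descend using counts."""
    r = rng.randrange(root.count)
    node = root
    while node.children:
        for child in node.children:
            if r < child.count:
                node = child
                break
            r -= child.count
    return node.records[r]


def sample_accept_reject(root, max_fanout, rng):
    """Acceptance/rejection sampling on a tree without rank information.

    At every level a slot is drawn from range(max_fanout); drawing an empty
    slot rejects the attempt. Each record is reached with the same probability
    per attempt, so accepted samples are uniform. Assumes max_fanout bounds
    both the child count and the records per leaf.
    Returns (record, number_of_attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        node = root
        while node is not None and node.children:
            slot = rng.randrange(max_fanout)
            node = node.children[slot] if slot < len(node.children) else None
        if node is None:
            continue  # rejected at an internal node
        slot = rng.randrange(max_fanout)
        if slot < len(node.records):
            return node.records[slot], attempts


def build_demo_tree():
    """Small unevenly filled tree: 6 records spread over 3 leaves."""
    leaves = [Node(records=[1, 2, 3]), Node(records=[4]), Node(records=[5, 6])]
    return Node(children=leaves)


if __name__ == "__main__":
    rng = random.Random(0)
    root = build_demo_tree()
    ranked = Counter(sample_ranked(root, rng) for _ in range(6000))
    results = [sample_accept_reject(root, 3, rng) for _ in range(6000)]
    print("ranked counts:", sorted(ranked.items()))
    print("A/R counts:", sorted(Counter(rec for rec, _ in results).items()))
    print("mean attempts per sample:", sum(a for _, a in results) / len(results))
```

In this sketch the ranked sampler never rejects but depends on `count` being exact along the whole path, while the acceptance/rejection sampler needs no maintained counts and pays for that with repeated attempts whenever nodes are filled well below `max_fanout`.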


Similar resources

Random Sampling from B+ Trees

We consider the design and analysis of algorithms to retrieve simple random samples from databases. Specifically, we examine simple random sampling from B+ tree files. Existing methods of sampling from B+ trees require the use of auxiliary rank information in the nodes of the tree. Such modified B+ tree files are called “ranked B+ trees”. We compare sampling from ranked B+ tree files with new...


Ranked Set Sampling

This paper is intended to provide the reader with an introduction to ranked set sampling, a statistical technique for data collection that generally leads to more efficient estimators than competitors based on simple random samples. Methods for obtaining ranked set samples are described and the structural differences between ranked set samples and simple random samples are discussed. Properties...


A new family of Markov branching trees: the alpha-gamma model

We introduce a simple tree growth process that gives rise to a new two-parameter family of discrete fragmentation trees that extends Ford’s alpha model to multifurcating trees and includes the trees obtained by uniform sampling from Duquesne and Le Gall’s stable continuum random tree. We call these new trees the alpha-gamma trees. In this paper, we obtain their splitting rules, dislocation meas...


Random Sampling from Databases

Random Sampling from Databases, by Frank Olken. Doctor of Philosophy in Computer Science, University of California at Berkeley. Professor Michael Stonebraker, Chair. In this thesis I describe efficient methods of answering random sampling queries of relational databases, i.e., retrieving random samples of the results of relational queries. I begin with a discussion of the motivation for including samp...


Weibull-Bayesian Estimation Based on Maximum Ranked Set Sampling with Unequal Samples

A modification of ranked set sampling (RSS) called maximum ranked set sampling with unequal samples (MRSSU) is considered for the Bayesian estimation of the scale parameter α of the Weibull distribution. Under this method, we use the LINEX loss function and conjugate and Jeffreys prior distributions to derive the Bayesian estimate of α. In order to measure the efficiency of the obtained Bayesian estimates ...




Publication date: 1992