Random Sampling from Database Files: A Survey

Authors

  • Frank Olken
  • Doron Rotem

Abstract

In this paper we survey known results on algorithms, data structures, and some applications of random sampling from databases. We first discuss various reasons for sampling from databases and for including sampling as a DBMS operator. We then consider basic sampling algorithms, sampling from trees, sampling from hash tables, and auxiliary memory-resident index information that facilitates sampling.
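
The abstract does not name the basic sampling algorithms it covers. As one well-known illustration of the kind of algorithm involved, the sketch below shows reservoir sampling (Algorithm R), which draws a simple random sample of k records in a single sequential pass over a file whose length is not known in advance. The function name `reservoir_sample` and its parameters are chosen here for illustration and are not taken from the paper.

```python
import random


def reservoir_sample(records, k, rng=None):
    """Return a simple random sample of up to k items from an iterable.

    Illustrative sketch of reservoir sampling (Algorithm R); the name and
    signature are assumptions, not the paper's notation.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Record i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = record
    return reservoir


if __name__ == "__main__":
    # Sample 5 record identifiers from a stream of 1,000.
    print(reservoir_sample(range(1000), 5, random.Random(42)))
```

A single pass suits sequential files, but it reads every record. Avoiding that full scan is one reason to consider index structures such as trees and hash tables for sampling.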

Similar resources

Random Sampling from B+ Trees

We consider the design and analysis of algorithms to retrieve simple random samples from databases. Specifically, we examine simple random sampling from B+ tree files. Existing methods of sampling from B+ trees require the use of auxiliary rank information in the nodes of the tree. Such modified B+ tree files are called “ranked B+ trees”. We compare sampling from ranked B+ tree files with new...

Random Sampling from Databases - A Survey

This paper reviews recent literature on techniques for obtaining random samples from databases. We begin with a discussion of why one would want to include sampling facilities in database management systems. We then review basic sampling techniques used in construct-join are then described. We then describe sampling for estimation of aggregates (e.g., the size of query results). Here we discuss...

Poster 2016: The effect and value of sublingual immunotherapy: a patient survey

Methods: A survey was sent to a random sample of 1,400 patients obtained from the AAOL newsletter database of 4,500 patients. The 20-question survey assessed patient demographics, perceived value of treatment, medication use, health and utilization ratings, compliance, school/work attendance, hospitalizations and unplanned physician visits, and health-related measures such as energy, sleep, and e...

A Bayesian Nominal Regression Model with Random Effects for Analysing Tehran Labor Force Survey Data

Large survey data sets are often accompanied by sampling weights that reflect the unequal selection probabilities arising in complex sampling designs. Sampling weights act as expansion factors that, by scaling the subjects, make the sample representative of the population. The quasi-maximum likelihood method is one of the approaches for considering sampling weights in the frequentist framewo...

Sampling design for an integrated socioeconomic and ecological survey by using satellite remote sensing and ordination.

Environmental variability is an important risk factor in rural agricultural communities. Testing models requires empirical sampling that generates data that are representative in both economic and ecological domains. Detrended correspondence analysis of satellite remote sensing data was used to design an effective low-cost sampling protocol for a field study to create an integrated socioeconom...

Publication date: 1990