Faster Sublinear Algorithms using Conditional Sampling

نویسندگان

  • Themis Gouleakis
  • Christos Tzamos
  • Manolis Zampetakis
چکیده

A conditional sampling oracle for a probability distribution D returns samples from the conditional distribution of D restricted to a specified subset of the domain. A recent line of work [CFGM13, CRS14] has shown that having access to such a conditional sampling oracle requires only polylogarithmic or even constant number of samples to solve distribution testing problems like identity and uniformity. This significantly improves over the standard sampling model where polynomially many samples are necessary. Inspired by these results, we introduce a computational model based on conditional sampling to develop sublinear algorithms with exponentially faster runtimes compared to standard sublinear algorithms. We focus on geometric optimization problems over points in high dimensional Euclidean space. Access to these points is provided via a conditional sampling oracle that takes as input a succinct representation of a subset of the domain and outputs a uniformly random point in that subset. We study two well studied problems: k-means clustering and estimating the weight of the minimum spanning tree. In contrast to prior algorithms for the classic model, our algorithms have time, space and sample complexity that is polynomial in the dimension and polylogarithmic in the number of points. Finally, we comment on the applicability of the model and compare with existing ones like streaming, parallel and distributed computational models.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Faster Algorithms for Testing under Conditional Sampling

There has been considerable recent interest in distribution-tests whose run-time and sample requirements are sublinear in the domain-size k. We study two of the most important tests under the conditional-sampling model where each query specifies a subset S of the domain, and the response is a sample drawn from S according to the underlying distribution. For identity testing, which asks whether ...

متن کامل

Sublinear Time Orthogonal Tensor Decomposition

A recent work (Wang et. al., NIPS 2015) gives the fastest known algorithms for orthogonal tensor decomposition with provable guarantees. Their algorithm is based on computing sketches of the input tensor, which requires reading the entire input. We show in a number of cases one can achieve the same theoretical guarantees in sublinear time, i.e., even without reading most of the input tensor. In...

متن کامل

Constructing Sublinear Expectations on Path Space

We provide a general construction of time-consistent sublinear expectations on the space of continuous paths. It yields the existence of the conditional G-expectation of a Borel-measurable (rather than quasi-continuous) random variable, a generalization of the random Gexpectation, and an optional sampling theorem that holds without exceptional set. Our results also shed light on the inherent li...

متن کامل

The sparse fourier transform : theory & practice

The Fourier transform is one of the most fundamental tools for computing the frequency representation of signals. It plays a central role in signal processing, communications, audio and video compression, medical imaging, genomics, astronomy, as well as many other areas. Because of its widespread use, fast algorithms for computing the Fourier transform can benefit a large number of applications...

متن کامل

Sublinear Algorithms in the External Memory Model

We initiate the study of sublinear-time algorithms in the external memory model [Vit01]. In this model, the data is stored in blocks of a certain size B, and the algorithm is charged a unit cost for each block access. This model is well-studied, since it reflects the computational issues occurring when the (massive) input is stored on a disk. Since each block access operates on B data elements ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017