Search results for: data sampling
Number of results: 2525723
This paper presents a data-driven cluster sampling framework for parsing scene images into generic regions (such as the sky, mountain and water) and objects (such as cows, horses and cars). We adopt generative models for both generic regions and objects, thus their likelihood probabilities are comparable and are learned under a common information projection principle. The inference algorithm fo...
We consider the problem of selecting non-zero entries of a matrix $A$ in order to produce a sparse sketch of it, $B$, that minimizes $\|A - B\|_2$. For large $m \times n$ matrices, such that $n \gg m$ (for example, representing $n$ observations over $m$ attributes) we give sampling distributions that exhibit four important properties. First, they have closed forms computable from minimal information regarding $A$. Second, they...
Importance sampling algorithms are discussed in detail, with an emphasis on implicit sampling, and applied to data assimilation via particle filters. Implicit sampling makes it possible to use the data to find high-probability samples at relatively low cost, making the assimilation more efficient. A new analysis of the feasibility of data assimilation is presented, showing in detail why feasibi...
We consider estimation of arbitrary range partitioning of data values and ranking of frequently occurring items based on random sampling, within a small number of samplings and with prescribed accuracy. These problems arise in the context of parallel processing of massive datasets, e.g. performed in data centers of Internet-scale cloud services and large-scale scientific computations. The range partit...
The reduction of the number of samples is a key issue in signal processing for mobile applications. We investigate the link between the smoothness properties of a signal and the number of samples that can be obtained through a level crossing sampling procedure. The algorithm is analyzed and an upper bound of the number of samples is obtained in the worst case. The theoretical results are illust...
We consider estimation of quantiles when data are generated from ranked set sampling. A new estimator is proposed and is shown to have a smaller asymptotic variance for all distributions. It is also shown that the optimal sampling strategy is to select observations with one fixed rank from different ranked sets. Both the optimal rank and the relative efficiency gain with respect to simple rando...
Reservoir sampling is a well-known technique for random sampling over data streams. In many streaming applications, however, an input stream may be naturally heterogeneous, i.e., composed of substreams whose statistical properties may also vary considerably. For this class of applications, the conventional reservoir sampling technique does not guarantee a statistically sufficient number of tupl...
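For readers unfamiliar with the conventional technique this abstract refers to, the following is a minimal sketch of classic single-reservoir sampling (often called Algorithm R). It is not the method proposed in the paper; the function name `reservoir_sample` and its parameters are illustrative assumptions. It keeps a uniform random sample of `k` items from a stream of unknown length, and it makes no attempt to balance substreams, which is the limitation the abstract points out.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable.

    Hypothetical helper for illustration: classic reservoir sampling
    (Algorithm R), not the heterogeneous-stream variant in the abstract.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives with
            # probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 items from a stream of 1000 integers.
    print(reservoir_sample(range(1000), 5, random.Random(0)))
```

Because every item is treated identically, a rare substream may contribute few or no items to the reservoir, which is why a heterogeneous stream calls for a different allocation strategy.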
Various approaches to extend bagging ensembles for class imbalanced data are considered. First, we review known extensions and compare them in a comprehensive experimental study. The results show that integrating bagging with under-sampling is more powerful than over-sampling. They also allow us to distinguish Roughly Balanced Bagging as the most accurate extension. Then, we point out that complex...