Search results for: data sampling
Number of results: 2525723
We give near-optimal distributions for the sparsification of large m × n matrices, where m ≪ n, for example representing n observations over m attributes. Our algorithms can be applied when the non-zero entries are only available as a stream, i.e., in arbitrary order, and result in matrices which are not only sparse, but whose values are also highly compressible. In particular, algebraic operatio...
OBJECTIVES: To explore classification rules, based on data mining methodologies, for defining strata in stratified sampling of healthcare providers with improved sampling efficiency. METHODS: We performed k-means clustering to group providers with similar characteristics, then constructed decision trees on cluster labels to generate stratification rules. We assessed the varia...
Proper data selection for training a speech recognizer can be important for reducing costs of developing systems on new tasks and exploratory experiments, but it is also useful for efficient leveraging of the increasingly large speech resources available for training large vocabulary systems. In this work, we investigate various sampling methods, comparing the likelihood criterion to new acoust...
As computers and scientific instruments become more complicated and more powerful (Moore’s Law), we can perform astronomical observations never before contemplated. As larger data volumes are acquired, as more complex instruments are designed, and as observatories are placed in distant space locations with constrained downlink capacity, the need for automated, robust image processing tools will...
We present a new method for sampling the volume rendering integral in volume raycasting where samples are correlated based on transfer function content and data set values. This has two major advantages. First, visual artifacts stemming from structured noise, such as wood grain, can be reduced. Second, we will show that the volume data no longer needs to be available during the rendering ph...
In this paper, we focus on energy disaggregation at low sampling rates (6 s and 1 min) and use only active power measurements for training and testing. Specifically, we develop two algorithms: one is a low-complexity, supervised approach based on Decision Trees and the other is an unsupervised method based on Dynamic Time Warping. Both proposed algorithms share common pre-classification steps....
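The abstract above names Dynamic Time Warping (DTW) as the basis of its unsupervised method but does not describe the algorithm itself. As a rough illustration only, the textbook DTW distance between two power traces can be sketched as follows. The function names, the `signatures` dictionary and the sample values are hypothetical and are not taken from the paper.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also aligned with an earlier element of b
                cost[i][j - 1],      # b[j-1] also aligned with an earlier element of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def closest_signature(window, signatures):
    """Return the name of the (hypothetical) appliance signature nearest to window."""
    return min(signatures, key=lambda name: dtw_distance(window, signatures[name]))
```

Because DTW allows one sample to align with several samples of the other trace, two events of different duration can still match closely, which is one plausible reason to use it on coarsely sampled power data.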
In this paper the author discusses how sampling access and recruitment problems encountered in an in-depth interview study heightened her sensitivity to “borderline illegitimate” data. The term illegitimate data usually refers to the data collected during a covert study, whereas “legitimate” data are collected during an overt study. Hence, data collected during any nonconsented period(s) of an ...
Missing variable models are typical benchmarks for new computational techniques in that the ill-posed nature of missing variable models offers a challenging testing ground for these techniques. This was the case for the EM algorithm and the Gibbs sampler, and this is also true for importance sampling schemes. A population Monte Carlo scheme taking advantage of the latent structure of the problem...
Traditional classification algorithms often perform poorly on imbalanced data sets in which some classes are heavily outnumbered by the remaining classes. For this kind of data, minority class instances, which are usually of much greater interest, are often misclassified. The paper proposes a method to deal with them by changing class distribution through oversampling at the borderline b...
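The abstract above is truncated, so the paper's exact procedure is not available here. The sketch below shows one common way to oversample near a class boundary, in the style of Borderline-SMOTE: find minority points whose nearest neighbours are mostly (but not entirely) majority points, then interpolate between those points and nearby minority points. Treat it as an assumption about the general idea rather than the paper's method; all names and parameters are illustrative.

```python
import math
import random


def _nearest(point, pool, k, skip=None):
    """Return the k entries of pool closest to point, ignoring index `skip`."""
    candidates = [entry for i, entry in enumerate(pool) if i != skip]
    return sorted(candidates, key=lambda entry: math.dist(point, entry[0]))[:k]


def borderline_oversample(minority, majority, k=5, n_new=10, rng=None):
    """Create n_new synthetic minority points near the class boundary."""
    rng = rng or random.Random()
    # Label 0 = minority, 1 = majority. Minority points come first, so index i
    # refers to the same point in `minority`, `minority_pool` and `labelled`.
    minority_pool = [(p, 0) for p in minority]
    labelled = minority_pool + [(p, 1) for p in majority]

    # "Danger" points: at least half, but not all, of the neighbours are majority.
    danger = []
    for i, p in enumerate(minority):
        neighbours = _nearest(p, labelled, k, skip=i)
        n_major = sum(label for _, label in neighbours)
        if neighbours and len(neighbours) / 2 <= n_major < len(neighbours):
            danger.append(i)

    synthetic = []
    if not danger or len(minority) < 2:
        return synthetic
    for _ in range(n_new):
        i = rng.choice(danger)
        p = minority[i]
        # Interpolate towards a randomly chosen nearby minority point.
        q = rng.choice(_nearest(p, minority_pool, k, skip=i))[0]
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic
```

Points surrounded only by majority neighbours are skipped here on the assumption that they are noise, and points surrounded only by minority neighbours are skipped because they are already far from the boundary.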
In this work, we present a comprehensive treatment of weighted random sampling (WRS) over data streams. More precisely, we examine two natural interpretations of the item weights, describe an existing algorithm for each case ([2,4]), discuss sampling with and without replacement and show adaptations of the algorithms for several WRS problems and evolving data streams.
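The abstract above does not say which algorithms its citations refer to. One widely used scheme for weighted sampling without replacement over a stream is the key-based reservoir method usually attributed to Efraimidis and Spirakis: each item receives the key u^(1/w) for a uniform random u and weight w, and the k items with the largest keys are kept. The sketch below assumes that scheme; the function name and the `(item, weight)` input format are illustrative choices.

```python
import heapq
import random


def weighted_reservoir_sample(stream, k, rng=None):
    """Draw up to k items without replacement from (item, weight) pairs in one pass."""
    rng = rng or random.Random()
    heap = []  # min-heap of (key, index, item); index breaks ties between equal keys
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # items with non-positive weight are never selected
        key = rng.random() ** (1.0 / weight)
        entry = (key, index, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # The new key beats the smallest key currently in the reservoir.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]
```

The stream is read once and only k entries are held in memory, so the total number of items does not need to be known in advance.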