On the Sample Complexity of Compressed Counting
Abstract
The problem of “scaling up for high dimensional data and high speed data streams” is among the “ten challenging problems in data mining research” [36]. This paper is devoted to estimating the entropy of data streams. Mining data streams [19, 4, 1, 29] in databases at the scale of, e.g., 100 TB has become an important area of research [10, 1], as network data can easily reach that scale [36]. Search engines are a typical source of data streams [4]. Consider the Turnstile stream model [29]. The input stream a_t = (i_t, I_t), i_t ∈ [1, D], arriving sequentially, describes the underlying signal A, meaning...
Similar resources
A New Algorithm for Compressed Counting with Applications in Shannon Entropy Estimation in Dynamic Data
Efficient estimation of the moments and Shannon entropy of data streams is an important task in modern machine learning and data mining. To estimate the Shannon entropy, it suffices to accurately estimate the α-th moment with ∆ = |1 − α| ≈ 0. To guarantee that the error of estimated Shannon entropy is within a ν-additive factor, the method of symmetric stable random projections requires O ( 1 ν...
Compressed Domain Scene Change Detection Based on Transform Units Distribution in High Efficiency Video Coding Standard
Scene change detection plays an important role in a number of video applications, including video indexing, searching, browsing, semantic features extraction, and, in general, pre-processing and post-processing operations. Several scene change detection methods have been proposed in different coding standards. Most of them use fixed thresholds for the similarity metrics to determine if there wa...
On Practical Algorithms for Entropy Estimation and the Improved Sample Complexity of Compressed Counting
The long-standing problem of Shannon entropy estimation in data streams (assuming the strict Turnstile model) becomes an easy task with the technique proposed in this paper. Essentially, in order to estimate the Shannon entropy with a guaranteed ν-additive accuracy, it suffices to estimate the α-th frequency moment, where α = 1 − ∆, with a guaranteed ε-multiplicative accuracy,...
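The link between the α-th frequency moment and Shannon entropy described above can be illustrated with a minimal Python sketch. It is not the Compressed Counting estimator: it stores the full count vector and computes the moments exactly, whereas a streaming method would estimate them from a small sketch. It assumes the standard Tsallis entropy formula, T_α = (1 − F_α / F_1^α) / (α − 1), which tends to the Shannon entropy as α → 1. All function names and the toy stream are hypothetical and chosen for illustration only.

```python
import math
from collections import defaultdict


def apply_turnstile(stream):
    """Accumulate (index, increment) updates into a count vector.

    Assumes the strict Turnstile model: increments may be negative,
    but every final count must be non-negative.
    """
    counts = defaultdict(int)
    for index, increment in stream:
        counts[index] += increment
    if any(c < 0 for c in counts.values()):
        raise ValueError("strict Turnstile model requires non-negative counts")
    # Drop zero entries so they do not enter the logarithm below.
    return {i: c for i, c in counts.items() if c > 0}


def frequency_moment(counts, alpha):
    """Exact alpha-th frequency moment: sum of counts raised to alpha."""
    return sum(c ** alpha for c in counts.values())


def shannon_entropy(counts):
    """Exact Shannon entropy (natural log) of the normalized counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def tsallis_entropy(counts, alpha):
    """Tsallis entropy computed from the first and alpha-th moments."""
    f1 = frequency_moment(counts, 1.0)
    f_alpha = frequency_moment(counts, alpha)
    return (1.0 - f_alpha / f1 ** alpha) / (alpha - 1.0)


# Hypothetical toy stream of (index, increment) pairs.
stream = [(1, 5), (2, 3), (3, 2), (1, -1), (4, 6), (2, 1)]
counts = apply_turnstile(stream)

exact = shannon_entropy(counts)
for delta in (0.1, 0.01, 0.001):
    approx = tsallis_entropy(counts, 1.0 - delta)
    print(f"delta={delta}: approx={approx:.5f}, error={abs(approx - exact):.5f}")
```

In this toy run the gap between the Tsallis value and the exact Shannon entropy shrinks as ∆ decreases, which is why an accurate estimate of the α-th moment near α = 1 is enough for entropy estimation. The division by α − 1 also magnifies any error in the moment estimate, which is consistent with the abstract's requirement of a guaranteed multiplicative accuracy on the moment.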
Post-Operative Time Effects after Sciatic Nerve Crush on the Number of Alpha Motoneurons, Using a Sterological Counting Method (Disector)
There is extensive evidence that axonal processes of the nervous system (peripheral and/or central) may degenerate after nerve injuries. Wallerian degeneration and chromatolysis are the most conspicuous phenomena that occur in response to injuries. In this research, the effects of post-operative time following sciatic nerve crush on the number of spinal motoneurons were investigated....
Measurement of radium micro-precipitates using alpha spectrometry and total alpha counting methods
Background: This study consists of two parts. The first part deals with both qualitative and quantitative analysis of 226Ra using the alpha spectrometry measurement method. In the second part, the percentage of radioactive equilibrium between 226Ra and its daughter products was determined by alpha spectrometry and a total alpha measurement system 15 days after precipitation. Ma...
Journal title:
- CoRR
Volume: abs/0910.1403
Publication date: 2009