All Sampling Methods Produce Outliers

Abstract

Given a computable probability measure $P$ over natural numbers or infinite binary sequences, there is no computable, randomized method that can produce an arbitrarily large sample such that none of its members are outliers of $P$. In addition, given a predicate $\gamma$, the length of the smallest program that computes a complete extension of $\gamma$ is less than the size of the domain of $\gamma$ plus the amount of information that $\gamma$ has with the halting sequence.
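
The second claim of the abstract can be sketched as an inequality. The notation here is an assumption made for illustration and is not taken from the abstract: $K$ denotes the length of the smallest program computing an object, $\mathrm{Dom}(\gamma)$ the finite domain of $\gamma$, $\mathcal{H}$ the halting sequence, and $I(\gamma ; \mathcal{H})$ the information that $\gamma$ has with it. The symbol $\lesssim$ stands in for whatever lower-order terms the full statement carries, which the abstract does not give.

$$
\min \{\, K(\eta) : \eta \text{ is a complete extension of } \gamma \,\} \;\lesssim\; |\mathrm{Dom}(\gamma)| + I(\gamma ; \mathcal{H})
$$

Read this way, a predicate whose every complete extension is hard to describe must either have a large domain or carry a large amount of information about the halting sequence.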

Similar resources

All Sampling Methods Produce Outliers

Given a computable probability measure P over natural numbers or infinite binary sequences, there is no method that can produce an arbitrarily large sample such that all its members are typical of P . This paper also contains upper bounds on the minimal encoding length of a predicate (over the set of natural numbers) consistent with another predicate over a finite domain.

Appendix A Methods for Identifying Data Outliers

Only extremely large rates are flagged, not extremely small ones, because only large values will have a major influence on statistics involving pounds of pesticide use. What value to use for the maximum rate in each criterion is somewhat arbitrary; the value determines how conservative one wants to be. We chose maximum rates to be close to what were considered obvious outliers by a group of sci...

Detecting sampling outliers and sampling heterogeneity when catch-at-length is estimated using the ratio estimator

Measuring fish on board fishing vessels or at fish markets to collect data for stock assessment purposes is one of the most straightforward actions carried out by fisheries scientists worldwide. However, such samples are not straightforward to handle and analyse because of their vector-type structure. A generic tool that allows investigation in any multinomial-like sampling scheme is provided, ...

Sampling Methods for ILP

This paper is concerned with problems that arise when submitting large quantities of data to analysis by an Inductive Logic Programming (ILP) system. Complexity arguments usually make it prohibitive to analyse such datasets in their entirety. We examine two schemes that allow an ILP system to construct theories by sampling from this large pool of data. The first, "subsampling", is a single-sample...

Journal

Journal title: IEEE Transactions on Information Theory

Year: 2021

ISSN: 0018-9448, 1557-9654

DOI: https://doi.org/10.1109/tit.2021.3109779