Learning Mixtures of Gaussians
Abstract
Mixtures of Gaussians are among the most fundamental and widely used statistical models. Current techniques for learning such mixtures from data are local search heuristics with weak performance guarantees. We present the first provably correct algorithm for learning a mixture of Gaussians. This algorithm is very simple and returns the true centers of the Gaussians to within the precision specified by the user, with high probability. It runs in time only linear in the dimension of the data and polynomial in the number of Gaussians.
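As a rough illustration of the kind of pipeline the abstract describes (dimension reduction, clustering, then reading off centers), here is a minimal Python sketch. It is not the paper's algorithm: the choice of projection dimension, the Lloyd-style clustering step, and all function names below are our own simplifying assumptions.

```python
# Illustrative sketch only, NOT the paper's exact algorithm.
import numpy as np

def learn_mixture_centers(X, k, proj_dim=None, n_iter=20, seed=0):
    """Estimate the k Gaussian centers from samples X of shape (n, d)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Project to a few random dimensions: distances between well-separated
    # centers are roughly preserved, which is what keeps the work in the
    # original space linear in d. (Hypothetical choice of projection dim.)
    p = proj_dim or max(1, int(np.ceil(np.log2(k + 1))))
    P = rng.standard_normal((d, p)) / np.sqrt(p)
    Y = X @ P
    # Lloyd-style clustering in the projected space, as a stand-in for the
    # paper's clustering step.
    centers = Y[rng.choice(n, k, replace=False)]
    for _ in range(n_iter):
        labels = ((Y[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    # Read off each center in the original space as the mean of its cluster.
    return np.stack([X[labels == j].mean(axis=0) if np.any(labels == j)
                     else X[rng.integers(n)] for j in range(k)])
```

On well-separated mixtures this kind of pipeline recovers the centers approximately; the paper's contribution is the analysis that turns such a scheme into a provable guarantee.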
Similar Papers
PAC Learning Mixtures of Axis-Aligned Gaussians with No Separation Assumption
We propose and analyze a new vantage point for the learning of mixtures of Gaussians: namely, the PAC-style model of learning probability distributions introduced by Kearns et al. [13]. Here the task is to construct a hypothesis mixture of Gaussians that is statistically indistinguishable from the actual mixture generating the data; specifically, the KL divergence should be at most ε. In this s...
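The success criterion shared by these PAC-style results can be checked empirically: draw samples from the true mixture and estimate KL(true ‖ hypothesis) by Monte Carlo. The sketch below does this for axis-aligned (diagonal-covariance) mixtures; it is our own illustration with hypothetical helper names, not code from the papers.

```python
# Hedged illustration of the "KL divergence at most eps" success criterion.
import numpy as np

def mixture_logpdf(X, weights, means, variances):
    """Log-density of an axis-aligned Gaussian mixture at the rows of X."""
    # Per-component log N(x; mu, diag(var)), summed over coordinates.
    comp = (-0.5 * ((X[:, None, :] - means[None]) ** 2 / variances[None]
                    + np.log(2 * np.pi * variances[None]))).sum(axis=2)
    comp += np.log(weights)[None]
    m = comp.max(axis=1, keepdims=True)          # log-sum-exp for stability
    return (m + np.log(np.exp(comp - m).sum(axis=1, keepdims=True))).ravel()

def kl_estimate(true_params, hyp_params, n=100_000, seed=0):
    """Monte Carlo estimate of KL(true || hypothesis)."""
    rng = np.random.default_rng(seed)
    w, mu, var = true_params                      # shapes (k,), (k,d), (k,d)
    ks = rng.choice(len(w), size=n, p=w)          # sample component labels,
    X = rng.normal(mu[ks], np.sqrt(var[ks]))      # then Gaussian samples
    return np.mean(mixture_logpdf(X, *true_params)
                   - mixture_logpdf(X, *hyp_params))
```

A hypothesis mixture passes the PAC test of these papers when this estimate, up to sampling error, is at most the target ε.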
PAC Learning Mixtures of Gaussians with No Separation Assumption
We propose and analyze a new vantage point for the learning of mixtures of Gaussians: namely, the PAC-style model of learning probability distributions introduced by Kearns et al. [12]. Here the task is to construct a hypothesis mixture of Gaussians that is statistically indistinguishable from the actual mixture generating the data; specifically, the KL divergence should be at most ε. In this s...
PAC Learning Axis-Aligned Mixtures of Gaussians with No Separation Assumption
We propose and analyze a new vantage point for the learning of mixtures of Gaussians: namely, the PAC-style model of learning probability distributions introduced by Kearns et al. [12]. Here the task is to construct a hypothesis mixture of Gaussians that is statistically indistinguishable from the actual mixture generating the data; specifically, the KL divergence should be at most ε...
A Probabilistic Analysis of EM for Mixtures of Separated, Spherical Gaussians
We show that, given data from a mixture of k well-separated spherical Gaussians in R^d, a simple two-round variant of EM will, with high probability, learn the parameters of the Gaussians to near-optimal precision, if the dimension is high (d ≫ ln k). We relate this to previous theoretical and empirical work on the EM algorithm.
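As a hedged illustration of the setting, here is plain EM for spherical Gaussians run for a fixed two rounds. It is not the paper's exact two-round variant (which differs in how candidate centers are seeded and pruned); the initialization and names below are our own assumptions.

```python
# Plain EM for spherical Gaussians, limited to two rounds; illustrative only.
import numpy as np

def em_spherical(X, k, n_rounds=2, seed=0):
    """Run n_rounds of EM; returns (weights, means, per-component variances)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, k, replace=False)]   # initialize at data points
    sigma2 = np.full(k, X.var())                 # one variance per component
    weights = np.full(k, 1.0 / k)
    for _ in range(n_rounds):
        # E-step: responsibilities under spherical Gaussians N(mu_j, sigma2_j I).
        sq = ((X[:, None, :] - means[None]) ** 2).sum(-1)          # (n, k)
        log_r = (np.log(weights)[None]
                 - 0.5 * (sq / sigma2[None] + d * np.log(2 * np.pi * sigma2)[None]))
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and per-component variance.
        nk = r.sum(axis=0) + 1e-12               # guard against empty components
        weights = nk / n
        means = (r.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None]) ** 2).sum(-1)
        sigma2 = (r * sq).sum(axis=0) / (d * nk)
    return weights, means, sigma2
```

The paper's point is that when the components are well separated and d ≫ ln k, two such rounds already suffice for near-optimal parameter estimates with high probability.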