ELEN6887 Lecture 13: Maximum Likelihood Estimation and Complexity Regularization
Author
Abstract
Yi i.i.d. ∼ pθ∗, i = 1, . . . , n, where θ∗ ∈ Θ. We can view pθ∗ as a member of a parametric class of distributions, P = {pθ}θ∈Θ. Our goal is to use the observations {Yi} to select an appropriate distribution (i.e., model) from P. We would like the selected distribution to be close to pθ∗ in some sense. We use the negative log-likelihood loss function, defined as l(θ, Yi) = − log pθ(Yi). The empirical risk is R̂n(θ) = −(1/n) ∑_{i=1}^{n} log pθ(Yi).
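As a concrete illustration of this setup, the sketch below (not part of the original notes) takes the parametric class P to be Gaussian densities with unit variance and Θ to be a finite grid, and selects θ̂ by minimizing the empirical risk R̂n(θ), i.e., by maximum likelihood. The Gaussian family, the grid, and all variable names are assumptions made only for this example.

```python
# Minimal sketch: maximum likelihood as empirical risk minimization with the
# negative log-likelihood loss. Assumed for illustration only: P = {N(theta, 1)}
# and a finite grid for Theta.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
theta_star = 1.5                                     # true parameter theta*
n = 200
Y = rng.normal(loc=theta_star, scale=1.0, size=n)    # Y_i i.i.d. ~ p_{theta*}

def empirical_risk(theta, Y):
    """R_hat_n(theta) = -(1/n) * sum_i log p_theta(Y_i)."""
    return -np.mean(norm.logpdf(Y, loc=theta, scale=1.0))

Theta = np.linspace(-3.0, 3.0, 601)                  # candidate parameters
risks = np.array([empirical_risk(t, Y) for t in Theta])
theta_hat = Theta[np.argmin(risks)]                  # MLE = minimum empirical risk

print(f"theta* = {theta_star:.3f}, theta_hat = {theta_hat:.3f}")
```

In this Gaussian example the grid search is only for clarity; the minimizer of R̂n is of course the sample mean in closed form.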
Similar Resources
ELEN6887 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization
Yi i.i.d. ∼ pθ∗, i = 1, . . . , n, where θ∗ ∈ Θ. We can view pθ∗ as a member of a parametric class of distributions, P = {pθ}θ∈Θ. Our goal is to use the observations {Yi} to select an appropriate distribution (i.e., model) from P. We would like the selected distribution to be close to pθ∗ in some sense. We use the negative log-likelihood loss function, defined as l(θ, Yi) = − log pθ(Yi). The e...
ELEN6887 Lecture 15: Denoising Smooth Functions with Unknown Smoothness
Lipschitz functions are interesting, but can be very rough (they can have many kinks). In many situations the functions can be much smoother; this is how you might model, for example, the temperature inside a museum room. Often we don't know how smooth the function might be, so an interesting question is whether we can adapt to the unknown smoothness. In this lecture we will use the Maximum Complexit...
ELEN6887 Lecture 14: Denoising Smooth Functions with Unknown Smoothness
Lipschitz functions are interesting, but can be very rough (they can have many kinks). In many situations the functions can be much smoother; this is how you might model, for example, the temperature inside a museum room. Often we don't know how smooth the function might be, so an interesting question is whether we can adapt to the unknown smoothness. In this lecture we will use the Maximum Complexit...
ELEN6887 Lecture 12: Maximum Likelihood Estimation
We immediately notice the similarity between the empirical risk we had seen before and the negative log-likelihood. We will see that we can regard maximum likelihood estimation as our familiar empirical risk minimization when the loss function is chosen appropriately. In the meantime note that minimizing (1) yields our familiar square-error loss if the Wi's are Gaussian (a numerical sketch of this point appears after this list). If the Wi's are Laplacian (pW(w)...
ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization
Yi i.i.d. ∼ pθ∗, i = 1, . . . , n, where θ∗ ∈ Θ. We can view pθ∗ as a member of a parametric class of distributions, P = {pθ}θ∈Θ. Our goal is to use the observations {Yi} to select an appropriate distribution (i.e., model) from P. We would like the selected distribution to be close to pθ∗ in some sense. We use the negative log-likelihood loss function, defined as l(θ, Yi) = − log pθ(Yi). The e...
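As a rough illustration of the point in the Lecture 12 excerpt above, the sketch below (again an assumption-laden example, not code from the notes) checks numerically that the negative log-likelihood shares its minimizer with the squared-error loss when the noise is Gaussian, and with the absolute-error loss when the noise is Laplacian. The unit noise scale, the grid, and the variable names are assumptions made for the example.

```python
# Sketch: the negative log-likelihood is, up to additive and multiplicative
# constants, the squared-error loss under Gaussian noise and the absolute-error
# loss under Laplacian noise, so the criteria share a minimizer.
import numpy as np
from scipy.stats import norm, laplace

rng = np.random.default_rng(1)
theta_star = 0.7
Y = theta_star + rng.normal(scale=1.0, size=500)     # Y_i = theta* + W_i, Gaussian W_i

thetas = np.linspace(-2.0, 2.0, 401)
nll_gauss = np.array([-np.mean(norm.logpdf(Y, loc=t, scale=1.0)) for t in thetas])
sq_err    = np.array([np.mean((Y - t) ** 2) for t in thetas])
assert thetas[np.argmin(nll_gauss)] == thetas[np.argmin(sq_err)]

nll_lap = np.array([-np.mean(laplace.logpdf(Y, loc=t, scale=1.0)) for t in thetas])
abs_err = np.array([np.mean(np.abs(Y - t)) for t in thetas])
assert thetas[np.argmin(nll_lap)] == thetas[np.argmin(abs_err)]

print("Gaussian NLL minimizer:", thetas[np.argmin(nll_gauss)],
      "| Laplacian NLL minimizer:", thetas[np.argmin(nll_lap)])
```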