  • Likelihood & Loglikelihood

    One of the most fundamental concepts of modern statistics is that of likelihood. In each of the discrete random variables we have considered thus far, the distribution depends on one or more parameters that are, in most statistical applications, unknown. In the Poisson distribution, the parameter is λ. In the binomial, the parameter of interest is p (since n is typically fixed and known).

    Likelihood is a tool for summarizing the data’s evidence about unknown parameters. Let us denote the unknown parameter(s) of a distribution generically by θ. Since the probability distribution depends on θ, we can make this dependence explicit by writing f(x) as f(x ; θ). For example, in the Bernoulli distribution the parameter is θ = π, and the distribution is

     f(x; π) = π^x (1 − π)^(1 − x),    x = 0, 1    (2)

    Once a value of X has been observed, we can plug this observed value x into f(x ; π ) and obtain a function of π only. For example, if we observe X = 1, then plugging x = 1 into (2) gives the function π . If we observe X = 0, the function becomes 1 − π .

    Whatever function of the parameter we get when we plug the observed data x into f(x ; θ), we call that function the likelihood function.

    We write the likelihood function as L(θ; x) = ∏_{i=1}^{n} f(X_i; θ), or sometimes just L(θ). Algebraically, the likelihood L(θ ; x) is just the same as the distribution f(x ; θ), but its meaning is quite different because it is regarded as a function of θ rather than a function of x. Consequently, a graph of the likelihood usually looks very different from a graph of the probability distribution.

    For example, suppose that X has a Bernoulli distribution with unknown parameter π. We can graph the probability distribution for any fixed value of π. For example, if π = .5 we get this:

    [Plot: Bernoulli probability distribution f(x; π = .5), with spikes at x = 0 and x = 1]

    Now suppose that we observe a value of X, say X = 1. Plugging x = 1 into the distribution π^x (1 − π)^(1 − x) gives the likelihood function L(π ; x) = π, which looks like this:

    [Plot: likelihood L(π ; x = 1) = π, a straight line rising from 0 at π = 0 to 1 at π = 1]

    For discrete random variables, a graph of the probability distribution f(x ; θ) has spikes at specific values of x, whereas a graph of the likelihood L(θ ; x) is a continuous curve (e.g. a line) over the parameter space, the domain of possible values for θ.
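
    Neither figure is reproduced above, but a minimal R sketch along the following lines draws both pictures; π = .5 and x = 1 are the values used in the text, while the plotting details are only illustrative:

```r
# Minimal sketch: contrast the Bernoulli probability distribution for a
# fixed parameter with the likelihood function for a fixed observation.

# Probability distribution f(x; pi) with pi = 0.5: spikes at x = 0 and x = 1
pi0 <- 0.5
plot(c(0, 1), dbinom(c(0, 1), size = 1, prob = pi0), type = "h",
     lwd = 3, xlab = "x", ylab = "f(x; 0.5)",
     main = "Bernoulli distribution, pi = 0.5")

# Likelihood L(pi; x = 1) = pi: a continuous curve over the parameter space
pi.grid <- seq(0, 1, length.out = 200)
plot(pi.grid, pi.grid^1 * (1 - pi.grid)^0, type = "l",
     xlab = expression(pi), ylab = "L(pi; x = 1)",
     main = "Bernoulli likelihood after observing x = 1")
```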

    L(θ ; x) summarizes the evidence about θ contained in the event X = x. L(θ ; x) is high for values of θ that make X = x more likely, and small for values of θ that make X = x unlikely. In the Bernoulli example, observing X = 1 gives some (albeit weak) evidence that π is nearer to 1 than to 0, so the likelihood for x = 1 rises as π moves from 0 to 1.

    For example, if we observe x from Bin(n, π), the likelihood function is

    L(π | x) = [n! / ((n − x)! x!)] π^x (1 − π)^(n − x).


    Any multiplicative constant that does not depend on θ is irrelevant and may be discarded; thus,

    L(π | x) ∝ π^x (1 − π)^(n − x).
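
    As a quick illustration (a minimal R sketch with assumed values n = 10 and x = 3, not taken from the original notes), dropping the binomial coefficient rescales the likelihood but does not change where it peaks:

```r
# The full binomial likelihood and the kernel pi^x (1 - pi)^(n - x) differ
# only by the constant choose(n, x), so they peak at the same value of pi.
n <- 10; x <- 3                                      # assumed example values
pi.grid <- seq(0.001, 0.999, length.out = 999)

full   <- dbinom(x, size = n, prob = pi.grid)        # n!/((n-x)! x!) pi^x (1-pi)^(n-x)
kernel <- pi.grid^x * (1 - pi.grid)^(n - x)          # constant dropped

pi.grid[which.max(full)]                             # both maximized near x/n = 0.3
pi.grid[which.max(kernel)]
max(full) / max(kernel)                              # equals choose(n, x) = 120
```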

    Loglikelihood

    In most cases, for various reasons (most often computational convenience), we work with the loglikelihood

    l(θ | x) = log L(θ | x)



    which is defined up to an arbitrary additive constant. 

    For example, the binomial loglikelihood is 

    l(π | x) = x log π + (n − x) log(1 − π).
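
    To spell out the step: taking the log of the full binomial likelihood above gives

    l(π | x) = log n! − log (n − x)! − log x! + x log π + (n − x) log(1 − π),

    and the first three terms do not involve π, so they are exactly the kind of arbitrary additive constant that can be dropped, leaving the expression above.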

    In many problems of interest, we will derive our loglikelihood from a sample rather than from a single observation. If we observe an independent sample x_1, x_2, ..., x_n from a distribution f(x | θ), then the overall likelihood is the product of the individual likelihoods:

    L(θ | x) = ∏_{i=1}^{n} f(x_i | θ) = ∏_{i=1}^{n} L(θ | x_i)



    and the loglikelihood is: 

    l(θ | x) = log ∏_{i=1}^{n} f(x_i | θ) = ∑_{i=1}^{n} log f(x_i | θ) = ∑_{i=1}^{n} l(θ | x_i).
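
    For example, a minimal R sketch with a made-up Bernoulli sample computes the sample loglikelihood exactly this way, as a sum of per-observation contributions:

```r
# Log-likelihood of an i.i.d. Bernoulli sample as a sum over observations.
x <- c(1, 0, 1, 1, 0, 1, 0, 1, 1, 1)                # hypothetical sample, n = 10

loglik <- function(p, x) {
  sum(dbinom(x, size = 1, prob = p, log = TRUE))    # sum_i log f(x_i | p)
}

p.grid <- seq(0.01, 0.99, by = 0.01)
ll <- sapply(p.grid, loglik, x = x)

plot(p.grid, ll, type = "l", xlab = expression(pi),
     ylab = "l(pi | x)", main = "Bernoulli sample loglikelihood")
p.grid[which.max(ll)]                               # close to the sample mean, 0.7
```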



    Binomial loglikelihood examples:  
    Plot of the binomial loglikelihood function when n = 5 and we observe x = 0, x = 1, and x = 2 (see the lec1fig.R code on ANGEL for how to produce these figures):
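
    The lec1fig.R file itself is not reproduced here, but a minimal R sketch along the following lines draws comparable figures directly from the loglikelihood formula above:

```r
# Binomial loglikelihood l(pi | x) = x log(pi) + (n - x) log(1 - pi)
# for n = 5 and x = 0, 1, 2, each drawn over the parameter space (0, 1).
n <- 5
pi.grid <- seq(0.001, 0.999, length.out = 999)

for (x in 0:2) {
  ll <- x * log(pi.grid) + (n - x) * log(1 - pi.grid)
  plot(pi.grid, ll, type = "l", xlab = expression(pi),
       ylab = "loglikelihood",
       main = paste("Binomial loglikelihood, n = 5, x =", x))
}
```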

    In regular problems, as the total sample size n grows, the loglikelihood function does two things:

    • it becomes more sharply peaked around its maximum, and
    • its shape becomes nearly quadratic (i.e. a  parabola, if there is a single parameter).

    This is important since tests such as the Wald test, based on z = statistic / SE(statistic), work well only if the loglikelihood is approximately quadratic. For example, the loglikelihood for a normal-mean problem is exactly quadratic. As the sample size grows, the inference comes to resemble the normal-mean problem. This is true even for discrete data. The extent to which normal-theory approximations work for discrete data does not depend on how closely the distribution of responses resembles a normal curve, but on how closely the loglikelihood resembles a quadratic function.
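
    A minimal R sketch (with an assumed observed proportion x/n = 0.3 at n = 10 and n = 100) illustrates the point: the quadratic approximation built from the curvature at the maximum tracks the binomial loglikelihood much more closely for the larger sample:

```r
# Binomial loglikelihood versus its quadratic approximation around the
# maximum. The approximation l(pi.hat) - 0.5 * (pi - pi.hat)^2 * n / (pi.hat * (1 - pi.hat))
# uses the curvature (observed information) of the loglikelihood at pi.hat = x/n.
pi.grid <- seq(0.05, 0.7, length.out = 500)

for (n in c(10, 100)) {
  x <- 0.3 * n                                       # assumed proportion x/n = 0.3
  pi.hat <- x / n
  ll    <- x * log(pi.grid) + (n - x) * log(1 - pi.grid)
  llmax <- x * log(pi.hat) + (n - x) * log(1 - pi.hat)
  quad  <- llmax - 0.5 * n / (pi.hat * (1 - pi.hat)) * (pi.grid - pi.hat)^2
  plot(pi.grid, ll - llmax, type = "l", ylim = c(-10, 0),
       xlab = expression(pi), ylab = "l(pi) - l(pi.hat)",
       main = paste("n =", n, ": loglikelihood (solid) vs quadratic (dashed)"))
  lines(pi.grid, quad - llmax, lty = 2)
}
```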

    Transformations may help us improve the shape of the loglikelihood. More on this in Section 1.6 on Alternative Parametrizations. Next we will see how we use the likelihood, that is, the corresponding loglikelihood, to estimate the most likely value of the unknown parameter of interest.

    from: https://onlinecourses.science.psu.edu/stat504/node/27

  • Original post: https://www.cnblogs.com/GarfieldEr007/p/5232140.html