Equations in Statistics

Equations in Probability


Basic formulas

$$P(A\cap B) = P(AB)$$

$$P(A\cup B) = P(A) + P(B) - P(AB)$$

If $A$ and $B$ are mutually exclusive,

$$P(AB) = 0, \qquad P(A\cup B) = P(A) + P(B)$$

If $A$ and $B$ are independent,

$$P(AB) = P(A)\times P(B)$$

If $A_1, A_2, \dots, A_k$ are mutually independent,

$$P(A_1A_2\dots A_k) = \prod_{i=1}^k P(A_i)$$
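The addition rule and the independence condition can be checked on a small finite sample space. The sketch below assumes one roll of a fair six-sided die; the events `A` and `B` and the helper `prob` are illustrative choices, not part of the original notes.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of omega) with equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}  # "the roll is even"
B = {1, 2}     # "the roll is at most 2"

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(AB)
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)

# Independence: P(AB) = P(A) * P(B)
independent = prob(A & B) == prob(A) * prob(B)

print(lhs, rhs, independent)  # 2/3 2/3 True
```

Exact fractions are used so that the two sides can be compared with `==` without rounding concerns.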

Conditional probability

$$P(B|A) = \frac{P(AB)}{P(A)}$$

Law of total probability, for events $A_1, \dots, A_n$ that partition the sample space:

$$P(B) = \sum_{i=1}^n P(B|A_i)\,P(A_i)$$

Bayes' formula

$$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(A)P(B|A)}{P(B)}$$
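A short numeric sketch of Bayes' formula, with $P(B)$ obtained from the law of total probability over the partition $\{A, \bar{A}\}$. All three input probabilities are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical inputs (not from the original notes).
p_a = Fraction(1, 100)              # P(A): prior probability of A
p_b_given_a = Fraction(95, 100)     # P(B|A)
p_b_given_not_a = Fraction(5, 100)  # P(B|not A)

# Law of total probability over the partition {A, not A}.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' formula: P(A|B) = P(A) P(B|A) / P(B)
p_a_given_b = p_a * p_b_given_a / p_b

print(p_b, p_a_given_b)  # 59/1000 19/118
```

With these numbers the posterior is about 0.16, well below $P(B|A) = 0.95$, because the prior $P(A)$ is small.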

Cumulative probability

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \qquad -\infty < x < \infty$$

Samples and estimation

Expectation

$$E(X) = \begin{cases} \sum_{i=1}^{n} x_i p_i, & X \text{ discrete} \\ ~ \\ \int_{-\infty}^{\infty} x f(x)\,dx, & X \text{ continuous} \end{cases}$$

$$E(X) ~\widehat{=}~ \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}$$

$$E(X) = \int_{-\infty}^{\infty} x f(x;\theta)\,dx$$

Variance

$$D(X) = E(X^2) - \big[E(X)\big]^2$$

$$D(X) = E\Big[\big(X - E(X)\big)^2\Big] = \sum_{i=1}^{n} \big(x_i - E(X)\big)^2 p_i, \qquad \sigma(X) = \sqrt{D(X)}$$
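The expectation and the two variance expressions can be compared on a concrete discrete distribution. This sketch assumes a fair six-sided die (values 1 to 6, each with probability 1/6); the variable names are illustrative.

```python
import math
from fractions import Fraction

# Assumed distribution: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

# E(X) = sum of x_i * p_i
mean = sum(x * p for x, p in zip(values, probs))

# D(X) via the shortcut E(X^2) - E(X)^2
second_moment = sum(x * x * p for x, p in zip(values, probs))
var_shortcut = second_moment - mean ** 2

# D(X) via the definition: sum of (x_i - E(X))^2 * p_i
var_definition = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Standard deviation is the square root of the variance.
std = math.sqrt(var_definition)

print(mean, var_shortcut, var_definition)  # 7/2 35/12 35/12
```

Both variance expressions give the same value, which is the identity $D(X) = E(X^2) - [E(X)]^2$ in action.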

$$\mathrm{Cov}(X,Y) = E\Big[\big(X - E(X)\big)\big(Y - E(Y)\big)\Big] = E(XY) - E(X)E(Y)$$

Third moment

Likelihood function and maximum likelihood estimation

$$P(x) \to P(x;\theta)$$

$$L(\theta) = \begin{cases} \prod_{i=1}^{n} P(x_i;\theta), & \text{discrete case} \\ ~ \\ \prod_{i=1}^{n} f(x_i;\theta), & \text{continuous case} \end{cases}$$

$$L(\hat{\theta}) = \max_{\theta} L(\theta)$$
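As a minimal sketch of maximum likelihood estimation, the code below assumes a Bernoulli model $P(x;\theta) = \theta^x (1-\theta)^{1-x}$ and a small made-up sample. It maximizes the log-likelihood over a grid of candidate $\theta$ values and compares the result with the sample mean, which is the known closed-form maximizer for this model. The data and the grid resolution are assumptions for illustration.

```python
import math

# Hypothetical sample: 7 successes in 10 Bernoulli trials.
data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

def log_likelihood(theta, xs):
    """log L(theta) = sum of log P(x_i; theta) for a Bernoulli model."""
    return sum(math.log(theta) if x == 1 else math.log(1.0 - theta) for x in xs)

# Candidate values 0.001, 0.002, ..., 0.999 (endpoints excluded to avoid log(0)).
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, data))

# Closed-form maximizer for the Bernoulli model: the sample mean.
closed_form = sum(data) / len(data)

print(theta_hat, closed_form)
```

Working with the log-likelihood instead of the raw product avoids numerical underflow for larger samples and does not change the maximizer.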

Properties of estimators

Unbiasedness

$$E(\hat{\theta}) = \theta \implies \begin{cases} E(\hat{\mu}) = \mu \\ ~ \\ E(\hat{\sigma}^2) = \sigma^2 \end{cases}$$
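Unbiasedness can be verified exactly on a tiny example by enumerating every possible sample. The sketch below assumes a population of three equally likely values and samples of size 2 drawn with replacement, then compares the expected value of the variance estimator that divides by $n$ with the one that divides by $n-1$. The population and the helper `var_estimate` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Assumed population: three equally likely values.
population = [1, 2, 3]
n = 2  # sample size

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# All ordered samples of size n drawn with replacement (each equally likely).
samples = list(product(population, repeat=n))

def var_estimate(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / (len(sample) - ddof)

# Expected value of each estimator = average over all equally likely samples.
expected_biased = sum(var_estimate(s, 0) for s in samples) / len(samples)
expected_unbiased = sum(var_estimate(s, 1) for s in samples) / len(samples)

print(sigma2, expected_biased, expected_unbiased)  # 2/3 1/3 2/3
```

Only the $n-1$ version satisfies $E(\hat{\sigma}^2) = \sigma^2$ here; the version dividing by $n$ underestimates by the factor $(n-1)/n$.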

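Efficiency

Among unbiased estimators, $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ if

$$\mathrm{Var}(\hat{\theta}_1) \le \mathrm{Var}(\hat{\theta}_2)$$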

Chebyshev's inequality

$$P\Big\{|X - \mu| \ge \epsilon\Big\} \le \frac{\sigma^2}{\epsilon^2}$$
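Chebyshev's inequality can be checked exactly for a simple distribution. This sketch assumes a fair six-sided die and a few hand-picked values of $\epsilon$; it computes the true tail probability $P\{|X-\mu| \ge \epsilon\}$ and the bound $\sigma^2/\epsilon^2$ side by side.

```python
from fractions import Fraction

# Assumed distribution: a fair six-sided die.
values = range(1, 7)
p = Fraction(1, 6)

mu = sum(x * p for x in values)
var = sum((x - mu) ** 2 * p for x in values)

def tail(eps):
    """Exact P(|X - mu| >= eps)."""
    return sum(p for x in values if abs(x - mu) >= eps)

# Map each epsilon to (exact tail probability, Chebyshev bound).
checks = {
    eps: (tail(eps), var / eps ** 2)
    for eps in (Fraction(3, 2), Fraction(2), Fraction(5, 2))
}
holds = all(t <= bound for t, bound in checks.values())

for eps, (t, bound) in checks.items():
    print(eps, t, bound)
```

The bound is loose for this distribution (for $\epsilon = 3/2$ it even exceeds 1), which is typical: Chebyshev's inequality trades sharpness for generality.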

Law of large numbers

$$\forall \epsilon > 0, \qquad \lim_{n\to\infty} P\Big\{ \Big|\frac{1}{n}\sum_{i=1}^{n} x_i - \mu\Big| < \epsilon \Big\} = 1$$
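A small simulation can illustrate the statement: the running mean of repeated die rolls should drift toward $\mu = 3.5$ as $n$ grows. The seed, the checkpoints, and the choice of a fair die are assumptions made for this sketch; a single run illustrates the law, it does not prove it.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
mu = 3.5                 # expected value of a fair six-sided die

running_sum = 0
checkpoints = {}
for n in range(1, 100_001):
    running_sum += rng.randint(1, 6)  # one die roll, inclusive of both ends
    if n in (10, 1_000, 100_000):
        checkpoints[n] = running_sum / n

# Print the sample mean and its distance from mu at each checkpoint.
for n, mean in checkpoints.items():
    print(n, mean, abs(mean - mu))
```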

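Central limit theorem (error theorem)

For independent, identically distributed $X_1, X_2, \cdots, X_n \in \Omega$ with mean $\mu$ and variance $\sigma^2$, for large $n$ approximately

$$\bar{X} \sim N\Big(\mu, \frac{\sigma^2}{n}\Big)$$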
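The following simulation sketch draws many sample means from a Uniform(0, 1) distribution (so $\mu = 1/2$ and $\sigma^2 = 1/12$) and compares their spread with the $N(\mu, \sigma^2/n)$ prediction. The sample size, number of trials, and seed are arbitrary choices for illustration.

```python
import math
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
n, trials = 50, 4000     # sample size and number of repeated samples
mu, sigma2 = 0.5, 1 / 12 # mean and variance of Uniform(0, 1)

# One sample mean per trial.
means = [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)
predicted_var = sigma2 / n

# Fraction of sample means within one predicted standard deviation of mu;
# for a normal distribution this is about 0.683.
sd = math.sqrt(predicted_var)
within_one_sd = sum(abs(m - mu) <= sd for m in means) / trials

print(mean_of_means, var_of_means, predicted_var, within_one_sd)
```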

Common distributions


Bernoulli distribution

$$P(X=1) = p, \qquad P(X=0) = 1-p$$

Binomial distribution

$$P(k) = C_n^k p^k (1-p)^{n-k}$$
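The binomial probabilities can be evaluated directly with `math.comb`. The sketch below uses assumed parameters $n = 10$, $p = 0.3$ and checks two standard properties: the probabilities sum to 1, and the mean equals $np$.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative parameters
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                               # should be 1 up to rounding
mean = sum(k * q for k, q in enumerate(pmf))   # should be n * p up to rounding

print(total, mean)
```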

Geometric distribution

$$P(r) = (1-p)^{r-1}p$$

Poisson distribution

$$X \sim Po(\lambda), \qquad P(X=r;\lambda) = e^{-\lambda}\frac{\lambda^r}{r!}$$
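The Poisson probability can be computed from the formula and compared with a binomial distribution with many trials and a small success probability $p = \lambda/n$, which the Poisson distribution approximates. The values $\lambda = 2$, $r = 3$, and $n = 10000$ are arbitrary choices for this sketch.

```python
from math import comb, exp, factorial

def poisson_pmf(r, lam):
    """P(X = r; lam) = exp(-lam) * lam^r / r!"""
    return exp(-lam) * lam ** r / factorial(r)

def binom_pmf(k, n, p):
    """Binomial probability, used here only for comparison."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

lam, r = 2.0, 3
exact = poisson_pmf(r, lam)

# Binomial with many trials and p = lam / n as an approximation.
n = 10_000
approx = binom_pmf(r, n, lam / n)

# The probabilities over r = 0, 1, 2, ... should sum to 1 (truncated at 100 terms).
total = sum(poisson_pmf(k, lam) for k in range(100))

print(exact, approx, total)
```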

Exponential distribution

$$f(x;\lambda) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ ~ \\ 0, & \text{otherwise} \end{cases}$$
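Integrating this density gives the cumulative probability $F(x) = 1 - e^{-\lambda x}$ for $x > 0$. The sketch below checks that numerically with a midpoint rule; the step count and the parameter values $\lambda = 2$, $x = 1$ are assumptions for illustration.

```python
import math

def exp_pdf(x, lam):
    """Exponential density: lam * exp(-lam * x) for x > 0, else 0."""
    return lam * math.exp(-lam * x) if x > 0 else 0.0

def cdf_numeric(x, lam, steps=10_000):
    """Midpoint-rule approximation of the integral of the density from 0 to x."""
    if x <= 0:
        return 0.0
    h = x / steps
    return sum(exp_pdf((i + 0.5) * h, lam) for i in range(steps)) * h

lam, x = 2.0, 1.0
numeric = cdf_numeric(x, lam)
closed_form = 1.0 - math.exp(-lam * x)

print(numeric, closed_form)
```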

Uniform distribution

$$f(x) = \begin{cases} \frac{1}{b-a}, & \text{if } a \le x \le b \\ ~ \\ 0, & \text{otherwise} \end{cases}$$

Normal distribution

Power-law distribution

Data analysis


Estimating the expectation

  1. Mean
  2. Median
  3. Mode
  4. Data measures
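The first three summaries in the list above are available in Python's standard `statistics` module. The data set here is made up for illustration.

```python
import statistics

# Hypothetical data set.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value (average of the two middle values here)
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)  # 5 4.0 3
```

For this sample the three values differ, because the single large observation pulls the mean above the median.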

Data sampling

  1. Equal-probability sampling
  2. Unequal-probability sampling

$$E_{x \sim p}[f(x)] = \int f(x)p(x)\,dx = \int f(x)q(x)\frac{p(x)}{q(x)}\,dx = E_{x \sim q}\Big[f(x)\frac{p(x)}{q(x)}\Big]$$
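This identity is the basis of importance sampling: draw from $q$ and reweight each draw by $p(x)/q(x)$. The sketch below assumes $p = N(0, 1)$, $q = N(0, 2^2)$, and $f(x) = x^2$, for which $E_{x \sim p}[f(x)] = 1$. These distributions, the sample size, and the seed are illustrative choices, not part of the original notes.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def f(x):
    return x * x

rng = random.Random(0)  # fixed seed so the run is repeatable
num_samples = 100_000

total = 0.0
for _ in range(num_samples):
    x = rng.gauss(0.0, 2.0)                                      # draw x from q
    weight = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)   # p(x) / q(x)
    total += f(x) * weight

estimate = total / num_samples  # Monte Carlo estimate of E_{x~p}[f(x)]
print(estimate)
```

The proposal $q$ must be positive wherever $f(x)p(x)$ is nonzero; a wider $q$ than $p$, as assumed here, keeps the weights bounded.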