
Probability Review

Logic Table

$P \& Q \rightarrow P$

$P \rightarrow P \vee Q$

$\sim (P \vee Q) \leftrightarrow\ \sim P \ \&\ \sim Q$

$\sim (P \& Q) \leftrightarrow\ \sim P \vee \sim Q$

Set Theory

$x \in A \subset \Omega$

$A^C = \{x \in \Omega,\ x \notin A\}$

$A \cup B = \{x \in \Omega,\ x \in A \vee x \in B\}$

$A \cap B = \{x \in \Omega,\ x \in A \ \&\ x \in B\}$

$A \subset B \leftrightarrow \forall x \in A,\ x \in B$

$A = B \leftrightarrow A \subset B,\ B \subset A$

$A - B = A \cap B^C$

$A \cap B \subset A \subset A \cup B$

De Morgan's Laws

$(A \cup B)^C = A^C \cap B^C$

$(A \cap B)^C = A^C \cup B^C$
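Both laws can be checked numerically on a small finite universe (a minimal sketch; the universe and the sets $A$, $B$ are arbitrary choices):

```python
# Verify De Morgan's laws on a small finite universe (arbitrary example sets).
Omega = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

def complement(S):
    return Omega - S

# (A ∪ B)^C = A^C ∩ B^C
law1 = complement(A | B) == complement(A) & complement(B)
# (A ∩ B)^C = A^C ∪ B^C
law2 = complement(A & B) == complement(A) | complement(B)

assert law1 and law2
```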

Probability

$x \in A \subset \Omega$, with $A \in \alpha \subset 2^{\Omega}$

$\alpha$ is a $\sigma$-algebra if and only if it is CUT: closed under Complementation, closed under countable Union, and contains the Total space $\Omega$.

$(\Omega, \alpha)$ is the measurable space.

$P: \alpha \rightarrow [0, 1]$ and CA (Countably Additive):

$P(\cup_{k=1}^{\infty} A_k) = \sum\limits_{k=1}^{\infty} P(A_k)$ if $A_i \cap A_j = \emptyset\ \forall i \neq j$, and $P(\Omega) = 1$

$(\Omega, \alpha, P)$ is the probability space.

$A$ and $B$ are mutually exclusive:

$A \cap B = \emptyset$

$A$ and $B$ are independent:

$P(A \cap B) = P(A)\, P(B)$

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
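Inclusion–exclusion can be verified exactly on a uniform finite space (a sketch with a single fair die; the events are arbitrary):

```python
from fractions import Fraction

# Inclusion–exclusion on a uniform finite space: one fair die, arbitrary events.
Omega = set(range(1, 7))

def P(S):
    # uniform probability of an event S ⊂ Ω, as an exact fraction
    return Fraction(len(S), len(Omega))

A = {2, 4, 6}   # even roll
B = {4, 5, 6}   # roll at least 4

assert P(A | B) == P(A) + P(B) - P(A & B)   # 2/3 on both sides
```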

Multiplication Theorem

$P(\cap_{k=1}^{n} A_k) = P(A_1)\, P(A_2 | A_1) \cdots P(A_n | A_1 \cap A_2 \cap \cdots \cap A_{n-1})$

If the $A_k$ are independent:

$P(\cap_{k=1}^{n} A_k) = \prod\limits_{k=1}^n P(A_k)$

$P(A|B) = {P(A \cap B) \over P(B)} \stackrel{\text{ind}}{=} P(A)$

Partition

$\{H_k\}$ is a partition means

$H_i \cap H_j = \emptyset,\ \forall i \neq j$

$\cup_{k=1}^{n} H_k = \Omega$

Total Probability

$P(E) = \sum\limits_k P(H_k)\, P(E | H_k)$

Bayes' Theorem

If $\{H_k\}$ partitions $\Omega$, then

$P(H_j | E) = {P(E|H_j)\, P(H_j) \over \sum\limits_{k} P(E|H_k)\, P(H_k)}$
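A small numerical sketch of total probability and Bayes' theorem together, using a two-event partition with made-up prior and likelihood values:

```python
# Bayes' theorem with a two-event partition {H1, H2} (made-up numbers).
P_H = [0.3, 0.7]            # priors P(H_k); must sum to 1
P_E_given_H = [0.9, 0.2]    # likelihoods P(E | H_k)

# Total probability: P(E) = Σ_k P(H_k) P(E | H_k)
P_E = sum(p * l for p, l in zip(P_H, P_E_given_H))

# Posterior P(H_k | E) for each hypothesis
posterior = [p * l / P_E for p, l in zip(P_H, P_E_given_H)]

assert abs(P_E - 0.41) < 1e-12
assert abs(sum(posterior) - 1.0) < 1e-12   # posteriors form a distribution
```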

Binomial Theorem

$(p + q)^n = \sum\limits_{k=0}^n {n \choose k} p^k q^{n-k}$

$\sum\limits_{j=1}^{\infty} a^j = {a \over 1-a},\ |a| < 1$

$S_N = \sum\limits_{j=1}^N a^j$

$S_N = a + a^2 + \cdots + a^N$ — ①

$aS_N = a^2 + \cdots + a^{N+1}$ — ②

If we subtract ② from ①, we get

$S_N = {a - a^{N+1} \over 1-a}$

$\sum\limits_{k=1}^{\infty} k a^k = {a \over (1-a)^2}$
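The closed forms above can be checked against brute-force sums (a sketch with an arbitrary ratio $a = 0.5$; the infinite series are truncated where the tail is negligible):

```python
# Check S_N = (a - a^(N+1)) / (1 - a) and the two infinite sums numerically.
a, N = 0.5, 20

S_N_direct = sum(a**j for j in range(1, N + 1))
S_N_closed = (a - a**(N + 1)) / (1 - a)
assert abs(S_N_direct - S_N_closed) < 1e-12

# Σ a^j = a / (1-a); truncate far enough that the tail is negligible
S_inf = sum(a**j for j in range(1, 200))
assert abs(S_inf - a / (1 - a)) < 1e-12

# Σ k a^k = a / (1-a)^2
kS = sum(k * a**k for k in range(1, 200))
assert abs(kS - a / (1 - a)**2) < 1e-12
```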

$\lim\limits_{n \to \infty} \left(1 + {x \over n}\right)^n = e^x$

| Number of Outcomes | With Replacement | Without Replacement |
| --- | --- | --- |
| 2 | Binomial (different when until*...) | Hypergeometric |
| $\geq 3$ | Multinomial | Multivariate Hypergeometric |

until*:

  • $1^{\text{st}}$ success → geometric
  • $r^{\text{th}}$ success → negative binomial

Poisson Distribution

$P(X = x) = {e^{-\lambda} \lambda^x \over x!},\ x = 0, 1, 2, \ldots$

$\text{Binomial}(n, p) \rightarrow^d \text{Poisson}(\lambda)$ if $n \gg 1$, $p \ll 1$, and $\lambda = np$
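The Poisson approximation can be seen numerically (a sketch with illustrative values $n = 1000$, $p = 0.003$, so $\lambda = np = 3$):

```python
import math

# Compare the Binomial(n, p) pmf with the Poisson(np) pmf for large n, small p.
n, p = 1000, 0.003
lam = n * p   # λ = np = 3

def binom_pmf(k):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k):
    return math.exp(-lam) * lam**k / math.factorial(k)

# The two pmfs agree to roughly 3 decimal places across the bulk of the mass
for k in range(8):
    assert abs(binom_pmf(k) - poisson_pmf(k)) < 1e-3
```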

Continuous

If $X \sim N(0,1)$, then $Z = X^2 \sim \chi^2(1)$

Beta($\alpha$, $\beta$): $0 < x < 1$

Uniform($a$, $b$): $a < x < b$

Gamma $\gamma(\alpha, \theta)$

$f(x) = {x^{\alpha - 1} \over \Gamma(\alpha)\, \theta^\alpha}\, e^{-x / \theta}$

Exponential($\theta$) = $\gamma(\alpha = 1, \theta)$

Chi-squared($\nu$) = $\gamma(\alpha = {\nu \over 2}, \theta = 2)$

$X \sim N(\mu, \sigma_x^2)$: $f(x) = {1 \over \sqrt{2 \pi}\, \sigma_x}\, e^{-{(x-\mu)^2 \over 2\sigma_x^2}}$

$Y = g(X)$

$f_Y(y) = \sum\limits_{x_k} f_X(x_k) \left|{dx \over dy}\right|_{x = x_k}$, where the $x_k$ are the roots of $y = g(x)$
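The change-of-variables formula reproduces the $X \sim N(0,1) \Rightarrow X^2 \sim \chi^2(1)$ fact above; a sketch applying it with the two roots $x = \pm\sqrt{y}$, each with $|dx/dy| = 1/(2\sqrt{y})$:

```python
import math

# Change of variables: X ~ N(0,1), Y = X², roots x = ±√y, |dx/dy| = 1/(2√y).
def phi(x):
    # standard normal density
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

def f_Y(y):
    # sum f_X over both roots, weighted by |dx/dy|
    return (phi(math.sqrt(y)) + phi(-math.sqrt(y))) / (2 * math.sqrt(y))

def chi2_1(y):
    # χ²(1) density
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

for y in (0.1, 0.5, 1.0, 2.0, 5.0):
    assert abs(f_Y(y) - chi2_1(y)) < 1e-12
```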

Moments

$\mathbb{E}[aX+b] = a\,\mathbb{E}[X] + b$

$\mathbb{V}[aX+b] = a^2\,\mathbb{V}[X]$

$X \sim \gamma(\alpha, \theta) \Rightarrow \mathbb{E}[X^k] = {\Gamma(\alpha+k) \over \Gamma(\alpha)}\, \theta^k$

$\Gamma(\alpha+1) = \alpha\, \Gamma(\alpha)$, $\Gamma(1) = 1$, $\Gamma({1 \over 2}) = \sqrt{\pi}$

Uncertainty Principle (Cauchy–Schwarz)

$\sigma_{xy}^2 \leq \sigma_{x}^2\, \sigma_{y}^2$

Covariance

$\sigma_{xy} = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]$, where $\mathbb{E}[XY]$ is the correlation.

$\rho_{xy} = {\sigma_{xy} \over \sigma_x \sigma_y}$, $-1 \leq \rho_{xy} \leq 1$

Weak Law of Large Numbers

  • Expectation of the Sample Mean: $\mathbb{E}[\overline{X_n}] = \mu_x$
  • Variance of the Sample Mean: $\mathbb{V}[\overline{X_n}] = {\sigma_x^2 \over n}$

The sample mean $\hat{\theta}_n$ converges to the population mean $\theta$ in mean square: $\lim\limits_{n \to \infty} \mathbb{E}[(\hat{\theta}_n - \theta)^2] = 0$

$\lim\limits_{n \to \infty} \left(\mathbb{V}[\hat{\theta}_n] + (\mathbb{E}[\hat{\theta}_n] - \theta)^2\right) = 0$

$\lim\limits_{n\to\infty} {\sigma_x^2 \over n} + (\mu_x - \mu_x)^2 = 0$

$\hat{\theta}_n \to \theta$ in probability:

$\forall \epsilon > 0$, $\lim\limits_{n \to \infty} P(|\hat{\theta}_n - \theta| > \epsilon) = 0$

$\lim\limits_{n \to \infty} P(|\overline{X_n} - \mu_x| > \epsilon) \leq \lim\limits_{n \to \infty} {\sigma_x^2 \over n \epsilon^2} = 0$

MI (Markov's Inequality)

$X \geq 0$, $c \in \mathbb{R}^+$, $\mathbb{E}[X] < \infty$

$P(X \geq c) \leq {\mathbb{E}[X] \over c}$

CI (Chebyshev's Inequality)

$\sigma_x^2 < \infty$

$P(|X - \mu_x| > \epsilon) \leq {\sigma_x^2 \over \epsilon^2}$
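A Monte Carlo sketch of Chebyshev's inequality for a standard normal ($\mu = 0$, $\sigma^2 = 1$, so the bound at $\epsilon = 2$ is $1/4$, while the true probability is about $0.046$):

```python
import random

# Monte Carlo check of Chebyshev's inequality for X ~ N(0, 1).
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]

eps = 2.0
empirical = sum(abs(x) > eps for x in xs) / len(xs)
bound = 1.0 / eps**2            # σ²/ε² with σ² = 1, μ = 0

assert empirical <= bound       # ≈ 0.046 vs. the loose bound 0.25
```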

Standardizing RV

Subtract the mean and divide by the standard deviation: $Z = {X - \mu_x \over \sigma_x}$

Beta Binomial Conjugacy

Prior: $h(\theta) \sim \text{Beta}(\alpha, \beta)$

Likelihood: $g(x|\theta) \sim \text{Binomial}(n, \theta)$

Posterior

$f(\theta | x) = {g(x | \theta)\, h(\theta) \over \int\limits_{\theta} g(x|\theta)\, h(\theta)\, d\theta}$

$\therefore f(\theta | x) \sim \text{Beta}(\alpha + x,\ \beta + n - x)$
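The conjugacy can be verified numerically by normalizing prior × likelihood on a grid and comparing against the claimed Beta posterior (a sketch; $\alpha$, $\beta$, $n$, $x$ are arbitrary illustrative values):

```python
import math

# Grid check: Beta(α,β) prior × Binomial(n,θ) likelihood, normalized,
# should equal the Beta(α+x, β+n−x) density.
alpha, beta, n, x = 2.0, 3.0, 10, 4

def beta_pdf(t, a, b):
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return t**(a - 1) * (1 - t)**(b - 1) / B

def likelihood(t):
    return math.comb(n, x) * t**x * (1 - t)**(n - x)

# Midpoint-rule normalization of prior × likelihood over (0, 1)
ts = [(i + 0.5) / 10_000 for i in range(10_000)]
Z = sum(beta_pdf(t, alpha, beta) * likelihood(t) for t in ts) / 10_000

for t in (0.2, 0.4, 0.6):
    post = beta_pdf(t, alpha, beta) * likelihood(t) / Z
    assert abs(post - beta_pdf(t, alpha + x, beta + n - x)) < 1e-3
```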

Maximum Likelihood Estimation

$f(\mathbb{D} | \theta)$

$\hat{\theta}_\text{ML} = \text{argmax}_{\theta}\, g(x|\theta) = \text{argmax}_{\theta}\, \ln g(x | \theta)$

$\hat{\theta}_\text{ML} = \text{argmax}_{\theta}\, g(x_1,\ x_2,\ \cdots,\ x_n | \theta) = \text{argmax}_{\theta} \prod\limits_{k=1}^{n} g(x_k|\theta)$ (i.i.d. / r.s.) $= \text{argmax}_{\theta} \sum\limits_{k=1}^{n} \ln g(x_k|\theta)$

${\partial L \over \partial \theta} \Big|_{\theta = \hat\theta_\text{ML}} = 0$

$\therefore$ Check ${\partial^2 L \over \partial \theta^2} \Big|_{\theta = \hat\theta_\text{ML}} < 0$

$\widehat{h(\theta)}_\text{ML} = h(\hat\theta_\text{ML})$

Example: $x_1, \cdots, x_n \sim \text{Geometric}(p)$; find $\hat{\sigma^2}_\text{ML}$.

$\widehat{\left({1 \over p}\right)} = \overline{X_n} \Rightarrow \hat{p} = {1 \over \overline{x_n}}$

$\hat{\sigma^2}_\text{ML} = \widehat{\left({q \over p^2}\right)} = \widehat{\left({1-p \over p^2}\right)} = {1 - {1 \over \overline{x_n}} \over \left({1 \over \overline{x_n}}\right)^2}$
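A sketch confirming $\hat{p} = 1/\overline{x}$ for the geometric model by comparing it against a numerical argmax of the log-likelihood (the sample values are made up):

```python
import math

# Geometric(p) on {1, 2, ...}: check p̂_ML = 1/x̄ against a numerical argmax.
data = [1, 3, 2, 5, 1, 2, 4, 1, 2, 3]     # made-up sample, x̄ = 2.4
xbar = sum(data) / len(data)
p_hat = 1 / xbar

def log_lik(p):
    # log g(x|p) = log p + (x − 1) log(1 − p), summed over the sample
    return sum(math.log(p) + (x - 1) * math.log(1 - p) for x in data)

# Crude grid search for the maximizer of the log-likelihood
grid = [i / 1000 for i in range(1, 1000)]
p_num = max(grid, key=log_lik)

assert abs(p_num - p_hat) < 1e-2
```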

SIT Technique (iterated expectation)

$\mathbb{E}_x[X] = \mathbb{E}_y [\mathbb{E}[X|Y]]$

$\mathbb{E}[g(x,\ y,\ z)] = \mathbb{E}_z [\mathbb{E}_{y|z}[\mathbb{E}_{x|y,z}[g(x,\ y,\ z)|y,\ z]]]$

Central Limit Theorem

If $x_1, \cdots, x_n$ are i.i.d. and $\sigma_x^2 < \infty$, then:

$\text{std}(\overline{X_n}) \rightarrow^d Z \sim N(0,\ 1)$

$\text{std}(\sum\limits_{k=1}^n X_k) \rightarrow^d Z \sim N(0,\ 1)$

$\text{std}(\overline{X_n}) = {\overline{X_n} - \mu_x \over \sigma_x / \sqrt{n}}$

$\text{std}(\sum\limits_{k=1}^n X_k) = {\sum\limits_{k=1}^n X_k - n\mu_x \over \sigma_x \sqrt{n}}$

$\mathbb{V}[\sum\limits_{k=1}^n X_k] = n \sigma_x^2 = n^2 \times {\sigma_x^2 \over n}$
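A Monte Carlo sketch of the CLT: standardized means of Uniform(0, 1) samples ($\mu = 1/2$, $\sigma^2 = 1/12$) should have mean $\approx 0$ and standard deviation $\approx 1$:

```python
import random, statistics

# Monte Carlo check that the standardized sample mean is approximately N(0, 1).
random.seed(1)
n = 50
mu, var = 0.5, 1 / 12            # Uniform(0,1): μ = 1/2, σ² = 1/12
sigma = var ** 0.5

zs = []
for _ in range(20_000):
    xbar = sum(random.random() for _ in range(n)) / n
    zs.append((xbar - mu) / (sigma / n**0.5))

assert abs(statistics.mean(zs)) < 0.05
assert abs(statistics.pstdev(zs) - 1.0) < 0.05
```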
