Logic Table
P & Q → P
P → P ∨ Q
∼(P ∨ Q) ↔ ∼P & ∼Q
∼(P & Q) ↔ ∼P ∨ ∼Q
Set Theory
x∈A⊂Ω
Aᶜ = {x ∈ Ω : x ∉ A}
A ∪ B = {x ∈ Ω : x ∈ A ∨ x ∈ B}
A ∩ B = {x ∈ Ω : x ∈ A & x ∈ B}
A ⊂ B ↔ ∀x ∈ A, x ∈ B
A = B ↔ A ⊂ B and B ⊂ A
A − B = A ∩ Bᶜ
A ∩ B ⊂ A ⊂ A ∪ B
De Morgan's Law
(A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
(A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
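De Morgan's laws can be checked directly with Python's built-in set operations; the universe Ω and the sets A, B below are arbitrary toy examples:

```python
# Toy universe and two arbitrary subsets.
Omega = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(S):
    """Complement of S relative to the universe Omega."""
    return Omega - S

# (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
assert complement(A | B) == complement(A) & complement(B)
# (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
assert complement(A & B) == complement(A) | complement(B)
```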
Probability
x ∈ A ⊂ Ω, A ∈ α ⊂ 2^Ω
α is a σ-algebra if and only if it contains Ω and is closed under complementation and countable unions.
(Ω, α) is the measurable space.
P : α → [0, 1] and CA (Countably Additive):
P(∪_{k=1}^∞ A_k) = Σ_{k=1}^∞ P(A_k) if A_i ∩ A_j = ∅ ∀i ≠ j, and P(Ω) = 1
(Ω, α, P) is the probability space.
A and B are mutually exclusive:
A ∩ B = ∅
A and B are independent:
P(A ∩ B) = P(A) P(B)
P(A∪B)=P(A)+P(B)−P(A∩B)
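Independence and inclusion-exclusion can be verified exactly on a small sample space; here two fair dice (a standard toy example), with probabilities as exact fractions:

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of two fair dice, each outcome with probability 1/36.
Omega = list(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in Omega if event(w)), len(Omega))

A = lambda w: w[0] == 6  # first die shows 6
B = lambda w: w[1] == 6  # second die shows 6

# Independence: P(A ∩ B) = P(A) P(B)
assert P(lambda w: A(w) and B(w)) == P(A) * P(B)
# Inclusion-exclusion: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
assert P(lambda w: A(w) or B(w)) == P(A) + P(B) - P(lambda w: A(w) and B(w))
```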
Multiplication Theorem
P(∩_{k=1}^n A_k) = P(A_1) P(A_2|A_1) ⋯ P(A_n|A_1 ∩ A_2 ∩ ⋯ ∩ A_{n−1})
if independent:
P(∩_{k=1}^n A_k) = ∏_{k=1}^n P(A_k)
P(A|B) = P(A ∩ B)/P(B); if A and B are independent, P(A|B) = P(A)
Partition
{H_k} is a partition of Ω means:
H_i ∩ H_j = ∅ ∀i ≠ j
∪_{k=1}^n H_k = Ω
Total Probability
P(E) = Σ_k P(H_k) P(E|H_k)
Bayes' Theorem
If Hk partitions Ω then
P(H_k|E) = P(E|H_k) P(H_k) / Σ_j P(E|H_j) P(H_j)
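A worked example of total probability plus Bayes' theorem, with hypothetical numbers (prior 1/100 for the condition, hit rate 95/100, false-positive rate 5/100), computed in exact fractions:

```python
from fractions import Fraction

# Hypothetical two-hypothesis partition: H1 = "has condition", H2 = "does not".
P_H = [Fraction(1, 100), Fraction(99, 100)]          # priors P(H_k)
P_E_given_H = [Fraction(95, 100), Fraction(5, 100)]  # P(E|H_k), E = positive test

# Total probability: P(E) = Σ_k P(E|H_k) P(H_k)
P_E = sum(l * p for l, p in zip(P_E_given_H, P_H))

# Bayes: P(H1|E) = P(E|H1) P(H1) / P(E)
posterior = P_E_given_H[0] * P_H[0] / P_E
print(posterior)  # 19/118
```

Note how the posterior (about 0.16) is far below the hit rate 0.95 because the prior is small.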
Binomial Theorem
(p + q)^n = Σ_{k=0}^n C(n, k) p^k q^{n−k}
Σ_{j=1}^∞ a^j = a/(1 − a), |a| < 1
S_N = Σ_{j=1}^N a^j
S_N = a + a² + ⋯ + a^N — ①
aS_N = a² + ⋯ + a^{N+1} — ②
Subtracting ② from ① gives (1 − a)S_N = a − a^{N+1}, so
S_N = (a − a^{N+1})/(1 − a)
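A quick numeric check of the partial-sum formula against a direct sum (a = 0.5, N = 20 are arbitrary):

```python
# Verify S_N = (a − a^(N+1)) / (1 − a) against the direct partial sum.
a, N = 0.5, 20
S_direct = sum(a**j for j in range(1, N + 1))
S_formula = (a - a**(N + 1)) / (1 - a)
assert abs(S_direct - S_formula) < 1e-12

# As N grows the partial sum approaches the infinite-series limit a / (1 − a).
assert abs(S_formula - a / (1 - a)) < 1e-5
```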
Σ_{k=1}^∞ k a^k = a/(1 − a)²
lim_{n→∞} (1 + x/n)^n = e^x
| Number of Outcomes | With Replacement | Without Replacement |
|---|---|---|
| 2 | Binomial (different when sampling until*) | Hypergeometric |
| ≥ 3 | Multinomial | Multivariate Hypergeometric |
until*
- 1st success → geometric
- rth success → negative binomial
Poisson Distribution
P(X = x) = e^{−λ} λ^x / x!, x = 0, 1, 2, ⋯
Binomial(n, p) →_d Poisson(λ) if n ≫ 1, p ≪ 1, and λ = np
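The binomial-to-Poisson approximation can be checked numerically; n = 1000 and p = 0.003 (so λ = 3) are arbitrary values in the n ≫ 1, p ≪ 1 regime:

```python
from math import comb, exp, factorial

# Binomial(n, p) vs its Poisson(λ = np) approximation.
n, p = 1000, 0.003
lam = n * p  # λ = 3

def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k):
    return exp(-lam) * lam**k / factorial(k)

# The two pmfs agree closely for small k.
for k in range(6):
    assert abs(binom_pmf(k) - poisson_pmf(k)) < 1e-2
```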
Continuous
If X ∼ N(0, 1), then Z = X² ∼ χ²(1)
Beta(α, β): 0 < x < 1
Uniform(a, b): a < x < b
Gamma γ(α, θ):
f(x) = x^{α−1} e^{−x/θ} / (Γ(α) θ^α), x > 0
Exponential(θ) = γ(α = 1, θ)
Chi-squared(ν) = γ(α = ν/2, θ = 2)
X ∼ N(μ, σ_x²): f(x) = (1/(√(2π) σ_x)) e^{−(x−μ)²/(2σ_x²)}
Y = g(X)
f_Y(y) = Σ_k f_X(x_k) |dx/dy| @ x = x_k, where the x_k are the roots of g(x) = y
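A sketch applying the change-of-variable formula to the earlier example X ∼ N(0, 1), Y = X²: both roots x = ±√y contribute, giving f_Y(y) = 2 φ(√y) · (1/(2√y)) = φ(√y)/√y, the χ²(1) density. A Monte Carlo histogram bin checks it (bin location and width are arbitrary):

```python
import random
from math import sqrt, pi, exp

random.seed(4)

def f_Y(y):
    """χ²(1) density from the change-of-variable formula: φ(√y)/√y."""
    phi = exp(-y / 2) / sqrt(2 * pi)  # standard normal density at √y, since (√y)² = y
    return phi / sqrt(y)

# Empirical probability of Y landing in a small bin vs density × width.
ys = [random.gauss(0, 1) ** 2 for _ in range(200_000)]
width = 0.1
p_bin = sum(1.0 <= y < 1.0 + width for y in ys) / len(ys)
assert abs(p_bin - f_Y(1.05) * width) < 0.005
```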
Moments
E[aX+b]=aE[X]+b
V[aX + b] = a² V[X]
X ∼ γ(α, θ): E[X^k] = Γ(α + k) θ^k / Γ(α)
Γ(α + 1) = α Γ(α), Γ(1) = 1, Γ(1/2) = √π
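The gamma-function identities above can be confirmed with the standard library's `math.gamma` (α = 4.3 is an arbitrary test value):

```python
from math import gamma, sqrt, pi, isclose

# Γ(α + 1) = α Γ(α)
alpha = 4.3
assert isclose(gamma(alpha + 1), alpha * gamma(alpha))

# Γ(1) = 1 and Γ(1/2) = √π
assert gamma(1) == 1.0
assert isclose(gamma(0.5), sqrt(pi))
```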
Uncertainty Principle (Cauchy-Schwarz inequality)
σ_xy² ≤ σ_x² σ_y²
Covariance
σ_xy = E[XY] − E[X]E[Y], where E[XY] is the correlation.
ρ_xy = σ_xy/(σ_x σ_y), −1 ≤ ρ_xy ≤ 1
Weak Law of Large Numbers
- Expectation of the Sample Mean: E[X̄_n] = μ_x
- Variance of the Sample Mean: V[X̄_n] = σ_x²/n
The sample mean θ̂_n converges (in mean square) to the population parameter θ:
lim_{n→∞} E[(θ̂_n − θ)²] = 0
lim_{n→∞} (V[θ̂_n] + (E[θ̂_n] − θ)²) = 0
lim_{n→∞} (σ_x²/n + (μ_x − μ_x)²) = 0
∴ θ̂_n → θ
∀ε > 0, lim_{n→∞} P(|θ̂_n − θ| > ε) = 0
lim_{n→∞} P(|X̄_n − μ_x| > ε) ≤ lim_{n→∞} σ_x²/(nε²) = 0
Markov's inequality: for X ≥ 0, c ∈ ℝ⁺, E[X] < ∞:
P(X ≥ c) ≤ E[X]/c
Chebyshev's inequality: for σ_x² < ∞:
P(|X − μ_x| > ε) ≤ σ_x²/ε²
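Both inequalities can be illustrated empirically; here with Exponential(mean 1) samples (X ≥ 0, E[X] = 1, Var[X] = 1), and arbitrary c and ε:

```python
import random

random.seed(0)
# X ~ Exponential with mean 1: nonnegative, E[X] = 1, Var[X] = 1.
xs = [random.expovariate(1.0) for _ in range(100_000)]
mean = sum(xs) / len(xs)

c, eps = 3.0, 2.0
p_tail = sum(x >= c for x in xs) / len(xs)              # estimate of P(X ≥ c)
p_dev = sum(abs(x - mean) > eps for x in xs) / len(xs)  # estimate of P(|X − μ| > ε)

assert p_tail <= mean / c     # Markov: P(X ≥ c) ≤ E[X]/c
assert p_dev <= 1.0 / eps**2  # Chebyshev with σ² = 1: P(|X − μ| > ε) ≤ σ²/ε²
```

The bounds are loose here: the true tail P(X ≥ 3) = e⁻³ ≈ 0.05, well under both 1/3 and 1/4.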
Standardizing RV
Subtract the mean and divide by the standard deviation: Z = (X − μ_x)/σ_x
Beta Binomial Conjugacy
Prior: h(θ) ∼ Beta(α, β)
Likelihood: g(x|θ) ∼ Binomial(n, θ), x successes observed
Posterior:
f(θ|x) = g(x|θ) h(θ) / ∫_θ g(x|θ) h(θ) dθ
∴ f(θ|x) ∼ Beta(α + x, β + n − x)
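A numeric sanity check of the conjugacy result: normalize the prior × likelihood on a grid and compare against the closed-form Beta(α + x, β + n − x) density. The values α = 2, β = 3, n = 10, x = 4 and the evaluation point θ = 0.37 are arbitrary:

```python
from math import gamma, comb, isclose

alpha, beta, n, x = 2.0, 3.0, 10, 4

def beta_pdf(t, a, b):
    """Beta(a, b) density via the gamma function."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * t**(a - 1) * (1 - t)**(b - 1)

def likelihood(t):
    """Binomial(n, t) probability of observing x successes."""
    return comb(n, x) * t**x * (1 - t)**(n - x)

# Numerically normalized posterior (crude Riemann sum for the evidence).
grid = [i / 1000 for i in range(1, 1000)]
unnorm = [likelihood(t) * beta_pdf(t, alpha, beta) for t in grid]
Z = sum(unnorm) / 1000

theta = 0.37
post_numeric = likelihood(theta) * beta_pdf(theta, alpha, beta) / Z
post_closed = beta_pdf(theta, alpha + x, beta + n - x)  # Beta(α+x, β+n−x)
assert isclose(post_numeric, post_closed, rel_tol=1e-2)
```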
Maximum Likelihood Estimation
f(D∣θ)
θ̂_ML = argmax_θ g(x|θ) = argmax_θ ln g(x|θ)
θ̂_ML = argmax_θ g(x_1, x_2, ⋯, x_n|θ)
= argmax_θ ∏_{k=1}^n g(x_k|θ) — i.i.d. / r.s.
= argmax_θ Σ_{k=1}^n ln g(x_k|θ)
∂L/∂θ |_{θ=θ̂_ML} = 0
∴ Check ∂²L/∂θ² |_{θ=θ̂_ML} < 0 (a maximum, not a minimum)
Invariance: the MLE of h(θ) is h(θ̂_ML)
Example: x_1, ⋯, x_n ∼ Geometric(p); find p̂_ML and σ̂²_ML.
1/p̂ = x̄_n ⇒ p̂_ML = 1/x̄_n
By invariance: σ̂²_ML = q̂/p̂² = (1 − p̂_ML)/p̂²_ML = (1 − 1/x̄_n) x̄_n² = x̄_n² − x̄_n
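A simulation check of the geometric MLE: draw samples with a known p, then recover it via p̂_ML = 1/x̄_n (p = 0.3 and the sample size are arbitrary):

```python
import random

random.seed(1)
p_true = 0.3

def geometric(p):
    """Number of Bernoulli(p) trials until the first success (support 1, 2, ...)."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

xs = [geometric(p_true) for _ in range(50_000)]
xbar = sum(xs) / len(xs)

p_hat = 1 / xbar                   # p̂_ML = 1/x̄_n
var_hat = (1 - p_hat) / p_hat**2   # invariance: σ̂²_ML = (1 − p̂)/p̂²
assert abs(p_hat - p_true) < 0.01
```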
SIT Technique
E_X[X] = E_Y[E[X|Y]] (law of iterated expectation)
E[g(X, Y, Z)] = E_Z[E_{Y|Z}[E_{X|Y,Z}[g(X, Y, Z) | Y, Z]]]
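The iterated-expectation identity on a made-up two-stage experiment: roll a die Y, then draw X uniformly from {1, …, Y}. The inner expectation is E[X|Y=y] = (y+1)/2, so E[X] = E[(Y+1)/2] = (3.5+1)/2 = 2.25, which simulation confirms:

```python
import random

random.seed(2)
# Two-stage experiment: Y ~ Uniform{1..6}, then X | Y=y ~ Uniform{1..y}.
N = 200_000
total = 0.0
for _ in range(N):
    y = random.randint(1, 6)
    x = random.randint(1, y)
    total += x

# Tower property: E[X] = E_Y[E[X|Y]] = E[(Y+1)/2] = (E[Y]+1)/2 = 2.25
assert abs(total / N - 2.25) < 0.05
```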
Central Limit Theorem
If x_1, ⋯, x_n are i.i.d. with σ_x² < ∞, then:
std(X̄_n) →_d Z ∼ N(0, 1)
std(Σ_{k=1}^n X_k) →_d Z ∼ N(0, 1)
std(X̄_n) = (X̄_n − μ_x)/(σ_x/√n)
std(Σ_{k=1}^n X_k) = (Σ_{k=1}^n X_k − nμ_x)/(√n σ_x)
V[Σ_{k=1}^n X_k] = nσ_x² = n² × (σ_x²/n)
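A simulation sketch of the CLT: standardize the sum of n i.i.d. Uniform(0, 1) draws (μ = 1/2, σ² = 1/12) and check that the result looks standard normal (n = 30 and the repetition count are arbitrary):

```python
import random
from math import sqrt

random.seed(3)
n, reps = 30, 20_000
mu, sigma = 0.5, sqrt(1 / 12)  # mean and std of Uniform(0, 1)

def standardized_sum():
    """(Σ X_k − nμ) / (√n σ) for n i.i.d. Uniform(0, 1) draws."""
    s = sum(random.random() for _ in range(n))
    return (s - n * mu) / (sqrt(n) * sigma)

zs = [standardized_sum() for _ in range(reps)]
mean_z = sum(zs) / reps
var_z = sum(z**2 for z in zs) / reps - mean_z**2

assert abs(mean_z) < 0.05     # ≈ 0 for N(0, 1)
assert abs(var_z - 1) < 0.05  # ≈ 1 for N(0, 1)
# Fraction inside ±1.96 should be close to 0.95.
frac = sum(abs(z) < 1.96 for z in zs) / reps
assert abs(frac - 0.95) < 0.02
```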