Bernoulli Distribution

Used to represent the distribution of a binary outcome. For example, tossing a coin.

  1. $y = 1$ ('heads'): $P(y = 1) = \theta$
  2. $y = 0$ ('tails'): $P(y = 0) = 1 - \theta$

The PMF is defined as: $P(y | \theta) = \theta$ if $y = 1$, and $P(y | \theta) = 1 - \theta$ if $y = 0$.
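A minimal sketch of this PMF in Python (the helper name `bernoulli_pmf` and the choice $\theta = 0.7$ are illustrative assumptions, not from the notes):

```python
import numpy as np

def bernoulli_pmf(y, theta):
    """P(y | theta) = theta if y == 1, else 1 - theta."""
    return theta if y == 1 else 1 - theta

theta = 0.7  # assumed probability of 'heads'
print(bernoulli_pmf(1, theta))  # 0.7
print(bernoulli_pmf(0, theta))  # 0.3

# Sampling coin tosses: each draw is 1 ('heads') with probability theta
rng = np.random.default_rng(0)
print(rng.binomial(n=1, p=theta, size=10))
```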

Binomial Distributions

Used to represent the distribution of a repeated binary outcome.

Bernoulli is a special case of a binomial distribution.

The PMF is defined as: $P(S | N, \theta) = \frac{N!}{S!(N-S)!}\theta^S(1-\theta)^{N-S}$
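A short sketch of that formula (the function name `binomial_pmf` is an assumption; the example uses a fair coin, $\theta = 0.5$):

```python
from math import comb

def binomial_pmf(S, N, theta):
    """P(S | N, theta) = C(N, S) * theta^S * (1 - theta)^(N - S)."""
    return comb(N, S) * theta**S * (1 - theta)**(N - S)

# Probability of exactly 7 heads in 10 tosses of a fair coin
print(binomial_pmf(7, 10, 0.5))  # ~0.117

# The PMF sums to 1 over S = 0..N
print(sum(binomial_pmf(s, 10, 0.5) for s in range(11)))  # 1.0
```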

Categorical Distribution (aka Multinoulli)

Generalizes the Bernoulli distribution to more than two possible outcomes, e.g., rolling a C-sided die where C > 2.
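A minimal sketch of drawing from a categorical distribution, assuming a 3-sided die with probability vector $\theta$ (the specific values are illustrative):

```python
import numpy as np

# Assumed probabilities for a 3-sided die (must sum to 1)
theta = np.array([0.2, 0.3, 0.5])

rng = np.random.default_rng(0)
# Each draw picks one of the C categories with probability theta[c]
rolls = rng.choice(len(theta), size=10, p=theta)
print(rolls)

# The PMF of a single outcome y is just theta[y]
y = 2
print(theta[y])  # P(y = 2) = 0.5
```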

Multinomial Distribution

Generalizes the categorical distribution to N > 1 trials. Suppose we roll a C-sided die N times, and let S be the C-dimensional vector that keeps track of how many times each side comes up. What is the probability of getting a particular vector S?

  • The distribution of S is given by the multinomial distribution: $S \sim Mu(S | N, \theta) = \frac{N!}{S_1! S_2! \cdots S_C!} * \theta_1^{S_1} \cdots \theta_C^{S_C}$

NOTE: Theta here is a vector.
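A sketch of that PMF, assuming a fair 3-sided die rolled N = 6 times (the function name `multinomial_pmf` and the example counts are assumptions):

```python
from math import factorial, prod
import numpy as np

def multinomial_pmf(S, theta):
    """Mu(S | N, theta) = N! / (S_1! ... S_C!) * prod_c theta_c^{S_c}."""
    N = sum(S)
    coeff = factorial(N) // prod(factorial(s) for s in S)
    return coeff * prod(t**s for t, s in zip(theta, S))

# Probability that each side of a fair 3-sided die comes up twice in 6 rolls
theta = [1/3, 1/3, 1/3]
S = [2, 2, 2]
print(multinomial_pmf(S, theta))  # ~0.123

# Sampling count vectors S directly
rng = np.random.default_rng(0)
print(rng.multinomial(n=6, pvals=theta))  # e.g. [1 3 2]
```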

Gaussian Distribution (aka Normal)

It is the most commonly used distribution in statistics and machine learning. Its Probability Density Function (PDF) is given by:

$\mathcal{N}(y|\mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} * e^{-\frac{1}{2\sigma^2}(y-\mu)^2}$

Where:

  1. $\mu$ = Mean
  2. $\sigma^2$ = Variance of the distribution
  3. $\frac{1}{\sqrt{2\pi\sigma^2}}$ = the normalization constant that ensures the density integrates to 1
  • SPECIAL CASE: If $\mu = 0$ and $\sigma^2 = 1$ (so the standard deviation is 1), it is the Standard Gaussian Distribution. The larger the variance, the wider the curve; a narrow curve indicates the data are concentrated in a small range of values.

You don't need to memorize the formula, but you should understand how it behaves.
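A minimal sketch of the density formula, showing how a larger variance flattens and widens the curve (the function name `gaussian_pdf` and the sample points are assumptions):

```python
import numpy as np

def gaussian_pdf(y, mu, sigma2):
    """N(y | mu, sigma^2) = 1 / sqrt(2*pi*sigma^2) * exp(-(y - mu)^2 / (2*sigma^2))."""
    return np.exp(-(y - mu)**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

# Standard Gaussian: mu = 0, sigma^2 = 1
y = np.linspace(-4, 4, 5)
print(gaussian_pdf(y, mu=0.0, sigma2=1.0))

# Larger variance -> wider, flatter curve (lower peak at the mean)
print(gaussian_pdf(0.0, mu=0.0, sigma2=1.0))  # ~0.399
print(gaussian_pdf(0.0, mu=0.0, sigma2=4.0))  # ~0.199
```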

Multivariate Gaussian Distribution

Covariance

It measures the degree to which two random variables are linearly related. $\mathrm{Cov}[x, y] = \mathbb{E}[(x - \mathbb{E}[x]) * (y - \mathbb{E}[y])] = \mathbb{E}[xy] - \mathbb{E}[x] * \mathbb{E}[y]$
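A quick sketch of the second identity, using synthetic data where y is linearly related to x (the specific data-generating choice is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2 * x + rng.normal(size=1000)   # y is linearly related to x

# Cov[x, y] = E[xy] - E[x]E[y]
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)
print(cov_xy)                          # close to 2
print(np.cov(x, y, bias=True)[0, 1])   # same value from numpy
```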

Correlation

A normalized measure of covariance: $\mathrm{corr}[x, y] = \frac{\mathrm{Cov}[x, y]}{\sigma_x \sigma_y}$, which always lies in $[-1, 1]$.
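A sketch of that normalization, reusing the same kind of synthetic data as above (again an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2 * x + rng.normal(size=1000)

# Correlation = covariance scaled by the standard deviations, so it lies in [-1, 1]
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)
corr_xy = cov_xy / (np.std(x) * np.std(y))
print(corr_xy)
print(np.corrcoef(x, y)[0, 1])  # numpy's built-in, should match
```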