Entropy

Entropy is a measure of the uncertainty in a probability distribution.

Formula: $\mathbb{H}(X) = -\sum_{k=1}^{K} P(X=k)\log P(X=k)$
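
As a quick sanity check, here is a minimal NumPy sketch of the formula above (the helper name `entropy` and the choice of natural log, i.e. nats, are mine):

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(X) = -sum_k p_k log p_k (in nats)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # skip zero-probability outcomes (0 log 0 := 0)
    return -np.sum(p * np.log(p))

print(entropy([0.5, 0.5]))  # fair coin: log 2 ≈ 0.6931, maximal for two outcomes
print(entropy([0.9, 0.1]))  # ≈ 0.3251, a less uncertain distribution
```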

Cross Entropy

The cross entropy of two distributions $p$ and $q$ is given by: $\mathbb{H}(p, q) = -\sum_{k=1}^{K} p_k \log(q_k)$

Cross entropy is used as the log loss in binary classification.
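
A minimal sketch of cross entropy used as a log loss (the function name, the `eps` guard against $\log(0)$, and the example values are illustrative assumptions):

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_k p_k log q_k; eps guards against log(0)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q + eps))

# Binary log loss: p is the one-hot true label, q the predicted probabilities
y_true = [1.0, 0.0]  # true class is class 0
y_pred = [0.8, 0.2]  # model's predicted distribution
print(cross_entropy(y_true, y_pred))  # -log(0.8) ≈ 0.2231
```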

Generative Classifier

$P(y = c|x, \theta) \propto P(x|y = c, \theta)\,P(y = c|\theta)$

NOTE: $\propto$ means "is proportional to"

Difference between Discriminative and Generative Classifiers

  • A discriminative classifier finds a boundary that separates the two classes.
  • A generative classifier models the data in each class and computes the probability of a test instance belonging to each class.

Formula: $P(y = c|x, \theta) = \frac{P(x|y=c, \theta) * P(y=c|\theta)}{\sum_{c'}P(x|y=c', \theta)*P(y=c'|\theta)}$

  1. $P(y=c|x, \theta) = $ Posterior Probability
  2. $P(x|y=c, \theta) = $ Class-conditional Density
  3. $P(y = c|\theta) = $ Prior Probability
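
A small numeric sketch of this normalization for a two-class problem (the likelihood and prior values are made up for illustration):

```python
import numpy as np

likelihood = np.array([0.05, 0.20])  # P(x | y=c, theta) for c = 0, 1
prior      = np.array([0.70, 0.30])  # P(y = c | theta)

joint = likelihood * prior           # numerator: P(x|y=c) * P(y=c)
posterior = joint / joint.sum()      # divide by the sum over c'
print(posterior)                     # [0.3684..., 0.6315...]
```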

Naive Bayes Classifier (NBC)

It’s a generative classifier that assumes the input features are conditionally independent within each class. The class-conditional density then becomes: $P(x|y=c, \theta) = \prod_{d=1}^{D} P(x_d | y=c, \theta)$
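
A sketch of this product under an assumed Bernoulli model for binary features, with made-up parameters `theta[c, d]` $= P(x_d = 1 | y = c)$:

```python
import numpy as np

theta = np.array([[0.8, 0.1, 0.4],   # P(x_d = 1 | y = 0)
                  [0.2, 0.7, 0.6]])  # P(x_d = 1 | y = 1)
prior = np.array([0.5, 0.5])

def class_conditional(x, c):
    """P(x | y=c) = prod_d P(x_d | y=c), using the independence assumption."""
    probs = np.where(x == 1, theta[c], 1.0 - theta[c])
    return probs.prod()

x = np.array([1, 0, 1])
joint = np.array([class_conditional(x, c) * prior[c] for c in range(2)])
print(joint / joint.sum())  # posterior over classes: [0.888..., 0.111...]
```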

Maximum Likelihood Estimate (MLE) for Naive Bayes Classifiers

The MLE for the class prior is given by: $\hat{\pi}_c = \frac{N_c}{N}$

  1. $N_c =$ number of examples in class $c$
  2. $N =$ total number of examples in the dataset
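
This estimate is just counting, as the following sketch shows (the label vector is a hypothetical example):

```python
import numpy as np

y = np.array([0, 1, 1, 0, 1, 1])     # labels for N = 6 examples
classes, counts = np.unique(y, return_counts=True)
pi_hat = counts / len(y)             # pi_c = N_c / N
print(dict(zip(classes, pi_hat)))    # {0: 0.333..., 1: 0.666...}
```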