Entropy
Entropy is a measure of the uncertainty in a probability distribution.
Formula: $\mathbb{H}(X) = -\sum_{k=1}^{K} P(X=k)\log P(X=k)$
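A minimal sketch of this formula in Python (the function name and probability vectors are illustrative), assuming the distribution is given as a vector of probabilities and using natural logs:

```python
import numpy as np

def entropy(p):
    """Entropy of a discrete distribution given as a probability vector (in nats)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # convention: 0 * log 0 = 0
    return -np.sum(p * np.log(p))

print(entropy([0.5, 0.5]))  # ~0.693 nats: maximal uncertainty for two outcomes
print(entropy([1.0, 0.0]))  # 0.0: no uncertainty
```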
Cross-Entropy
The cross-entropy of two distributions $p$ and $q$ is given by: $\mathbb{H}(p,q) = -\sum_{k=1}^{K} p_k \log(q_k)$
Cross-entropy is used as the log loss in binary classification.
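A minimal sketch, assuming $p$ and $q$ are probability vectors over the same $K$ outcomes; the binary log-loss example below treats $p$ as a one-hot true label:

```python
import numpy as np

def cross_entropy(p, q):
    """Cross-entropy H(p, q) of two discrete distributions (in nats)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q))

# Binary log loss: p is the one-hot true label, q the predicted probabilities.
y_pred = 0.9                                        # predicted P(y = 1)
print(cross_entropy([0, 1], [1 - y_pred, y_pred]))  # ~0.105
```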
Generative Classifier
$P(y = c|x, \theta) \propto P(x|y = c, \theta)P(y = c|\theta)$
NOTE: $\propto$ means "proportional to".
Difference between Discriminative and Generative Classifiers
- A discriminative classifier finds a boundary that separates the two classes.
- A generative classifier models the data in each class and computes the probability of a test instance belonging to each class.
Formula: $P(y = c|x, \theta) = \frac{P(x|y=c, \theta)\,P(y=c|\theta)}{\sum_{c'}P(x|y=c', \theta)\,P(y=c'|\theta)}$
- $P(y=c|x, \theta) = $ Posterior Probability
- $P(x|y=c, \theta) = $ Class-conditional Density
- $P(y = c|\theta) = $ Prior Probability
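A minimal numeric sketch of this computation; the likelihood and prior values below are made up for illustration:

```python
import numpy as np

# Hypothetical values: P(x|y=c, theta) and P(y=c|theta) for classes c = 0, 1.
likelihoods = np.array([0.02, 0.10])  # class-conditional densities at x
priors      = np.array([0.7, 0.3])    # class priors

joint = likelihoods * priors          # numerator of Bayes' rule per class
posterior = joint / joint.sum()       # normalize over c'
print(posterior)                      # [0.318..., 0.681...]
```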
Naive Bayes Classifier (NBC)
It’s a generative classifier that assumes the input features are conditionally independent within each class. The class-conditional density then factorizes as: $P(x|y=c, \theta) = \prod_{d=1}^D P(x_d|y=c, \theta)$
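A minimal sketch of this factorized density, assuming Gaussian per-feature densities (one common choice); the `mu` and `sigma` values below are illustrative, and in practice log-densities are summed instead of multiplied to avoid underflow:

```python
import numpy as np
from scipy.stats import norm

def class_conditional(x, mu, sigma):
    """P(x|y=c, theta) as a product of per-feature Gaussian densities."""
    return np.prod(norm.pdf(x, loc=mu, scale=sigma))

x = np.array([1.0, 2.0])
# Illustrative per-feature means and standard deviations for one class:
print(class_conditional(x, mu=np.array([0.0, 2.5]), sigma=np.array([1.0, 0.5])))
```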
Maximum Likelihood Estimate (MLE) for Naive Bayes Classifiers
The MLE for the class prior is given by: $\hat{\pi}_c = \frac{N_c}{N}$
- $N_c =$ number of examples in class $c$
- $N =$ total number of examples in the dataset
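A minimal sketch of this estimate; the label array below is made up for illustration:

```python
import numpy as np

y = np.array([0, 0, 1, 2, 1, 0])            # made-up class labels
classes, counts = np.unique(y, return_counts=True)
priors = counts / len(y)                     # pi_c = N_c / N
print(dict(zip(classes.tolist(), priors)))   # {0: 0.5, 1: 0.333..., 2: 0.166...}
```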