What is Unsupervised Learning?

It is a learning model that is trained ONLY on the inputs — no output labels are given.

REMEMBER: Supervised learning uses input-output pairs to fit the model's parameters.

  • Ex of Unsupervised Learning: Music recommendations made by clustering users that have similar music taste.

What is Clustering?

No predefined classes exist to determine the data's groups. Therefore, separating the data into their respective classes/groups is a challenge.

Differences between Clustering and Classification:

  • Clustering: No labels (unsupervised); determining the # of clusters is part of the task.
  • Classification: Labels/classes (supervised); the training data determines the # of classes.
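As a concrete illustration of clustering without labels, here is a minimal sketch of k-means written from scratch (k-means itself is not named in the notes; it is used here as one standard clustering algorithm). The data points and the choice of k=2 are made up for the example.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Minimal k-means: assign points to the nearest centroid, then recompute centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]  # pick k random points as initial centroids
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance)
        labels = np.argmin(((X[:, None] - centroids) ** 2).sum(axis=2), axis=1)
        # Recompute each centroid as the mean of its assigned points
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Two obvious groups of points (made-up data)
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
labels, centroids = kmeans(X, k=2)
```

Note that the algorithm is never told which group a point belongs to — it discovers the two clusters from the inputs alone.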

Dimensionality Reduction w/ Principal Components Analysis (PCA):

Linearly transforms the input features into uncorrelated components, w/ the objective of maximizing the variance.

OG Dataset -> PCA -> Dataset represented w/ components -> Choose the PCs that explain a certain variability in the data -> Apply the ML Algo

  • PCA is a “feature extraction” method, because the original input features are transformed rather than kept in their original form.
  • A “feature selection” method selects a subset of input features that explain a certain percentage of variability.
  • The input features that are not selected are eliminated (feature elimination).
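The PCA pipeline above can be sketched with NumPy alone. The two-feature dataset, the correlation between its features, and the 95% variability threshold are all made up for the example; only the steps (center, decompose, pick PCs by explained variance, project) follow the notes.

```python
import numpy as np

# Hypothetical dataset: feature 2 is almost a scaled copy of feature 1 (highly correlated)
rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=100)])

# Center the data, then eigendecompose the covariance matrix
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]           # sort PCs by variance, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()         # fraction of variance each PC explains
# Keep the PCs that explain (e.g.) 95% of the variability in the data
k = np.searchsorted(np.cumsum(explained), 0.95) + 1
X_reduced = Xc @ eigvecs[:, :k]             # project onto the chosen PCs
```

Because the two features are strongly correlated, the first PC captures almost all the variance and a single component survives — the dataset drops from 2 features to 1 before the ML algo is applied.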

REMEMBER: Having a lot of input features does not necessarily mean they are all useful!

Input/Output Types

  1. Binary (Categorical w/ two possible values)
  2. Categorical (W/ more than 2 possible values)
  3. Real-valued (continuous)

Examples of each type:

  • Real-valued: 7, 9, -3.3, 10001, …
  • Binary: {0,1}, {yes,no}, {pass,fail}, …
  • Categorical: {‘Panda’, ‘Cat’, ‘Lion’}, {‘Sedan’, ‘Sport’, ‘Truck’}, …

We cannot use single integers to represent classes because the numbers have a natural order that the classes do not; categorical values are therefore commonly one-hot encoded instead.

Two Interpretations of Probability

  1. Frequentist: Probabilities represent long-run frequencies of events.
  • Ex: If we flip a coin many times, we expect half of those flips to land on heads.
  2. Bayesian: Probability is used to quantify our uncertainty about something.
  • Ex: The probability a coin lands on heads is 0.5, meaning we believe it is equally likely to land on heads or tails on the next toss.
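The frequentist coin-flip example can be checked with a quick simulation (the number of flips and the seed are arbitrary choices for the sketch):

```python
import random

random.seed(42)  # fixed seed so the run is reproducible
n = 100_000
heads = sum(random.random() < 0.5 for _ in range(n))  # each flip lands heads w/ probability 0.5
freq = heads / n
# The long-run frequency of heads settles near the probability 0.5
```

As n grows, `freq` converges to 0.5 — the long-run frequency matches the probability, which is exactly the frequentist interpretation.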