Supervised Learning continued…
- $x \in X$ (capital $X$ is the input space)
- $y \in Y$ (capital $Y$ is the output space)
Basically, the input space is the set of all possible inputs, and the output space is the set of all possible outputs.
- $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$ is the set of all input-output pairs.
$N$ is the dataset size, and $\mathcal{D}$ is written as a script D.
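
To make the notation concrete, here is a minimal sketch of a dataset stored as a list of input-output pairs. The numbers are made up for illustration only; the names `D`, `N`, `inputs`, and `outputs` are just chosen to mirror the notation in these notes.

```python
# A tiny made-up dataset: each entry is one (x_n, y_n) pair.
D = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]

# N is the dataset size (the number of pairs).
N = len(D)

# Separate the inputs and the outputs.
inputs = [x_n for x_n, _ in D]
outputs = [y_n for _, y_n in D]

# Walk through the pairs with n = 1, ..., N.
for n, (x_n, y_n) in enumerate(D, start=1):
    print(f"n={n}: x_n={x_n}, y_n={y_n}")
```
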
REMEMBER: there are two types of supervised learning models:
- Classification - Supervised ML model where the outputs ${y_n}$ are categorical. EX: classifying hand-written digits into classes 0-9.
- Regression - Supervised ML model where the outputs ${y_n}$ are continuous (real-valued numbers).
- Example of model parameters: $f(x) = ax^2 + bx + c$, where $a, b, c$ are the parameters and $x$ is the input. If $a = 3$, $b = 5$, $c = -4$, and $x = 5$: $f(5) = 3(5^2) + 5(5) - 4 = 75 + 25 - 4 = 96$.
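
The quadratic example can be checked with a short Python sketch. The function name `f` and the keyword arguments simply mirror the notation above; this is one possible way to write it, not a prescribed implementation.

```python
def f(x, a, b, c):
    """Quadratic model a*x^2 + b*x + c, where a, b, c are the parameters."""
    return a * x**2 + b * x + c

# Same values as in the notes: a = 3, b = 5, c = -4, x = 5.
result = f(5, a=3, b=5, c=-4)
print(result)  # 96
```

Changing the parameters changes the function: with the same input, `f(5, a=1, b=0, c=0)` gives 25 instead.
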
Example w/ 3-dimensional input:
enter notes here
What is classification?
It is a supervised learning model where the outputs ${y_n}$ are categorical.
- EX: Hand-written digits (MNIST dataset) where the dataset has 10 different classes (0-9).
NOTE: the input for MNIST is a 28 $\times$ 28 matrix of grayscale (black-and-white) pixels, and the value at each pixel position is used to predict the class (0-9).
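
As a rough sketch of what one MNIST-style input looks like, the snippet below builds a blank 28 x 28 image by hand. It does not load the real dataset: the pixel values are placeholders, and flattening the matrix into one long list is just one common way to pass it to a model, assumed here for illustration.

```python
ROWS, COLS = 28, 28

# Placeholder image: a 28 x 28 matrix of pixel intensities, all 0 (black).
image = [[0 for _ in range(COLS)] for _ in range(ROWS)]

# Light up a single pixel near the center (255 = white).
image[14][14] = 255

# Flatten the matrix row by row so every pixel position becomes one feature.
flat = [pixel for row in image for pixel in row]
num_features = len(flat)  # 28 * 28 = 784

# The output space: the ten digit classes 0-9.
classes = list(range(10))
```
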
- generalization = how well the model performs on data that it hasn’t seen during training.
MOST IMPORTANT NOTE: A model’s performance on data that it hasn’t seen before (its generalization) is the performance measure that determines whether the model is good or not.
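
A small sketch of why unseen data matters. Everything here is made up for illustration: the toy data (label 1 when x >= 5), the two hypothetical models, and the hand-picked threshold. The point is only that accuracy on training data can look perfect while accuracy on held-out data does not.

```python
# Toy labeled data (made up): the label is 1 when x >= 5, else 0.
train = [(0, 0), (2, 0), (4, 0), (6, 1), (8, 1)]
test = [(1, 0), (3, 0), (5, 1), (7, 1), (9, 1)]  # never used for training

def accuracy(predict, data):
    """Fraction of (x, y) pairs for which predict(x) equals y."""
    correct = sum(1 for x, y in data if predict(x) == y)
    return correct / len(data)

# Model A memorizes the training pairs and guesses 0 for any other input.
lookup = dict(train)

def memorizer(x):
    return lookup.get(x, 0)

# Model B applies a simple threshold rule (threshold chosen by hand here).
def threshold_rule(x):
    return 1 if x >= 5 else 0

train_acc_a = accuracy(memorizer, train)
test_acc_a = accuracy(memorizer, test)
train_acc_b = accuracy(threshold_rule, train)
test_acc_b = accuracy(threshold_rule, test)

print("memorizer:      train", train_acc_a, "test", test_acc_a)
print("threshold rule: train", train_acc_b, "test", test_acc_b)
```

Both models are perfect on the training pairs, but only the threshold rule stays accurate on the held-out pairs, so it is the one that generalizes.
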