What are individuals/subjects/objects?

They are units described by a set of data.

These units can be people, things, animals, companies, etc. They are actual nouns.

  • Examples of individuals/subjects/objects: Freshman students, companies, cells, etc.

What are variables?

They are characteristics of individuals/subjects/objects. A variable is either quantitative (numerical) or qualitative (categorical).

  • Quantitative: takes numerical values that can be measured or counted
    • Ex: age (in years), length (in m), number of children, etc.
  • Qualitative: each value is assigned to a specific category
    • Ex: religion (Christian, Muslim, Buddhist, etc.), gender (male, female), blood type (A, B, AB, O), whether you like somebody or not (yes, no), etc.

Variables take different values (numerical or not) for different units.

  • Examples of variables: Gender, age, length, speed, height, religion, blood pressure, etc.

How many types do categorical variables have?

Categorical variables have two types:

  1. nominal - categories don't have a specific order
    • Ex: gender, religion, blood type, etc.
  2. ordinal - categories have a specific order (a hierarchy)
    • Ex: letter grade (A, B, C, D, F), year in college (freshman, sophomore, junior, senior), etc.

To summarize a categorical variable, we use a bar graph or a pie chart.
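Both a bar graph and a pie chart are built from the count (or proportion) of units in each category. The sketch below tabulates those counts for a made-up blood type sample; the data and variable names are illustrative assumptions, not taken from the notes.

```python
from collections import Counter

# Hypothetical sample: blood types of ten individuals
blood_types = ["A", "O", "B", "O", "A", "AB", "O", "A", "O", "B"]

# Count of units in each category (the bar heights of a bar graph)
counts = Counter(blood_types)

# Proportion of units in each category (the slice sizes of a pie chart)
n = len(blood_types)
proportions = {category: count / n for category, count in counts.items()}

for category in sorted(counts):
    print(f"{category}: {counts[category]} ({proportions[category]:.0%})")
```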

What are measures of location?

They describe the center of the data, or some other location within it.

  • Sample mean is denoted using $ \bar{x} $ (x bar)
  • Sample median is denoted using $ \tilde{x} $ (x tilde)

What are the three quartiles?

  1. First quartile (25th percentile)
  2. Second quartile (50th percentile)
  3. Third quartile (75th percentile)

How to obtain the sample mean of data x1, …, xn?

It is given by the formula:

$ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i $
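As a minimal sketch, the sample mean is the sum of the observations divided by the number of observations n. The function name `sample_mean` and the ages below are made up for illustration.

```python
def sample_mean(data):
    """Return the sum of the observations divided by their number n."""
    if not data:
        raise ValueError("data must contain at least one observation")
    return sum(data) / len(data)


ages = [18, 19, 19, 20, 24]  # hypothetical ages in years
print(sample_mean(ages))     # 100 / 5 = 20.0
```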

How to find the sample median?

You first must order the data in increasing order.

  • Ex: x(1), x(2), … x(n)

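x(1) is the min and x(n) is the max.

The formula to calculate the middle point ($ \tilde{x} $):

  • If n is odd: $ \tilde{x} = x_{((n+1)/2)} $
  • If n is even: $ \tilde{x} = \frac{1}{2} \left( x_{(n/2)} + x_{(n/2+1)} \right) $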
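The procedure can be sketched as follows: sort the data, then take the middle value when n is odd, or the average of the two middle values when n is even. The function name `sample_median` is an illustrative choice. Python lists are indexed from 0, so the ordered value x(k) is `ordered[k - 1]`.

```python
def sample_median(data):
    """Return the middle point of the data after sorting it in increasing order."""
    ordered = sorted(data)  # x(1) <= x(2) <= ... <= x(n)
    n = len(ordered)
    if n == 0:
        raise ValueError("data must contain at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the single middle value x((n+1)/2)
        return ordered[mid]
    # n even: average of the two middle values x(n/2) and x(n/2 + 1)
    return (ordered[mid - 1] + ordered[mid]) / 2


print(sample_median([5, 1, 3]))     # 3
print(sample_median([4, 1, 3, 2]))  # 2.5
```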

What is the trimmed mean?

It discards the a% smallest and the a% largest observations, then calculates the sample mean (average) of the remaining data.
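A minimal sketch of the trimmed mean is shown below. The notes do not say how to handle the case where a% of n is not a whole number, so this version rounds the number of discarded observations down. That rounding rule, the function name `trimmed_mean`, and the sample data are assumptions.

```python
def trimmed_mean(data, a):
    """Mean of the data after dropping the a% smallest and a% largest values."""
    if not 0 <= a < 50:
        raise ValueError("a must satisfy 0 <= a < 50")
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("data must contain at least one observation")
    # Number of observations discarded from each end (assumption: round down)
    k = int(n * a / 100)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # hypothetical data with one outlier
print(trimmed_mean(values, 0))   # 14.5, the ordinary sample mean
print(trimmed_mean(values, 10))  # 5.5, after dropping 1 and 100
```

The comparison illustrates why trimming is used: the single large value pulls the ordinary mean up, while the 10% trimmed mean is unaffected by it.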

The three quartiles are found as follows:

  1. First quartile (25th percentile): Middle point of the first half of the data.
  2. Second quartile (50th percentile): The median ($ \tilde{x} $) of the data.
  3. Third quartile (75th percentile): Middle point of the second half of the data.
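The quartile rules can be sketched with a "median of each half" approach. When n is odd, the notes do not say whether the overall median belongs to either half. This sketch leaves it out of both halves, which is one common convention and an assumption here. The function names are illustrative.

```python
def middle_point(ordered):
    """Median of a list that is already sorted in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """Return (first, second, third) quartiles using the median-of-halves rule."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    first_half = ordered[:half]       # smallest n // 2 observations
    second_half = ordered[n - half:]  # largest n // 2 observations
    # Assumption: for odd n the overall median is excluded from both halves
    return middle_point(first_half), middle_point(ordered), middle_point(second_half)


print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))  # (2.5, 4.5, 6.5)
print(quartiles([1, 2, 3, 4, 5, 6, 7]))     # (2, 4, 6)
```

Statistical software often uses interpolation rules instead, so its quartiles can differ slightly from the ones computed this way.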

Meaning of Sample Variance

The sample variance is $ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 $, where:

  1. $ \bar{x} $ is the sample mean
  2. $ x_i - \bar{x} $ is the deviation of the ith observation from the sample mean
What does MAD mean?

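It stands for “Mean Absolute Deviation”: the average of the absolute deviations $ |x_i - \bar{x}| $ of the observations from the sample mean.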
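A minimal sketch of the mean absolute deviation follows. It assumes deviations are measured from the sample mean and that the sum is divided by n. Some texts measure deviations from the median instead, so check the definition used in your course. The function name and the data are illustrative.

```python
def mean_absolute_deviation(data):
    """Average absolute distance of the observations from the sample mean."""
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one observation")
    x_bar = sum(data) / n  # sample mean
    # Assumption: deviations are taken from the mean and averaged over n
    return sum(abs(x - x_bar) for x in data) / n


values = [2, 4, 4, 4, 5, 5, 7, 9]       # hypothetical data with sample mean 5
print(mean_absolute_deviation(values))  # 12 / 8 = 1.5
```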