Mean, Median, and Mode
Mean, median, and mode are all measures of central tendency.
- Mode refers to the most frequently occurring number in a dataset,
- Median is the middle number of an ordered dataset, and
- Mean is calculated by dividing the sum of the numbers by the number of numbers.
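As a quick illustration, all three measures can be computed with Python's standard `statistics` module. This is only a minimal sketch; the list `scores` is a made-up dataset, not one taken from this tutorial.

```python
import statistics

# Hypothetical dataset, chosen only for illustration
scores = [2, 4, 4, 5, 10]

mean_value = statistics.mean(scores)      # sum of the numbers / number of numbers
median_value = statistics.median(scores)  # middle number of the ordered data
mode_value = statistics.mode(scores)      # most frequently occurring number

print(mean_value, median_value, mode_value)
```

Note that the single large value (10) pulls the mean above the median, while the mode simply reports the number that repeats most often.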
Variance and Standard Deviation
Variance and standard deviation are measures of spread in a dataset. They measure how far the points deviate from the mean on average.
Variance is calculated using the following steps:
- Calculate the difference between each point and the mean
- Square the differences
- Sum the squares
- Divide the sum by the number of numbers
Standard deviation is the square root of variance.
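The steps above can be sketched in Python roughly as follows. The function names `population_variance` and `population_std_dev` and the list `points` are assumptions made for this illustration; they are not part of any library.

```python
import math

def population_variance(values):
    """Population variance, following the four steps listed above."""
    mean = sum(values) / len(values)
    differences = [x - mean for x in values]   # difference from the mean
    squares = [d ** 2 for d in differences]    # square the differences
    total = sum(squares)                       # sum the squares
    return total / len(values)                 # divide by the number of numbers

def population_std_dev(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(population_variance(values))

# Hypothetical dataset, chosen so the results are whole numbers
points = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(points))  # 4.0
print(population_std_dev(points))   # 2.0
```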
Example
Let’s illustrate with an example. Suppose we have the following dataset:
1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5
Median
There are 11 numbers in the ordered dataset, so the middle one is the (11 + 1) / 2 = 6th number.
Therefore, median
= 6th number
= 3
Mode
Mode = 3 (it occurs four times, more than any other number)
Mean
Mean
= (1 + 1 + 2 + 2 + 3 + 3 + 3 + 3 + 4 + 4 + 5) / 11
= 31 / 11
= 2.82
Variance
Sum of squared differences (Steps 1 to 3 above)
= (1 - mean)^2 + (1 - mean)^2 + (2 - mean)^2 + … + (4 - mean)^2 + (5 - mean)^2
= 172/11
Variance
= sum of squared differences / number of numbers
= (172/11) / (11)
= 1.42
Standard Deviation
Standard Deviation
= Square root of variance
= 1.19
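For readers who want to check the arithmetic, here is one possible way to reproduce the example in Python. It is a sketch rather than the only approach: it uses `fractions.Fraction` so that the intermediate values (31/11 and 172/11) stay exact, and the variable names are chosen just for this illustration.

```python
from fractions import Fraction
import math
import statistics

data = [1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5]

median = statistics.median(data)
mode = statistics.mode(data)

# Exact fractions avoid rounding in the intermediate steps
mean = Fraction(sum(data), len(data))
sum_sq_diff = sum((x - mean) ** 2 for x in data)
variance = sum_sq_diff / len(data)
std_dev = math.sqrt(variance)

print("median:", median)
print("mode:", mode)
print("mean:", mean, "=", round(float(mean), 2))
print("sum of squared differences:", sum_sq_diff)
print("variance:", variance, "=", round(float(variance), 2))
print("standard deviation:", round(std_dev, 2))
```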
Video
For a video tutorial, check out this excellent YouTube video. Note that the video uses a slightly different formula for variance. It uses the formula for a sample, which divides by one less than the number of numbers (n - 1), while in most cases, we use the formula for a population, which divides by the number of numbers (n), as in the steps above.
It does not make much of a difference when your sample size is large, so either one is fine.
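To see how small the difference becomes, the two formulas can be compared with Python's `statistics` module, where `pvariance` divides by n and `variance` divides by n - 1. The sketch below is only an illustration: `large_data` is an artificial dataset built by repeating the example values 100 times, and `variance_gap` is a helper name made up for this comparison.

```python
import statistics

data = [1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
large_data = data * 100  # artificial larger dataset: 1100 numbers

def variance_gap(values):
    """Sample variance (n - 1) minus population variance (n)."""
    return statistics.variance(values) - statistics.pvariance(values)

for name, values in [("small", data), ("large", large_data)]:
    population = statistics.pvariance(values)  # divides by n
    sample = statistics.variance(values)       # divides by n - 1
    print(name, len(values), round(population, 4), round(sample, 4))

print(variance_gap(data) > variance_gap(large_data))
```

With 11 numbers the two results differ noticeably (about 1.42 versus 1.56), while with 1100 numbers they agree to two decimal places.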