Statistical variance measures how widely the data are spread about the mean or expected value. Unlike the range, which looks only at the two extremes, the variance takes every data point into account when describing the distribution.
In many cases of statistics and experimentation, it is the variance that gives invaluable information about the data distribution.
The mathematical formula to calculate the variance is given by:
σ² = ∑ (X - µ)² / N
where:
σ² = variance
∑ (X - µ)² = the sum of (X - µ)² for all data points
X = individual data points
µ = mean of the population
N = number of data points
This means the variance is the average of the squared differences between the data points and the mean.
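The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `population_variance` and the small data set are assumptions made for the example, not part of the original article.

```python
def population_variance(data):
    """Return the mean of the squared deviations from the mean."""
    n = len(data)
    if n == 0:
        raise ValueError("variance requires at least one data point")
    mu = sum(data) / n  # population mean
    return sum((x - mu) ** 2 for x in data) / n


# Hypothetical data set, chosen so that the mean is exactly 5.
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(example))  # 4.0
```

Every data point contributes to the result, which is what distinguishes the variance from the range.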
For example, suppose you want to find the variance of scores on a test. Suppose the scores are 67, 72, 85, 93 and 98.
First, write down the formula for variance:
σ² = ∑ (x - µ)² / N
Next, there are five scores in total, so N = 5.
σ² = ∑ (x - µ)² / 5
Calculate the mean (µ) of the five scores: (67 + 72 + 85 + 93 + 98) / 5 = 415 / 5, so µ = 83.
σ² = ∑ (x - 83)² / 5
Now, compare each score (x = 67, 72, 85, 93, 98) to the mean (µ = 83):
σ² = [ (67 - 83)² + (72 - 83)² + (85 - 83)² + (93 - 83)² + (98 - 83)² ] / 5
Carry out the subtraction inside each pair of parentheses.
67 - 83 = -16
72 - 83 = -11
85 - 83 = 2
93 - 83 = 10
98 - 83 = 15
The formula will now look like this:
σ² = [ (-16)² + (-11)² + (2)² + (10)² + (15)² ] / 5
Then, square each term in parentheses. We get 256, 121, 4, 100 and 225.
This is how:
σ² = [ (-16) × (-16) + (-11) × (-11) + (2) × (2) + (10) × (10) + (15) × (15) ] / 5
σ² = [ 16 × 16 + 11 × 11 + 2 × 2 + 10 × 10 + 15 × 15 ] / 5
which equals:
σ² = [ 256 + 121 + 4 + 100 + 225 ] / 5
Then add up the numbers inside the brackets:
σ² = 706 / 5
To get the final answer, we divide the sum by 5 (because there were five scores in total). This is the final variance for the dataset:
σ² = 141.2
This is the variance of the population of scores.
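The worked example can be retraced step by step in Python. The variable names below are illustrative assumptions. The final check uses `statistics.pvariance` from the Python standard library, which computes the population variance.

```python
import statistics

scores = [67, 72, 85, 93, 98]
n = len(scores)

mean = sum(scores) / n                   # 415 / 5
deviations = [x - mean for x in scores]  # each score minus the mean
squares = [d ** 2 for d in deviations]   # squared deviations
total = sum(squares)                     # sum of the squared deviations
variance = total / n                     # divide by the number of scores

print("mean:", mean)
print("deviations:", deviations)
print("squares:", squares)
print("sum of squares:", total)
print("population variance:", variance)

# Cross-check against the standard library implementation.
print("statistics.pvariance:", statistics.pvariance(scores))
```

Each printed intermediate value corresponds to one step of the hand calculation, so a mismatch points directly at the step that went wrong.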
In many cases, instead of a population, we deal with samples.
In this case, we need to slightly change the formula for variance to:
S² = ∑ (x - x̄)² / (n - 1)
where:
S² = the variance of the sample
x̄ = mean of the sample
n = number of data points in the sample
Note that the denominator is one less than the sample size in this case.
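As a rough sketch of the difference, the function below divides by one less than the sample size. The name `sample_variance` is an assumption for illustration, and the test scores from the earlier example are treated here as a sample rather than a full population. `statistics.variance` is the standard-library function for the sample variance.

```python
import statistics


def sample_variance(data):
    """Return the sample variance, dividing by (n - 1) instead of n."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance requires at least two data points")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


scores = [67, 72, 85, 93, 98]
print(sample_variance(scores))      # 706 / 4 = 176.5
print(statistics.variance(scores))  # standard-library sample variance
```

For the same five scores, the sample variance (176.5) is larger than the population variance (141.2) because the same sum of squares is divided by 4 rather than 5.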
The concept of variance can be extended to continuous data too. In that case, instead of summing the squared differences from the mean, we integrate them, weighted by the probability density. This approach is also useful when the number of data points is very large, for example the population of a country.
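For a continuous variable with probability density f(x), the sum becomes an integral of (x - µ)² f(x) over the range of x. The sketch below approximates that integral numerically for a uniform density on the interval from 0 to 1, whose variance is known to be 1/12. The helper `integrate`, the midpoint rule, and the choice of distribution are all assumptions made for this illustration.

```python
def integrate(func, lower, upper, steps=10_000):
    """Approximate the integral of func over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


def uniform_density(x):
    """Probability density of a uniform distribution on [0, 1]."""
    return 1.0


# Mean: integral of x * f(x) over the interval.
mean = integrate(lambda x: x * uniform_density(x), 0.0, 1.0)

# Variance: integral of (x - mean)^2 * f(x) over the interval.
variance = integrate(lambda x: (x - mean) ** 2 * uniform_density(x), 0.0, 1.0)

print(mean)      # close to 0.5
print(variance)  # close to 1/12
```

The structure mirrors the discrete formula: the density f(x) plays the role of the 1/N weight given to each data point.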
Variance is used extensively in probability theory and statistics, where general conclusions need to be drawn from a smaller sample. Because the variance describes how the data are distributed around the mean, it gives us an idea of where an unknown data point can be expected to fall.
Oskar Blakstad (Mar 15, 2009). Statistical Variance. Retrieved Oct 04, 2023 from Assisted Self-Help: https://staging.explorable.com/en/statistical-variance