In the study of relationships, two variables are said to be correlated [3] if a change in one variable [4] is accompanied by a change in the other, either in the same or in the opposite direction.
The Pearson Product-Moment Correlation is one measure of correlation; it quantifies both the strength and the direction of such a relationship. The population coefficient is usually denoted by the Greek letter ρ (rho), and its sample estimate by r.
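This coefficient is appropriate only when two conditions are satisfied: the variables are measured on an interval or ratio scale [5], and a linear relationship [6] between them is suspected.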
The coefficient (ρ) is calculated as the ratio of covariance between the variables to the product of their standard deviations [7]. This formulation is very useful for two key reasons.
First, it tells us the direction of the relationship. Once the coefficient is computed, ρ > 0 indicates a positive relationship, ρ < 0 indicates a negative relationship, and ρ = 0 indicates the absence of a linear relationship.
Second, it ensures (mathematically) that the numerical value of ρ ranges from -1.0 to +1.0. This enables us to get an idea of the strength of relationship - or rather the strength of linear relationship [6] between the variables. The closer the coefficient is to +1.0 or -1.0, the greater the strength of the linear relationship.
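The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `pearson` and the temperature and rainfall figures are made up for the example, and the population form of the covariance and standard deviation (dividing by n) is an assumption. The bound on ρ follows from the Cauchy-Schwarz inequality, which guarantees that the absolute covariance never exceeds the product of the standard deviations.

```python
import math

def pearson(x, y):
    """Pearson coefficient: covariance divided by the product of standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations (divide by n).
    # Dividing by n - 1 instead gives the same ratio, because the factor cancels.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (sd_x * sd_y)

# Made-up data, for illustration only
temperature = [20, 22, 25, 27, 30]   # daily maximum, degrees Celsius
rainfall = [12, 10, 7, 6, 2]         # mm per day
rho = pearson(temperature, rainfall)
print(rho)  # negative, because rainfall falls as temperature rises
```

Because of floating-point rounding, a computed value may overshoot +1.0 or -1.0 by a tiny amount for perfectly linear data, so comparisons against the bounds are best made with a small tolerance.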
As a rule of thumb, the following guidelines are often useful (though many experts would disagree somewhat on the choice of boundaries).
Value of ρ | Strength of relationship |
---|---|
-1.0 to -0.5 or 0.5 to 1.0 | Strong |
-0.5 to -0.3 or 0.3 to 0.5 | Moderate |
-0.3 to -0.1 or 0.1 to 0.3 | Weak |
-0.1 to 0.1 | None or very weak |
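The rule of thumb in the table can be written as a small lookup. This is only a sketch: the function name `strength_label` is hypothetical, and because the table's ranges share their endpoints, the code assumes that a boundary value (for example 0.5) belongs to the stronger category.

```python
def strength_label(rho):
    """Map a correlation coefficient to the rule-of-thumb label from the table."""
    # Allow a tiny tolerance for floating-point overshoot beyond +/-1.0.
    if abs(rho) > 1.0 + 1e-9:
        raise ValueError("a correlation coefficient must lie between -1.0 and +1.0")
    magnitude = abs(rho)  # strength depends on magnitude only, not on sign
    # Assumption: a value exactly on a boundary is assigned to the stronger category.
    if magnitude >= 0.5:
        return "Strong"
    if magnitude >= 0.3:
        return "Moderate"
    if magnitude >= 0.1:
        return "Weak"
    return "None or very weak"

print(strength_label(-0.72))
print(strength_label(0.05))
```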
This measure of correlation has interesting properties:
It is independent of any units of measurement. For example, the ρ value between the daily maximum temperature (in Centigrade) and rainfall per day (in mm) is expressed neither in Centigrade nor in mm. This is because it does not express a quantity, but a relationship between quantities.
It is symmetric. This means that ρ between X and Y is exactly the same as ρ between Y and X.
Pearson's correlation coefficient is independent of change in origin and scale, provided the scale factors are positive (multiplying one variable by a negative constant flips the sign of ρ). This means that ρ between temperature (in Centigrade) and rainfall (in mm) would be numerically equal to ρ between temperature (in Fahrenheit) and rainfall (in cm).
If the variables are truly independent of each other, then ρ = 0. However, the converse is not true: ρ = 0 does not imply that the variables are independent. It only indicates the absence of a linear relationship; a non-linear relationship [8] may still exist. You may also arrive at this result in error if your variables are not measured on an interval or ratio scale.
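The properties listed above can be checked numerically. The sketch below uses made-up temperature and rainfall figures and a hypothetical helper named `pearson`. It compares ρ before and after swapping the variables and converting the units, and then computes ρ for y = x² on values of x symmetric about zero, a case where y is completely determined by x yet the coefficient is zero.

```python
import math

def pearson(x, y):
    """Pearson coefficient: covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Made-up data, for illustration only
celsius = [18.0, 21.5, 25.0, 28.5, 31.0]
rain_mm = [14.0, 9.0, 11.0, 4.0, 3.0]

# Change of origin and scale: Celsius to Fahrenheit, mm to cm
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
rain_cm = [r / 10 for r in rain_mm]

rho_metric = pearson(celsius, rain_mm)
rho_swapped = pearson(rain_mm, celsius)        # symmetry: same value
rho_converted = pearson(fahrenheit, rain_cm)   # unit change: same value

# Dependent but uncorrelated: y is a function of x, yet the coefficient is zero
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]
rho_parabola = pearson(x, y)

print(rho_metric, rho_swapped, rho_converted, rho_parabola)
```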
While ρ is a powerful tool, it is a much abused one and hence has to be handled carefully.
People often forget or gloss over the fact that ρ is a measure of linear relationship. Consequently, a small value of ρ is often interpreted to mean that no relationship exists, when it actually indicates only the absence of a linear relationship or, at best, a very weak linear relationship.
Under such circumstances it is possible that a (possibly strong!) non-linear relationship exists.
It's best to construct a scatter diagram to reveal any non-linear relationship before firmly concluding that no relationship exists. If the scatter diagram points to a non-linear relationship, an appropriate transformation can often attain linearity, in which case ρ can be recomputed.
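As a rough illustration of recomputing ρ after a transformation, the sketch below assumes made-up data in which y grows exponentially with x. The helper name `pearson` and the choice of a logarithmic transformation are assumptions for this example; the right transformation depends on the shape seen in the scatter diagram.

```python
import math

def pearson(x, y):
    """Pearson coefficient: covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Made-up data: y grows exponentially with x, so the relationship is curved
x = [1, 2, 3, 4, 5, 6]
y = [math.exp(v) for v in x]

rho_raw = pearson(x, y)                         # understates the exact dependence
rho_log = pearson(x, [math.log(v) for v in y])  # log(y) is linear in x

print(rho_raw, rho_log)
```

On the raw values the coefficient falls noticeably short of 1 even though y is an exact function of x; after taking logarithms the relationship is linear and the coefficient is 1 up to rounding error.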
One has to be careful in interpreting the value of ρ, specifically when it makes no obvious sense to connect the variables in the first place.
For example, one could compute ρ between shoe size and intelligence, or height and income. Irrespective of the value of ρ, such a correlation makes no sense and is hence termed a chance or nonsense correlation.
As with many related statistics, ρ should not be used to make claims about a cause and effect relationship [9]. Put differently, by examining the value of ρ, we can only conclude that variables X and Y are linearly related.
However, the value of ρ does not tell us whether X influences Y or Y influences X, a distinction that is of key importance in regression analysis [10].
Links
[1] https://staging.explorable.com/en/pearson-product-moment-correlation
[2] https://staging.explorable.com/en
[3] https://staging.explorable.com/statistical-correlation
[4] https://staging.explorable.com/research-variables
[5] https://staging.explorable.com/measurement-scales
[6] https://staging.explorable.com/linear-relationship
[7] https://staging.explorable.com/calculate-standard-deviation
[8] https://staging.explorable.com/non-linear-relationship
[9] https://staging.explorable.com/cause-and-effect
[10] https://staging.explorable.com/linear-regression-analysis