Introduction:
A rank correlation is a statistical measure of an ordinal relationship between variables. The relationship can be between the rankings of different ordinal variables or between different rankings of the same variable, where ranking means assigning the labels first, second, third, and so on to distinct observations of a variable.
A rank correlation coefficient measures the similarity between two rankings and can be used to assess the significance of the relationship between them. When the data are not available as numerical values, but the information is sufficient to rank the observations, a rank correlation method is useful. The best-known rank correlation statistic is Spearman's rank correlation.
Rank correlation statistics:
Some of the more well-known rank correlation statistics are as follows:
- Spearman's ρ: Spearman's rank-order correlation is the nonparametric version of the Pearson product-moment correlation. Spearman's correlation coefficient measures the strength and direction of the association between two ranked variables.
- Kendall's τ: Kendall's τ measures the ordinal association between two measured quantities. The τ test is a nonparametric hypothesis test for statistical dependence based on the τ coefficient.
- Goodman and Kruskal's γ: Goodman and Kruskal's gamma (γ) measures rank correlation, that is, the similarity of the orderings of the data when ranked by each of two quantities. When both variables are measured at the ordinal level, it measures the strength of association of the cross-tabulated data.
- Somers' D: Somers' D measures the ordinal association between two possibly dependent random variables, X and Y. It takes values from -1, when all pairs of the variables disagree, to +1, when all pairs of the variables agree.
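Kendall's τ, Goodman and Kruskal's γ, and Somers' D are all built from counts of concordant pairs (both variables order two observations the same way) and discordant pairs (they order them oppositely). The following Python sketch illustrates this under stated assumptions: it uses the simple tau-a variant of Kendall's coefficient (no tie correction), it computes Somers' D with Y treated as the dependent variable, and the function name `pair_counts` and the sample data are illustrative only.

```python
from itertools import combinations

def pair_counts(x, y):
    """Return (concordant, discordant, tied-on-y-only) pair counts."""
    concordant = discordant = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        elif x1 != x2:
            tied_y_only += 1  # product is 0 and x differs, so y is tied
    return concordant, discordant, tied_y_only

# Hypothetical data: one discordant pair and one tie in y.
x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 3, 5]

c, d, t_y = pair_counts(x, y)
n_pairs = len(x) * (len(x) - 1) // 2

tau_a = (c - d) / n_pairs            # Kendall's tau-a (ignores ties)
gamma = (c - d) / (c + d)            # Goodman and Kruskal's gamma
somers_d = (c - d) / (c + d + t_y)   # Somers' D, y as dependent variable

print(c, d, t_y)  # 8 1 1
print(tau_a, gamma, somers_d)
```

With ties present, γ is the largest of the three because it drops tied pairs from the denominator entirely.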
Spearman Rank Correlation:
To use Spearman's rank correlation, you need two variables that are ordinal, interval, or ratio. Although you would normally use the Pearson product-moment correlation on interval or ratio data, the Spearman correlation can be used when the assumptions of the Pearson correlation are markedly violated. Unlike Pearson's correlation, which determines the strength and direction of the linear relationship between two paired variables, Spearman's correlation determines the strength and direction of the monotonic relationship between them.
A monotonic relationship is one in which:
- As the value of one variable increases, so does the value of the other variable; or
- As the value of one variable increases, the value of the other variable decreases.
Importance of monotonic relationship to Spearman Rank correlation:
A monotonic relationship is not strictly an assumption of Spearman's correlation. That is, you can run Spearman's correlation on a non-monotonic relationship to see whether it has a monotonic component. In practice, though, one chooses the measure of association that fits the pattern of the observed data: when a scatterplot shows a monotonic relationship between two variables, Spearman's correlation is used to measure the strength and direction of that relationship.
On the other hand, if the scatterplot shows a linear relationship between two variables, Pearson's correlation is used to measure the strength and direction of that linear relationship. Because it is not always possible to check visually whether a relationship is monotonic, you may choose to run Spearman's correlation anyway.
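The difference between a linear and a monotonic relationship can be seen on a small example. The sketch below is illustrative only: the data are made up, the helper names `pearson` and `ascending_ranks` are hypothetical, and Spearman's coefficient is obtained by applying Pearson's correlation to the ranks, which assumes there are no tied values.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def ascending_ranks(values):
    """Rank 1 = smallest value; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Hypothetical data: y always increases with x, but not along a line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

r = pearson(x, y)                                      # about 0.94
rho = pearson(ascending_ranks(x), ascending_ranks(y))  # 1 up to rounding

print(round(r, 2), round(rho, 2))
```

Pearson's coefficient falls short of 1 because the points do not lie on a straight line, while the rank-based coefficient reaches 1 because the ordering of the two variables agrees perfectly.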
Rank Correlation Formula:
The formula for the Spearman rank correlation coefficient is:
$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
where:
ρ = Spearman's rank correlation coefficient
d_i = difference between the two ranks of each observation
n = number of observations
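As a minimal sketch, the formula can be written as a short Python function. The name `spearman_rho` and the four-item rankings are hypothetical, and the function assumes the inputs are already ranks with no ties, which is the case the formula covers.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rho from two lists of ranks (no tied ranks assumed)."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rankings must have the same length")
    n = len(rank_x)
    if n < 2:
        raise ValueError("at least two observations are needed")
    # Sum of squared rank differences, the sum of d_i^2 in the formula.
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Two hypothetical rankings of four items; every d_i is +1 or -1.
rho = spearman_rho([1, 2, 3, 4], [2, 1, 4, 3])
print(rho)  # 1 - 24/60 = 0.6
```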
Let’s understand the Rank Correlation Formula with an example.
Example: The marks for nine students in Biology and Statistics are as follows:
Biology: 36, 24, 48, 18, 11, 44, 10, 7, 29
Statistics: 31, 32, 46, 24, 9, 50, 13, 5, 33
Calculate Spearman's rank correlation coefficient between the students' marks in Biology and Statistics.
Solution:
Step 1: Determine the ranks for each subject. You can use the Excel RANK function, or rank by hand: order the scores from highest to lowest and assign rank 1 to the highest score, rank 2 to the next highest, and so on:
Biology | Biology rank | Statistics | Statistics rank |
36 | 3 | 31 | 5 |
24 | 5 | 32 | 4 |
48 | 1 | 46 | 2 |
18 | 6 | 24 | 6 |
11 | 7 | 9 | 8 |
44 | 2 | 50 | 1 |
10 | 8 | 13 | 7 |
7 | 9 | 5 | 9 |
29 | 4 | 33 | 3 |
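The ranking in Step 1 can be reproduced with a few lines of Python. This is a sketch, not part of the original worked solution: the marks are taken from the lists in the example statement, `rank_descending` is a hypothetical helper, and it assumes no two students share a score (tied scores would need averaged ranks).

```python
biology = [36, 24, 48, 18, 11, 44, 10, 7, 29]
statistics_marks = [31, 32, 46, 24, 9, 50, 13, 5, 33]

def rank_descending(scores):
    """Rank 1 = highest score; assumes all scores are distinct."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

biology_rank = rank_descending(biology)
statistics_rank = rank_descending(statistics_marks)

print(biology_rank)     # [3, 5, 1, 6, 7, 2, 8, 9, 4]
print(statistics_rank)  # [5, 4, 2, 6, 8, 1, 7, 9, 3]
```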
Step 2: Add a column d to your data, holding the difference between the two ranks of each student. For example, the first student has a Biology rank of 3 and a Statistics rank of 5, so the difference is 2 (the sign does not matter, because d is squared next). Then add another column containing the square of each d value.
Biology | Biology rank | Statistics | Statistics rank | d Value | d squared |
36 | 3 | 31 | 5 | 2 | 4 |
24 | 5 | 32 | 4 | 1 | 1 |
48 | 1 | 46 | 2 | 1 | 1 |
18 | 6 | 24 | 6 | 0 | 0 |
11 | 7 | 9 | 8 | 1 | 1 |
44 | 2 | 50 | 1 | 1 | 1 |
10 | 8 | 13 | 7 | 1 | 1 |
7 | 9 | 5 | 9 | 0 | 0 |
29 | 4 | 33 | 3 | 1 | 1 |
Step 3: Sum all the d squared values.
∑d² = 4 + 1 + 1 + 0 + 1 + 1 + 1 + 0 + 1 = 10.
This sum is the ∑d² term required by the formula.
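Steps 2 and 3 can be checked with a short Python sketch. The rank lists are copied from the table above; the variable names are illustrative.

```python
biology_rank = [3, 5, 1, 6, 7, 2, 8, 9, 4]
statistics_rank = [5, 4, 2, 6, 8, 1, 7, 9, 3]

# Absolute rank difference per student, as shown in the d column.
d = [abs(b - s) for b, s in zip(biology_rank, statistics_rank)]
d_squared = [value ** 2 for value in d]
sum_d_squared = sum(d_squared)

print(d)              # [2, 1, 1, 0, 1, 1, 1, 0, 1]
print(d_squared)      # [4, 1, 1, 0, 1, 1, 1, 0, 1]
print(sum_d_squared)  # 10
```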
Step 4: Insert the values into the Spearman rank correlation coefficient formula.
Spearman rank correlation coefficient formula:
$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
$$\rho = 1 - \frac{6 \times 10}{9(81 - 1)}$$
$$\rho = 1 - \frac{60}{720}$$
$$\rho \approx 0.92$$
Thus, Spearman's rank correlation coefficient for this data set is approximately 0.92 (0.9 to one decimal place).
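The arithmetic in Step 4 can be checked exactly with Python's standard `fractions` module. This sketch only plugs in the values n = 9 and ∑d² = 10 from the steps above; the variable names are illustrative.

```python
from fractions import Fraction

n = 9               # number of students
sum_d_squared = 10  # sum of squared rank differences from Step 3

# Exact value of 1 - 6 * sum(d^2) / (n * (n^2 - 1)) = 1 - 60/720.
rho_exact = 1 - Fraction(6 * sum_d_squared, n * (n ** 2 - 1))

print(rho_exact)                   # 11/12
print(round(float(rho_exact), 2))  # 0.92
```

The exact value 11/12 is about 0.92, or 0.9 to one decimal place.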
Conclusion:
The Spearman rank-order correlation coefficient is a nonparametric measure of the strength and direction of the association between two variables measured on at least an ordinal scale, so that their observations can be ranked. It is denoted by the Greek letter ρ, pronounced rho.
The Spearman correlation coefficient can range between +1 and -1.
- A coefficient of +1 means a perfect positive association of ranks,
- A coefficient of 0 means no association between ranks,
- A coefficient of -1 means a perfect negative association of ranks.
The stronger the association between the ranks, the closer the Spearman correlation coefficient is to +1 or -1; the weaker the association, the closer it is to zero.
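The three reference values can be illustrated with made-up rankings of five items. In this sketch, `rho_from_ranks` is a hypothetical helper that applies the no-ties formula ρ = 1 - 6∑d²/(n(n²-1)), and the rankings are chosen by hand to hit each case.

```python
def rho_from_ranks(rank_x, rank_y):
    """Spearman's rho from two tie-free rank lists of equal length."""
    n = len(rank_x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

base = [1, 2, 3, 4, 5]

same_order = rho_from_ranks(base, [1, 2, 3, 4, 5])      # identical ranks
reversed_order = rho_from_ranks(base, [5, 4, 3, 2, 1])  # opposite ranks
unrelated = rho_from_ranks(base, [2, 5, 3, 1, 4])       # sum of d^2 is 20

print(same_order, reversed_order, unrelated)  # 1.0 -1.0 0.0
```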