Correlation Rank

The rank correlation method is used to determine the degree of relationship between two variables. The rank correlation coefficient is calculated in the same way as the Pearson correlation coefficient, except that it uses the ranks of the observations rather than their actual values.

Introduction:

A rank correlation is a statistic that measures an ordinal association between variables. The association can be between the rankings of different ordinal variables or between different rankings of the same variable, where a ranking is the assignment of the labels first, second, third, etc., to the distinct observations of a selected variable.

A rank correlation coefficient measures the similarity between two rankings and can be used to assess the significance of the relationship between them. When the data are not available as numerical values, but the information is sufficient to rank and order the observations, the rank correlation method is useful. The best-known rank correlation statistic is the Spearman rank correlation.

Rank correlation statistics:

Some of the more well-known rank correlation statistics are as follows:

  • Spearman’s ρ: Spearman’s rank-order correlation is the nonparametric version of the Pearson product-moment correlation. Spearman’s correlation coefficient measures the strength and direction of the association between two ranked variables.
  • Kendall’s τ: Kendall’s tau measures the ordinal association between two measured quantities. The associated tau test is a nonparametric hypothesis test for statistical dependence based on the tau coefficient.
  • Goodman and Kruskal’s γ: Goodman and Kruskal’s gamma (γ) measures rank correlation, that is, how similar the orderings of the data are when ranked by each of the quantities. When both variables are measured at the ordinal level, it measures the strength of association of the cross-tabulated data.
  • Somers’ D: Somers’ D measures the ordinal association between two possibly dependent random variables, X and Y. Somers’ D takes values from -1, when all pairs of the variables disagree, to +1, when all pairs of the variables agree.
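
Kendall’s τ, Goodman and Kruskal’s γ, and Somers’ D can all be built from the same counts of concordant and discordant pairs. The sketch below is only an illustration under stated assumptions: the function names and the small data set are made up for this example, the Kendall variant shown is tau-a (no tie correction), and Somers’ D is computed with X treated as the independent variable.

```python
from itertools import combinations

def pair_counts(x, y):
    """Count concordant pairs, discordant pairs, and pairs tied only on y."""
    concordant = discordant = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx * dy > 0:
            concordant += 1    # both variables order the pair the same way
        elif dx * dy < 0:
            discordant += 1    # the variables order the pair in opposite ways
        elif dx != 0:
            tied_y_only += 1   # dy == 0 here: tied on y but not on x
    return concordant, discordant, tied_y_only

def kendall_tau_a(x, y):
    """(C - D) divided by the total number of pairs (tau-a, no tie correction)."""
    c, d, _ = pair_counts(x, y)
    n = len(x)
    return (c - d) / (n * (n - 1) / 2)

def goodman_kruskal_gamma(x, y):
    """(C - D) divided by the number of untied pairs."""
    c, d, _ = pair_counts(x, y)
    return (c - d) / (c + d)

def somers_d_y_given_x(x, y):
    """(C - D) divided by the number of pairs that are not tied on x."""
    c, d, tied_y_only = pair_counts(x, y)
    return (c - d) / (c + d + tied_y_only)

# Hypothetical illustrative data with one tie in y.
x_demo = [1, 2, 3, 4, 5]
y_demo = [2, 1, 3, 3, 5]
print(kendall_tau_a(x_demo, y_demo))
print(goodman_kruskal_gamma(x_demo, y_demo))
print(somers_d_y_given_x(x_demo, y_demo))
```

Because γ ignores tied pairs entirely, it is never smaller in magnitude than tau-a or Somers’ D computed on the same data.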

Spearman Rank Correlation:

To use the Spearman rank correlation, you need two ordinal, interval, or ratio variables. Although you would normally use the Pearson product-moment correlation on interval or ratio data, the Spearman correlation can be used when the assumptions of the Pearson correlation are markedly violated. Unlike Pearson’s correlation, which measures the strength and direction of the linear relationship between two variables, Spearman’s correlation measures the strength and direction of the monotonic relationship between them.

A monotonic relationship is one in which:

  • As the value of one variable increases, so does the value of the other variable; or
  • As the value of one variable increases, the value of the other variable decreases.
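
The two cases above can be checked mechanically. The following is a minimal sketch, assuming paired numeric observations; the helper name `monotonic_direction` is hypothetical, and the check is non-strict, so flat stretches are allowed.

```python
def monotonic_direction(x, y):
    """Return 'increasing', 'decreasing', or None for paired data.

    The pairs are sorted by x, then successive y values are compared.
    The check is non-strict: equal neighbouring y values are allowed.
    """
    ys = [y_value for _, y_value in sorted(zip(x, y))]
    steps = [later - earlier for earlier, later in zip(ys, ys[1:])]
    if all(step >= 0 for step in steps):
        return "increasing"
    if all(step <= 0 for step in steps):
        return "decreasing"
    return None

print(monotonic_direction([1, 2, 3, 4], [1, 4, 9, 16]))
print(monotonic_direction([1, 2, 3, 4], [10, 7, 3, 1]))
print(monotonic_direction([1, 2, 3, 4], [1, 3, 2, 4]))
```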

Importance of monotonic relationship to Spearman Rank correlation:

A monotonic relationship is not strictly an assumption of Spearman’s correlation. That is, you can run Spearman’s correlation on a non-monotonic relationship to see whether it has a monotonic component. Normally, however, you would choose the measure of association that fits the pattern of the observed data. When a scatterplot shows a monotonic relationship between two variables, Spearman’s correlation is used to measure the strength and direction of that monotonic relationship.

On the other hand, if the scatterplot shows a linear relationship between the two variables, Pearson’s correlation is used to measure the strength and direction of that linear relationship. Because it is not always possible to check visually whether a relationship is monotonic, it is reasonable in such cases to run Spearman’s correlation anyway.
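
The difference between the two coefficients shows up on data that are monotonic but not linear. The sketch below is an illustration only: the helper names are made up, the data (y = x³) are hypothetical, and the ranking helper assumes there are no tied values. Spearman’s coefficient is computed here as Pearson’s coefficient applied to the ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank from smallest (1) to largest; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman(x, y):
    """Spearman's coefficient as Pearson's coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: perfectly monotonic, clearly non-linear.
x_cubic = [1, 2, 3, 4, 5, 6]
y_cubic = [v ** 3 for v in x_cubic]

print(pearson(x_cubic, y_cubic))   # below 1: the relationship is not linear
print(spearman(x_cubic, y_cubic))  # 1: the relationship is perfectly monotonic
```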

Rank Correlation Formula:

The formula for the Spearman rank correlation coefficient is:

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where:

ρ = Spearman’s rank correlation coefficient

d_i = difference between the two ranks of each observation

n = number of observations
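
As a minimal sketch, the formula translates directly into a small function. The name `rho_from_rank_differences` is made up for this illustration, and the formula is assumed to be applied to rankings without ties.

```python
def rho_from_rank_differences(d_values):
    """Spearman's rho from the per-observation rank differences d_i.

    Implements rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)),
    which assumes that neither ranking contains ties.
    """
    n = len(d_values)
    if n < 2:
        raise ValueError("at least two observations are required")
    sum_d_squared = sum(d * d for d in d_values)
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Identical rankings: every difference is zero.
print(rho_from_rank_differences([0, 0, 0, 0]))    # 1.0
# Ranks 1, 2, 3, 4 against 4, 3, 2, 1: completely reversed.
print(rho_from_rank_differences([-3, -1, 1, 3]))  # -1.0
```

The sign of each d_i does not matter, because only the squared differences enter the sum.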

Let’s understand the Rank Correlation Formula with an example.

Example: The marks for nine students in Biology and Statistics are as follows:

Biology: 36, 24, 48, 18, 11, 44, 10, 7, 29

Statistics: 31, 32, 46, 24, 9, 50, 13, 5, 33

Calculate the Spearman rank correlation between the students’ marks in Biology and Statistics.

Solution:

Step 1: Determine the rankings for each subject. You can use the Excel RANK function, or rank by hand: arrange the scores from highest to lowest, then assign rank 1 to the highest score, rank 2 to the next highest, and so on:

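| Biology | Rank (Biology) | Statistics | Rank (Statistics) |
| --- | --- | --- | --- |
| 36 | 3 | 31 | 5 |
| 24 | 5 | 32 | 4 |
| 48 | 1 | 46 | 2 |
| 18 | 6 | 24 | 6 |
| 11 | 7 | 9 | 8 |
| 44 | 2 | 50 | 1 |
| 10 | 8 | 13 | 7 |
| 7 | 9 | 5 | 9 |
| 29 | 4 | 33 | 3 |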
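
The ranking in Step 1 can be reproduced with a few lines of code. This is a sketch only: the helper name `rank_descending` is made up, and it assumes that all marks within a subject are distinct, which holds for the marks listed in this example.

```python
def rank_descending(scores):
    """Rank 1 goes to the highest score; assumes all scores are distinct."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(score) + 1 for score in scores]

biology = [36, 24, 48, 18, 11, 44, 10, 7, 29]
statistics_marks = [31, 32, 46, 24, 9, 50, 13, 5, 33]

biology_rank = rank_descending(biology)
statistics_rank = rank_descending(statistics_marks)

print(biology_rank)     # [3, 5, 1, 6, 7, 2, 8, 9, 4]
print(statistics_rank)  # [5, 4, 2, 6, 8, 1, 7, 9, 3]
```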

Step 2: Add a column d to your data, where d is the difference between the two ranks of each student. For example, the first student has a Biology rank of 3 and a Statistics rank of 5, so the difference is 2 (the sign can be ignored, because d is squared next). Then add another column containing the squared d values.

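| Biology | Rank (Biology) | Statistics | Rank (Statistics) | d | d² |
| --- | --- | --- | --- | --- | --- |
| 36 | 3 | 31 | 5 | 2 | 4 |
| 24 | 5 | 32 | 4 | 1 | 1 |
| 48 | 1 | 46 | 2 | 1 | 1 |
| 18 | 6 | 24 | 6 | 0 | 0 |
| 11 | 7 | 9 | 8 | 1 | 1 |
| 44 | 2 | 50 | 1 | 1 | 1 |
| 10 | 8 | 13 | 7 | 1 | 1 |
| 7 | 9 | 5 | 9 | 0 | 0 |
| 29 | 4 | 33 | 3 | 1 | 1 |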
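
The d and d² columns follow directly from the two rank columns. In this short sketch the ranks are copied from Step 1, the variable names are made up for illustration, and the absolute difference is used so that the values match the table.

```python
# Ranks from Step 1, in the same student order as the table.
biology_rank = [3, 5, 1, 6, 7, 2, 8, 9, 4]
statistics_rank = [5, 4, 2, 6, 8, 1, 7, 9, 3]

# d: absolute rank difference per student; d_squared: its square.
d = [abs(b - s) for b, s in zip(biology_rank, statistics_rank)]
d_squared = [value ** 2 for value in d]

print(d)          # [2, 1, 1, 0, 1, 1, 1, 0, 1]
print(d_squared)  # [4, 1, 1, 0, 1, 1, 1, 0, 1]
```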

Step 3: Sum all the d² values.

∑d² = 4 + 1 + 1 + 0 + 1 + 1 + 1 + 0 + 1 = 10

This sum, ∑d², is the quantity required by the formula.

Step 4: Insert the values into the Spearman rank correlation coefficient formula. 

Spearman rank correlation coefficient formula:

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

$$\rho = 1 - \frac{6 \times 10}{9(81 - 1)}$$

$$\rho = 1 - \frac{60}{720}$$

$$\rho \approx 0.92$$

Thus, the Spearman rank correlation coefficient for this data set is approximately 0.92.
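
The whole worked example can be checked end to end from the raw marks. This is a sketch rather than a reference implementation: the function names are made up, and the ranking helper assumes that no two marks in a subject are equal.

```python
def rank_descending(scores):
    """Rank 1 goes to the highest score; assumes all scores are distinct."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(score) + 1 for score in scores]

def spearman_rho(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties assumed."""
    if len(x) != len(y):
        raise ValueError("both variables need the same number of observations")
    n = len(x)
    rank_x, rank_y = rank_descending(x), rank_descending(y)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

biology = [36, 24, 48, 18, 11, 44, 10, 7, 29]
statistics_marks = [31, 32, 46, 24, 9, 50, 13, 5, 33]

rho = spearman_rho(biology, statistics_marks)
print(round(rho, 4))  # 0.9167
```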

Conclusion:

The Spearman rank-order correlation coefficient is a nonparametric measure of the strength and direction of the association between two variables measured on at least an ordinal scale, that is, variables whose values can be ranked. It is denoted by the Greek letter ρ, pronounced rho.

The Spearman correlation coefficient can range between +1 and -1.

  • If the Spearman correlation coefficient is +1 – It means perfect positive rank association,
  • If the Spearman correlation coefficient is 0 – It means no rank association,
  • If the Spearman correlation coefficient is -1 – It means perfect negative rank association.

The stronger the association between the ranks, the closer the Spearman correlation coefficient is to +1 or -1; the weaker the association, the closer it is to zero.
