Correlation is a statistical tool for describing the relationship between two variables: whether one tends to change when the other changes, and in which direction. In this article, we will discuss how to measure the Correlation between two variables. It is measured using a quantity called the Correlation Coefficient.
Correlation Coefficient
As discussed above, the Correlation Coefficient measures the Correlation between two variables, which in turn tells us what type of relationship exists between them. One of the most popular Correlation Coefficients is Pearson’s Correlation, so that is the one we will learn to calculate here.
Pearson’s Correlation is used mostly in the case of Linear Regression problems. It is also known as the Pearson Product Moment Correlation (PPMC). Some disadvantages of this Correlation Coefficient are:
- Pearson Product Moment Correlation cannot differentiate between a Dependent Variable and an Independent Variable.
- It is inconsistent with different sets of data. So we have to be very careful when dealing with different sets of data with Pearson’s Correlation.
- In Pearson’s Correlation, we fit a line if there is a relation between the two sets of data. Pearson’s Correlation will not give you the slope of that line; it only tells you how closely the data follow a line.
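The first and third points can be checked numerically. The sketch below is illustrative only: `pearson_r` is a hypothetical helper written for this example from the standard mean-centred definition of Pearson’s r, and the data values are made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the mean-centred definition (hypothetical helper)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sqrt(sum((x - x_bar) ** 2 for x in xs) *
               sum((y - y_bar) ** 2 for y in ys))
    return num / den

xs = [1, 2, 3, 4, 5]            # made-up values
gentle = [2 * x for x in xs]    # points on a line with slope 2
steep = [5 * x for x in xs]     # points on a line with slope 5

# Swapping the roles of the two variables does not change r,
# so r cannot say which variable is the dependent one.
r_xy = pearson_r(xs, gentle)
r_yx = pearson_r(gentle, xs)

# Both data sets lie exactly on a line, so both give r close to 1
# even though the slopes differ.
r_gentle = pearson_r(xs, gentle)
r_steep = pearson_r(xs, steep)

print(r_xy, r_yx, r_gentle, r_steep)
```

To recover the slope itself, a regression fit is needed; the coefficient alone does not carry that information.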
Now we will see how to calculate Pearson’s Correlation Coefficient. The formula is:
r = Σ (xi − x̅)(yi − y̅) / √( Σ (xi − x̅)² · Σ (yi − y̅)² )
(Note: This formula is just another variation of the Pearson’s Correlation Coefficient.)
In this formula,
r: Pearson’s Correlation Coefficient.
xi: Value of the x-variable.
yi: Value of the y-variable.
x̅: Mean of the values of the x-variable.
y̅: Mean of the values of the y-variable.
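As a rough illustration of how these quantities combine, here is a minimal Python sketch of the mean-centred form of Pearson’s r. The function name `pearson_r` and the sample values are assumptions made for this example, not part of any library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Compute Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    x_bar = sum(xs) / n                      # mean of the x-values
    y_bar = sum(ys) / n                      # mean of the y-values
    # Numerator: sum of products of the deviations from the means.
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return num / sqrt(sxx * syy)

# Made-up sample values, for illustration only.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
r = pearson_r(x_values, y_values)
print(round(r, 4))  # 0.7746
```

For these values the numerator is 6 and the two sums of squared deviations are 10 and 6, so r = 6 / √60, which is about 0.77.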
Now we will look at the Scatter Diagram, which will help us to better understand this topic.
Scatter Diagram
To get the Scatter Diagram, we plot the values of the first variable along one axis and the values of the second variable along the other axis. For example, if we have an x-variable and a y-variable, we could plot the values of the x-variable along the X-axis and the values of the y-variable along the Y-axis. If we then apply a Regression Model, one variable is treated as the Independent Variable and the other as the Dependent Variable, with the Independent Variable assumed to have some kind of effect on the Dependent Variable. The Scatter Diagram helps us determine the type of Correlation between the two variables: the type is indicated by the direction in which the plotted points trend.
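To make the plotting step concrete, here is a minimal text-only sketch that avoids any plotting library. The helper `text_scatter`, its grid size, and the sample values are all assumptions made for this example; it also assumes that neither variable is constant. In practice a plotting library would normally be used instead.

```python
def text_scatter(xs, ys, width=40, height=12):
    """Return a crude text Scatter Diagram.

    x-values run left to right along the X-axis and
    y-values run bottom to top along the Y-axis.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column / row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"   # row 0 of the grid is the top
    return "\n".join("".join(line).rstrip() for line in grid)

# Made-up values: y rises as x rises, so the points trend upwards.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 6, 8, 10]
diagram = text_scatter(x_values, y_values)
print(diagram)
```

Points that climb from the bottom-left to the top-right, as here, suggest that the two variables move in the same direction.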
Types of Correlation
When we draw a Scatter Diagram for the two variables whose Correlation we want to find, there are 3 possible types of Correlation, based on the pattern of the plotted points.
- Positive Correlation: – When we draw the Scatter Diagram for the two variables and find that the values of both variables change in the same direction, there exists a Positive Correlation between the two variables. This means that if the value of one variable increases, the value of the other variable also increases, and if the value of one variable decreases, the value of the other variable also decreases. An example of Positive Correlation is the relationship between time and distance. If two places are farther apart (increase), the time taken to go from one place to the other is also greater (increase), and vice-versa.
- Negative Correlation: – When we draw the Scatter Diagram for the two variables and find that the values of the two variables change in opposite directions, the two variables have a Negative Correlation between them. This means that if the value of one variable increases, the value of the other variable decreases. An example of a Negative Correlation is the relationship between the Speed of a vehicle and the time taken by the vehicle to go from one place to another. If the Speed of the vehicle is higher (increase), the time taken to go from one place to the other is less (decrease), and vice-versa.
- Zero Correlation: – Zero Correlation means that there is no linear relationship between the aforementioned two variables. An increase or decrease in the value of one variable tells us nothing about whether the value of the other variable tends to increase or decrease. Note that Pearson’s Correlation only captures linear relationships, so a coefficient of zero does not by itself prove that the two variables are completely independent of each other. An example of Zero Correlation is the relationship between the height of a student and their average marks in an exam.
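The three types can be told apart by the sign of Pearson’s r. The sketch below is only an illustration: `pearson_r` and `correlation_type` are hypothetical helpers, the tolerance is an arbitrary choice, and all values are made up (a fixed speed of 40 km/h for the first case and a fixed distance of 120 km for the second). The last case is included as a caution: r measures linear association only, so it can be zero even when one variable is computed directly from the other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the mean-centred definition (hypothetical helper)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sqrt(sum((x - x_bar) ** 2 for x in xs) *
               sum((y - y_bar) ** 2 for y in ys))
    return num / den

def correlation_type(r, tol=1e-9):
    """Label r by its sign; tol is an arbitrary cut-off for 'zero'."""
    if r > tol:
        return "positive"
    if r < -tol:
        return "negative"
    return "zero"

# Distance vs. time at a fixed speed: both increase together.
distance_km = [10, 20, 30, 40]
time_for_distance = [d / 40 for d in distance_km]

# Speed vs. time over a fixed distance: one rises as the other falls.
speed_kmh = [20, 40, 60, 80]
time_for_speed = [120 / s for s in speed_kmh]

# y = x * x on symmetric x-values: r is zero although y depends on x.
xs = [-2, -1, 0, 1, 2]
squares = [x * x for x in xs]

results = {
    "distance vs time": correlation_type(pearson_r(distance_km, time_for_distance)),
    "speed vs time": correlation_type(pearson_r(speed_kmh, time_for_speed)),
    "x vs x squared": correlation_type(pearson_r(xs, squares)),
}
print(results)
```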
Conclusion
In this article, we described the concept of Correlation. Then we learned how to calculate the Correlation Coefficient. After that, we learned what a Scatter Diagram is and how to draw one. Finally, we discussed the three types of Correlation.