We use a scatter diagram to examine the relationship between two variables, with one variable plotted on the x-axis and the other on the y-axis. If the two variables are correlated, the plotted points will fall along a line or curve. Thus, using a scatter diagram, we can get an idea about the nature of the relationship in the paired data given to us. When the points all fall exactly on one straight line, the correlation is perfect, with a coefficient of +1 or -1. There is a low correlation if the points are scattered widely around the line, and a strong linear correlation when the points lie on the line or close to it. Read with us to know more about measuring correlation using scatter diagrams.
What Is Correlation?
Correlation is a statistical measure of the linear relationship between two variables, that is, of how closely they change together at a constant rate. The correlation coefficient, r, measures the strength of that relationship. The coefficient tells us nothing about cause and effect, only the strength and direction of the relation between the two variables.
Let us take an example to understand correlation better. Suppose we consider two variables, altitude and temperature. By taking readings at different elevations, we can examine the correlation between the two. We will discover that the relationship between these variables is roughly linear: as the elevation increases, the temperature drops. Thus, the two variables are negatively correlated.
The correlation coefficient ranges from -1 to +1 and is denoted by r. Further, the p-value indicates whether the correlation is statistically significant. Thus, a correlation is reported with two values, r and p.
Some correlation pointers to be kept in mind are:
- The closer r is to 0, the weaker the linear correlation.
- All positive values of r represent a positive correlation, meaning that as the value of one variable increases, the value of the other tends to increase as well.
- All negative values of r represent a negative correlation, meaning that as the value of one variable increases, the value of the other tends to decrease.
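To make the coefficient concrete, here is a minimal sketch of how r could be computed for paired readings using the standard Pearson formula. The function name `pearson_r` and the altitude and temperature readings are assumptions invented for illustration; they are not measured data.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical readings, invented for illustration only
altitude_m = [0, 500, 1000, 1500, 2000]
temperature_c = [15.0, 11.9, 8.4, 5.3, 2.0]

r = pearson_r(altitude_m, temperature_c)
print(round(r, 3))  # a value close to -1 indicates a strong negative correlation
```

This sketch returns only r. The p-value would normally come from a statistics library rather than from hand-written code.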
Types Of Correlation
Every correlation has two factors, strength and direction. To determine the strength of a relationship, look at the absolute value of the coefficient. To determine the direction of the correlation, check whether the coefficient is positive or negative.
- Positive correlation: Both the variables move in a similar direction. If one variable increases, the other increases alongside. Similarly, if one variable decreases, the other decreases alongside. For example, there exists a positive correlation between the number of years of experience and the salary.
- Negative correlation: Both the variables move in opposite directions to each other. If one variable increases, the other decreases. Similarly, if one variable decreases, the other increases. For example, the altitude and the temperature are negatively correlated. When the altitude increases, the temperature falls, and vice versa.
- No correlation: When there is no apparent relationship between two variables, the variables are not correlated. An increase or decrease in the value of one variable is not accompanied by any consistent change in the other. For example, there is zero correlation between the salary of an employee and their shoe size.
Further, the strength of the correlation tells us how strong the relationship between the two variables is. There is a perfect correlation if the value of the coefficient is +1 or -1. There is no linear correlation between the two variables if the value of the coefficient is 0.
Having learned what a correlation is, we must now understand what a scatter diagram is and how to use it.
How To Use a Scatter Diagram?
A scatter diagram, also known as a scatter plot or X-Y graph, plots pairs of numerical data with one variable on each axis to reveal the relationship between them. Once the points are marked, the more closely they cluster around a line, the stronger the correlation.
We use a scatter diagram when we have paired numerical data. The procedure to build a scatter diagram is:
- Collect the paired data to identify the relationship between the two variables.
- Construct a graph with the independent variable on the horizontal axis and the dependent variable on the vertical axis. For every paired observation, draw a dot at the point where its x-value and its y-value intersect.
- In some cases, the relationship becomes obvious once the paired data is plotted. If the points clearly form a line or curve, you can stop here and, if you wish, move on to correlation analysis. However, if the relationship is not yet evident, follow the next steps.
- Divide all the points plotted on the graph into four quadrants. Suppose there are X points on the graph in total, then:
  - Count X/2 points from the top to the bottom and draw a horizontal line.
  - Count X/2 points from right to left and draw a vertical line.
  - If the number of points is odd, draw each line through the middle point.
- Count the points that fall in each quadrant, leaving out the ones that lie on a line.
- Add up the quadrants that lie diagonally opposite each other, then find the smaller of the two sums and the total of the counted points:
  - Let A be the total of the points in the upper left and lower right quadrants.
  - Let B be the total of the points in the upper right and lower left quadrants.
  - Let Q be the smaller of A and B.
  - N = A + B
- Now, look up the limit for N in the trend test table to examine the correlation.
  - The variables are related when Q is less than the limit.
  - The variables are not shown to be related when Q is greater than or equal to the limit.
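The quadrant-counting steps can be sketched in code as follows. This is a rough illustration, not a definitive implementation: it assumes the dividing lines can be placed at the median x and median y values, which matches counting X/2 points when there are no tied values. The function names and sample data are hypothetical, and the limit must still be looked up in a trend test table, which is not reproduced here.

```python
from statistics import median

def quadrant_counts(xs, ys):
    """Return (A, B, Q, N) for the quadrant count on paired data."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    # Assumption: the dividing lines sit at the medians of x and y
    mid_x = median(xs)
    mid_y = median(ys)
    a = b = 0
    for x, y in zip(xs, ys):
        if x == mid_x or y == mid_y:
            continue  # points lying on a dividing line are not counted
        upper = y > mid_y
        right = x > mid_x
        if upper != right:
            a += 1  # upper left or lower right quadrant
        else:
            b += 1  # upper right or lower left quadrant
    return a, b, min(a, b), a + b

def is_related(q, limit):
    """Apply the rule: related when Q is less than the table limit for N."""
    return q < limit

# Hypothetical paired data, invented for illustration only
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 1, 4, 3, 6, 5, 8, 7]
A, B, Q, N = quadrant_counts(xs, ys)
print(A, B, Q, N)
```

Here every counted point lands in the upper right or lower left quadrant, so A is 0 and Q is 0. The caller passes the limit for the computed N to `is_related` after reading it from the table.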
Points to Remember
- Even when the two variables appear related, you should not assume that one causes the other. A third variable may be influencing both of them.
- After plotting the data, the more closely the points cluster around a line, the stronger the relationship between the two variables.
- If the plotted graph does not lead to a clear conclusion, the values of N and Q can help determine whether a relationship exists between the two variables. If they indicate no relationship, the pattern might have occurred by chance.
- When the diagram shows no correlation between the two variables, consider whether the data might be stratified.
- When no correlation appears between the two variables, check whether the observations of the independent variable vary widely enough. Sometimes a relationship is not discovered because the data does not cover a wide enough range.