The correlation coefficient is a numerical measure of association that indicates whether, and how strongly, two variables are related. The variables can be two columns of a data set of observations, called a sample, or two components of a multivariate random variable with a known distribution. Several types of correlation coefficient exist; each has its own definition, characteristics, and range of usability. They all take values in the range -1 to +1, where ±1 indicates the strongest possible relationship and 0 indicates no relationship.
Karl Pearson Coefficient of Correlation Direct Method Formula
Karl Pearson’s correlation coefficient is also known as r, R, or Pearson’s r. It measures the strength and direction of the linear relationship between two variables, and is defined as the covariance of the two variables divided by the product of their standard deviations. The Karl Pearson coefficient of correlation is sometimes also called cross-correlation, as it describes the relationship between two quantities.
Karl Pearson Coefficient of Correlation Solved Examples
Here is a worked example of the Karl Pearson coefficient of correlation. Given two variables X and Y, calculate the Karl Pearson coefficient of correlation from the following data.
X | 1 | 2 | 3 | 4 | 5 |
Y | 10 | 20 | 30 | 40 | 50 |
To apply the Karl Pearson coefficient of correlation direct method formula to the data above, first tabulate the squares and products of the values.
X | Y | X² | Y² | XY |
1 | 10 | 1 | 100 | 10 |
2 | 20 | 4 | 400 | 40 |
3 | 30 | 9 | 900 | 90 |
4 | 40 | 16 | 1600 | 160 |
5 | 50 | 25 | 2500 | 250 |
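So, with n = 5 pairs of observations:
∑X = 15
∑Y = 150
∑X² = 55
∑Y² = 5500
∑XY = 550
Substituting these sums into the direct method formula gives r = (5 × 550 − 15 × 150) / √[(5 × 55 − 15²)(5 × 5500 − 150²)] = 500 / √(50 × 5000) = 500 / 500 = 1. The two variables are perfectly positively correlated, which is expected because every Y value is exactly 10 times the corresponding X value.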
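The calculation from these sums can be sketched in Python. This is a minimal illustration, assuming the usual direct (raw-sums) form r = (n∑XY − ∑X∑Y) / √[(n∑X² − (∑X)²)(n∑Y² − (∑Y)²)]; the function name `pearson_direct` is hypothetical and chosen only for this example.

```python
import math

# Data from the worked example in the text.
xs = [1, 2, 3, 4, 5]
ys = [10, 20, 30, 40, 50]

def pearson_direct(xs, ys):
    """Pearson's r computed from raw sums (direct method)."""
    n = len(xs)
    sum_x = sum(xs)                               # 15 for the example data
    sum_y = sum(ys)                               # 150
    sum_x2 = sum(x * x for x in xs)               # 55
    sum_y2 = sum(y * y for y in ys)               # 5500
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # 550
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

r = pearson_direct(xs, ys)
print(r)  # 1.0
```

A result of 1.0 reflects that Y is an exact positive multiple of X in this data set.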
Types of the Correlation Coefficient
There are primarily two types of correlation: positive correlation and negative correlation. A positive correlation occurs when the value of one variable increases as the value of the other variable increases; the two variables move in the same direction. In this case, the correlation coefficient is positive, and it equals +1 when the relationship is perfectly linear. A negative correlation occurs when the value of one variable decreases as the value of the other variable increases; the two variables move in opposite directions. In this case, the correlation coefficient is negative, and it equals -1 when the relationship is perfectly linear. When there is no linear relationship between the two variables, there is said to be no correlation, or zero correlation.
When applied to a population, Karl Pearson’s coefficient of correlation is commonly represented by the Greek letter ρ (rho) and referred to as the population correlation coefficient or the population Pearson correlation coefficient. Given a pair of random variables (X, Y), the formula for ρ is:
ρ(X, Y) = Cov(X, Y) / (σX · σY)
where:
Cov(X, Y) represents covariance
σX represents the standard deviation of X
σY represents the standard deviation of Y
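As a rough sketch of this definition, the following Python code treats a finite data set as the whole population, so the covariance and both standard deviations divide by n. The function name `population_correlation` is hypothetical, and the data is reused from the worked example.

```python
import statistics

def population_correlation(xs, ys):
    """rho = Cov(X, Y) / (sigma_X * sigma_Y), treating the data as a population."""
    n = len(xs)
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    # Population covariance: mean product of deviations from the means.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # pstdev is the population standard deviation (divides by n).
    sigma_x = statistics.pstdev(xs)
    sigma_y = statistics.pstdev(ys)
    return cov_xy / (sigma_x * sigma_y)

rho = population_correlation([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
print(round(rho, 6))  # close to 1, up to floating-point rounding
```

Because the factor 1/n appears in both the numerator and the denominator, it cancels, which is why this definition and the raw-sums calculation agree on the same data.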
Assumptions of Karl Pearson Correlation Coefficient
The requirements and assumptions for calculating the Karl Pearson coefficient of correlation from given data are:
- Normality: the data should be approximately normally distributed. In normally distributed data, most values lie close to the mean.
- Linearity: the data should follow a linear relationship. This condition is satisfied if the data points fall roughly along a straight line on a scatter plot.
- Continuous variables: the data must consist of continuous variables, that is, variables that can take any value in an interval.
- Paired observations: the data must come in pairs. For each observation of the independent variable, there must be a corresponding observation of the dependent variable.
- No outliers: outliers distort the correlation coefficient and can make it inappropriate, so the data set should contain none. A data point that lies more than 3.29 standard deviations above or below the mean is considered an outlier.
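The outlier rule in the last item can be sketched as a simple check in Python. This is only an illustration: the function name `find_outliers` and the sample data are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say which standard deviation is meant.

```python
import statistics

def find_outliers(values, threshold=3.29):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    return [v for v in values if abs(v - mean) > threshold * sd]

# Hypothetical data: 19 values near 10 and one extreme value.
values = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
          11, 9, 10, 12, 10, 9, 11, 10, 10, 50]
print(find_outliers(values))  # [50]
```

In very small samples no point can lie that far from the mean, because an extreme value inflates the standard deviation itself, so a check like this is only meaningful with a reasonable number of observations.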
Karl Pearson’s Coefficient of Correlation Direct Method Formula
When Pearson’s coefficient of correlation is applied to a sample, it is represented by rxy and is termed the sample correlation coefficient or the sample Pearson correlation coefficient.
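For reference, the direct method is commonly written in the following raw-sums form for n paired observations. The symbols n, x_i, and y_i are notation assumed here for illustration and are not taken from the text above.

```latex
r_{xy} =
  \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
       {\sqrt{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}\;
        \sqrt{n \sum_{i=1}^{n} y_i^2 - \left( \sum_{i=1}^{n} y_i \right)^2}}
```

This form needs only the five sums ∑x, ∑y, ∑x², ∑y², and ∑xy, which is why the worked example tabulates the squares and products of the values.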
Conclusion
The correlation coefficient is a numerical measure of association that indicates whether, and how strongly, two variables are related. Karl Pearson’s correlation coefficient is also known as r, R, or Pearson’s r. It measures the strength and direction of the linear relationship between two variables and is defined as their covariance divided by the product of their standard deviations. To calculate it, the data should be approximately normally distributed, in which case most values lie close to the mean. When applied to a sample, Pearson’s coefficient of correlation is represented by rxy and is termed the sample correlation coefficient or the sample Pearson correlation coefficient.