Every measurement made in an experiment is meant to detect a change; without change, there is nothing for an experiment to measure. Likewise, the world is governed by the formulas of classical mechanics, which are built on changes in variables. Understanding how a variable changes in response to changes in another variable is therefore an important form of measurement.
The parameters used to measure such relative changes in one variable with respect to another are covariance and correlation. They measure how much a variable changes, and what relationship two variables share when one changes along with the other.
Covariance:
Covariance is a measure of the joint variability of two random variables. It gives an idea of the kind of relationship the two variables share. When the larger values of a random variable X match with the larger values of a random variable Y, and at the same time the smaller values of X match with the smaller values of Y, the covariance between them is said to be positive.
On the contrary, when the smaller values of the random variable X match with the larger values of the random variable Y, and the larger values of X match with the smaller values of Y, the two variables are said to have negative covariance.
In simple words, when two random variables have a covariance with a positive sign, then as the value of one random variable increases, the value of the second also increases. When the covariance between two variables has a negative sign, an increase in the value of one random variable leads to a decrease in the value of the second.
The formula for the covariance of two random variables, X and Y, each taking N values, is:
Cov(X, Y) = Σᵢ₌₁ᴺ (xᵢ − x̄)(yᵢ − ȳ) / N
where
xᵢ are the values taken by X
yᵢ are the values taken by Y
x̄ is the mean of X
ȳ is the mean of Y
N is the total number of values
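As a quick illustration, the formula above can be computed directly from two lists of values. The following is a minimal Python sketch with made-up sample data; it uses the population covariance (division by N), matching the formula:

```python
def covariance(xs, ys):
    """Population covariance: the mean of (x - x_bar)(y - y_bar)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n

# Larger X values pair with larger Y values, so the covariance is positive.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(covariance(x, y))  # → 4.0
```

Reversing one of the lists would pair large X values with small Y values, flipping the sign of the result to negative, as described above.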
Correlation:
Covariance tells us what kind of relationship two random variables share and gives an idea of the linear relationship between them. But the strength of that linear relationship is given by the correlation coefficient of the two random variables.
With the correlation coefficient, one can predict to what degree a random variable changes when correlated to another random variable. The correlation coefficient, much like the covariance of two random variables, bears a positive or a negative sign based on the linear relationship between the two random variables. A positive sign indicates that an increase in the value of one variable will translate to an increase in the value of the other correlated variable. A negative sign indicates that an increase in the value of one variable will translate to a decrease in the value of the other correlated variable.
The correlation coefficient is derived from the covariance between two random variables. While the covariance gives us an idea of the direction of the linear relationship between two random variables, the correlation coefficient quantifies its strength. However, it does not give the cause of this relationship or its effect outside of the two random variables under discussion. The formula for the correlation coefficient is:
ρ(X, Y) = Corr(X, Y) = Cov(X, Y) / (σX σY)

where σX and σY are the standard deviations of X and Y respectively.
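The formula can be sketched in Python by reusing the covariance computation: the standard deviation of a variable is the square root of its covariance with itself. This is a minimal illustration with made-up data:

```python
import math

def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Pearson correlation coefficient: Cov(X, Y) / (sigma_X * sigma_Y)."""
    sigma_x = math.sqrt(covariance(xs, xs))  # std dev = sqrt of variance
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

# A perfectly linear relationship gives a coefficient of (approximately) 1.
print(correlation([1, 2, 3, 4], [10, 20, 30, 40]))  # ≈ 1.0
```

Dividing by the product of the standard deviations rescales the covariance into the range −1 to 1, which is what makes the coefficient comparable across datasets with different units.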
Covariance and Correlation Importance:
Without covariance, it would be difficult to find out whether two variables vary together, and to understand what happens to the variables when they do. While knowing the covariance between two variables tells us that they change with variations in one another, the correlation coefficient is required to accurately judge the degree of this change.
The covariance of the dataset gives us an idea of what predictions can be expected when we make certain changes to the input values. It is very useful in fields like predictive modelling since it gives us a basic idea of what prediction we can expect if we already know the relationship between the input and output variables.
The correlation coefficient may not give the cause of the linear relationship between two random variables, but it is very important for judging the strength of this relationship. For example, suppose random variable X takes the input values of an experiment and random variable Y takes the predicted output values. The correlation coefficient then tells us how well a best-fit line describes the dataset of X and Y, and gives an idea of how far the actual data lie from the predicted values.
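This idea can be illustrated with a small sketch: two hypothetical output datasets (the numbers are made up) against the same inputs, one hugging a straight line and one scattered, yield very different correlation coefficients:

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    var_y = sum((y - y_bar) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

inputs = [1, 2, 3, 4, 5]
nearly_linear = [2.1, 3.9, 6.2, 7.8, 10.1]  # stays close to the line y = 2x
scattered = [5.0, 1.0, 9.0, 2.0, 7.0]       # no clear trend

print(correlation(inputs, nearly_linear))  # close to 1: strong linear fit
print(correlation(inputs, scattered))      # much closer to 0: weak fit
```

The first coefficient is close to 1, indicating that a straight line fits the data well; the second is much closer to 0, indicating that a line would describe the data poorly.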
Conclusion:
Understanding how a variable changes in response to changes in another variable is an important form of measurement. The parameters used to measure such relative changes are covariance and correlation.
Covariance is a measure of the joint variability of two random variables and gives an idea of the kind of relationship the two variables share. When the larger values of a random variable X match with the larger values of a random variable Y, and at the same time the smaller values of X match with the smaller values of Y, the covariance between them is said to be positive, and vice versa.
With the correlation coefficient, one can predict to what degree a random variable changes when it is correlated to another random variable. The correlation coefficient, much like the covariance of two random variables, bears a positive or a negative sign based on the linear relationship between the two random variables.