Introduction
The variability in a sample or population is described using a measure of spread, also known as a measure of dispersion. To give an overall description of a data set, it is usually reported together with a measure of central tendency such as the mean or median.
Importance Of Measuring the Spread Of Data
A measure of spread matters largely because of its relationship with measures of central tendency: it indicates how well a statistic, such as the mean, represents the data. When the spread of values in the data set is large, the mean is less representative of the data than when the spread is small. This is because a large spread suggests that individual scores are likely to differ significantly from one another. Furthermore, in research, small variation within each data group is frequently considered a favorable sign, since it indicates that the members of the group are similar.
Range:
The range is one of the fundamental measures in statistics. It is defined as the difference between the highest and lowest values in a given data set. The data may be provided in either ungrouped or grouped form.
(i) If the data is ungrouped:
In this case, the general formula for the range is used:
Range = highest value – lowest value
For example, given the values 2, 5, 6, 7, 2, 49, 10, 4,
the range can be calculated as 49 – 2 = 47.
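The ungrouped calculation can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method; the function name data_range is a hypothetical helper chosen for this example.

```python
def data_range(data):
    """Return the difference between the largest and smallest value."""
    return max(data) - min(data)

# Values from the ungrouped example
values = [2, 5, 6, 7, 2, 49, 10, 4]
print(data_range(values))  # 47
```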
(ii) If the data is grouped:
The range is the most straightforward measure of dispersion. When the data is given in grouped form, the values are divided into class intervals, and the range is calculated from the class boundaries:
Range = Upper class boundary of the highest interval – Lower class boundary of the lowest interval
For example,
Class Interval | 10 – 20 | 20 – 30 | 30 – 40 | 40 –50 |
Frequency | 2 | 3 | 14 | 8 |
Range = Upper class boundary of the highest interval (U) – Lower class boundary of the lowest interval (L)
Here, U = 50
L= 10
Therefore, range = 50 – 10 = 40
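The grouped calculation can be sketched the same way. The snippet below assumes each class interval is stored as a (lower boundary, upper boundary) pair; that representation and the name grouped_range are assumptions made for illustration. The frequencies play no part in the range, so they are left out.

```python
def grouped_range(intervals):
    """Range of grouped data: highest upper boundary minus lowest lower boundary."""
    lowest = min(lower for lower, _ in intervals)
    highest = max(upper for _, upper in intervals)
    return highest - lowest

# Class intervals from the table above
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
print(grouped_range(classes))  # 40
```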
Variance
If we are provided with a set of values, the variance measures how far the data points differ from the mean. The higher the variance, the more the values are scattered around the mean. On the other hand, if the variance is low, the data is less scattered around the mean. Hence, it measures the spread of the data about the mean.
Var(X) = E[(X – μ)²]
Where,
X = Random variable
µ = E(X)
Also, variance is the square of standard deviation,
Variance = (Standard deviation)² = σ²
For example, suppose the distances are 620, 450, 170, 420, 310.
The variance is computed from the mean, so the mean is found first.
Mean = (620 + 450 + 170 + 420 + 310)/5 = 1970/5 = 394
Next, subtract the mean from each value to get the deviations 226, 56, –224, 26, and –84. Then square each deviation and take the average of the squared deviations.
Hence, the variance in this case is
= (226² + 56² + (–224)² + 26² + (–84)²)/5
= (51076 + 3136 + 50176 + 676 + 7056)/5
= 112120/5
Variance = 22424
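The same arithmetic can be checked with a short Python sketch. It divides by the number of values n, as the worked example does (the population variance); a sample variance would divide by n – 1 instead. The function name population_variance is a hypothetical helper.

```python
def population_variance(data):
    """Average of the squared deviations from the mean (divides by n)."""
    mean = sum(data) / len(data)
    squared_deviations = [(x - mean) ** 2 for x in data]
    return sum(squared_deviations) / len(data)

distances = [620, 450, 170, 420, 310]
print(population_variance(distances))  # 22424.0
```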
Quartile deviation
Another essential concept in statistics is that of quartiles. By definition, quartiles are the values that divide an ordered list of numerical data into four equal parts. There are three of them: Q1 is the lower quartile, Q2 is the median, and Q3 is the upper quartile. The quartile deviation, written QD, is half the difference between the upper and lower quartiles. QD is also known as the semi-interquartile range.
QD = (Q3 – Q1)/2
Q1 = [(n+1)/4]th item
Q2 = [(n+1)/2]th item
Q3 = [3(n+1)/4]th item
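For example, suppose we are given the following set of data and need to find Q1, Q2, Q3, and QD. Here each quartile is found as a median, which can differ slightly from the positional formulas above when the position is not a whole number: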
17, 2, 7, 27, 15, 5, 14, 8, 10, 24, 48, 10, 8, 7, 18, 28
First, the data is arranged in ascending order:
2, 5, 7, 7, 8, 8, 10, 10, 14, 15, 17, 18, 24, 27, 28, 48
Total number of given values = 16
Q2 is defined as the median of the provided set of values
As n is even,
median = (1/2) [(n/2)th value + (n/2 + 1)th value]
= (1/2)[8th value+ 9th value]
= (10 + 14)/2 = 12 = Q2
The lower half of the given values is:
2, 5, 7, 7, 8, 8, 10, 10 (n= even)
Q1 is defined as the median of the lower half of the given values
= (1/2)[4th value+ 5th value]
= (7 + 8)/2
= 7.5
The upper half of the given values is:
14, 15, 17, 18, 24, 27, 28, 48 (n= even)
Q3 is defined as the median of the upper half of the given values
= (1/2)[4th value+ 5th value]
= (18 + 24)/2
= 21
QD = (Q3 – Q1)/2
= (21 – 7.5)/2
= 6.75
Hence, QD=6.75
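The worked example can be reproduced with the sketch below. It follows the median-of-halves approach used above; for an odd number of values it assumes the middle value is excluded from both halves, which is one common convention and not the only one. Other quartile conventions (including the positional formulas and some library functions) may give slightly different values. The function names are hypothetical.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 0:
        return (ordered[mid - 1] + ordered[mid]) / 2
    return ordered[mid]

def quartile_deviation(data):
    """Return (Q1, Q2, Q3, QD) using medians of the lower and upper halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Assumption: for odd n, the middle value belongs to neither half
    upper_half = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    q1 = median(lower_half)
    q2 = median(ordered)
    q3 = median(upper_half)
    return q1, q2, q3, (q3 - q1) / 2

data = [17, 2, 7, 27, 15, 5, 14, 8, 10, 24, 48, 10, 8, 7, 18, 28]
print(quartile_deviation(data))  # (7.5, 12.0, 21.0, 6.75)
```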
Mean Deviation
Mean deviation is another concept in statistics, used to determine the average absolute deviation of the values from their mean. To calculate the mean deviation of the data values, follow these steps:
- First, determine the mean of the given data values.
- Next, subtract the mean from each data value and ignore the minus sign, that is, take the absolute value of each difference.
- Finally, calculate the mean of the absolute differences obtained in the previous step.
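The three steps above can be sketched in Python as follows. As an illustration, the snippet reuses the distances from the variance example; the function name mean_deviation is a hypothetical helper.

```python
def mean_deviation(data):
    """Mean of the absolute deviations from the mean."""
    mean = sum(data) / len(data)                          # step 1: mean
    absolute_deviations = [abs(x - mean) for x in data]   # step 2: drop the sign
    return sum(absolute_deviations) / len(data)           # step 3: average

distances = [620, 450, 170, 420, 310]
print(mean_deviation(distances))  # 123.2
```

Here the absolute deviations are 226, 56, 224, 26, and 84, which sum to 616, so the mean deviation is 616/5 = 123.2.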
Standard Deviation
The standard deviation is used in conjunction with the mean to summarize continuous data, not categorical data. Furthermore, like the mean, the standard deviation is typically only appropriate when the continuous data is not severely skewed and does not have outliers.
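Since the variance is the square of the standard deviation, the standard deviation can be obtained as the square root of the variance. The sketch below assumes the population form (dividing by n) and reuses the distances from the variance example; the function name population_std_dev is a hypothetical helper.

```python
import math

def population_std_dev(data):
    """Square root of the population variance (divides by n)."""
    mean = sum(data) / len(data)
    variance = sum((x - mean) ** 2 for x in data) / len(data)
    return math.sqrt(variance)

distances = [620, 450, 170, 420, 310]
print(round(population_std_dev(distances), 2))  # 149.75
```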
Conclusion:
Range, Quartile Deviation, Mean Deviation, and Standard Deviation are absolute measures of dispersion that describe the spread of values. These measures express dispersion in the original units of the series; hence they cannot be used to compare data sets measured in different units.