Introduction
A data set is useful only when it is clear, understandable and precise. Measures of central tendency and dispersion are vital tools for describing data concisely. A measure of central tendency is a single value that can represent the whole sample, while a measure of dispersion describes how widely the observations are spread around that value.
The main aim of these measures is to condense a data set into a few values from which sound conclusions can be drawn. A person can reason from one representative value instead of examining the whole sample. The following section covers six measures of central tendency and dispersion.
6 Vital Measures of Central Tendency and Dispersion
The following central tendency and dispersion tools are the most commonly used ones.
Mean
The mean provides an estimate of the typical value in a sample. It is commonly known as the average of a data set. Calculating it is simple: sum all the data values and divide the total by the number of items.
Formula
Mean = (Sum of all the terms) / (Number of observations or items)
Example
The mean is used when, for example, a single figure is needed to represent the weight of the girls in a class. Suppose five girls in the class have the following weights: 50, 42, 48, 63, and 47. By the above formula, the average weight is 50 (250/5).
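As a minimal sketch, the same calculation can be written in Python. The variable names here are illustrative and not part of the original example; the standard-library `statistics.mean` function is used only as a cross-check.

```python
from statistics import mean

# Weights of the five girls from the example above
weights = [50, 42, 48, 63, 47]

# Sum of all the terms divided by the number of observations
manual_mean = sum(weights) / len(weights)

print(manual_mean)                   # 50.0
print(manual_mean == mean(weights))  # True
```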
Median
Median is one of the easiest measures of central tendency to compute. It is the middlemost value in a data series, found by first arranging the data set in either ascending or descending order. A common error is to calculate the median without arranging the series properly; even a single misplaced value can change the result.
Formula
Median = ((Number of observations + 1) / 2)th observation (in case the number of observations is odd)
In case the number of observations, n, is even, the median is taken as the average of the (n/2)th and (n/2 + 1)th observations.
Example
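Median is widely used where data sets contain extreme values, because such values do not pull it the way they pull the mean. For example, in a data set of 5, 8, 10, 20, 40, 45, 70, 100, there are eight observations, so the median is the average of the 4th and 5th values, 20 and 40, which is 30 (60/2).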
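The positional rule can be sketched in Python as follows. The helper name `median_by_position` is hypothetical; it sorts the data first, takes the middle observation when the count is odd, and averages the two middle observations when the count is even. The standard-library `statistics.median` is included as a cross-check.

```python
from statistics import median

def median_by_position(values):
    """Return the median by sorting and picking the middle position(s)."""
    ordered = sorted(values)  # the data must be in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle observation
        return ordered[mid]
    # Even count: average of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [5, 8, 10, 20, 40, 45, 70, 100]
print(median_by_position(data))  # 30.0
print(median(data))              # 30.0
```

Because the function sorts its input, passing the series in any order gives the same result, which avoids the ordering mistake described above.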
Mode
Mode is the most frequently occurring value in a data series. Unlike the mean, it is always a value actually present in the data set. A data set may have one mode, several modes, or none: a sample has no mode when all its observations occur only once or the same number of times.
Formula
There is no arithmetic formula for the mode. Arrange the data in ascending or descending order, count the number of times each item occurs, and take the item with the highest count.
Example
The mode is useful when, for example, a person has to find the course that students prefer most. In a series of 2, 3, 6, 6, 6, 8, 9, 10, 10, 10, 10, 20, the mode is 10, since it occurs four times, more often than any other value.
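The counting procedure might be sketched in Python like this. The function name `modes` is hypothetical, and it assumes the convention that a series has no mode when every distinct value occurs the same number of times; other conventions exist.

```python
from collections import Counter

def modes(values):
    """Return the list of modes, or an empty list if there is no mode."""
    if not values:
        return []
    counts = Counter(values)  # how many times each item occurs
    highest = max(counts.values())
    # Assumed convention: if several distinct values all occur equally
    # often, the series is treated as having no mode.
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)

series = [2, 3, 6, 6, 6, 8, 9, 10, 10, 10, 10, 20]
print(modes(series))        # [10]
print(modes([1, 2, 3, 4]))  # []
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```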
Range
Range is defined as the difference between the highest and the lowest value of a sample. It is the quickest measure of dispersion to compute.
Formula
Range = Highest value – lowest value
Example
Data series: 10, 60, 20, 80, 170, 90, 30. The range of the given series is 160 (170 - 10). It shows the total spread between the smallest and largest observations.
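In Python, the range of this series can be sketched with the built-in `max` and `min` functions (the variable names are illustrative):

```python
series = [10, 60, 20, 80, 170, 90, 30]

# Highest value minus lowest value
data_range = max(series) - min(series)

print(data_range)  # 160
```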
Upper & Lower Quartiles
Quartiles divide an ordered data set into four equal parts. The lower quartile (Q1) is the median of the lower half of the series, while the upper quartile (Q3) is the median of the upper half.
Formula
The formula is the same as for the median. The only difference is that a person first sorts the data, splits it into a lower and an upper half, and then calculates the median of each half separately.
Example
Data series: 2, 4, 5, 5, 7, 9, 11, 15. In this example, the lower half is 2, 4, 5, 5, while the remaining observations (7, 9, 11, 15) constitute the upper half. Q1 (the median of the lower half) is 4.5, while Q3 (the median of the upper half) is 10.
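The median-of-halves method can be sketched in Python as below. The function name `quartiles_by_halves` is hypothetical. For a series with an odd number of observations, this sketch assumes the middle value is left out of both halves; that is one common convention, and other quartile methods (including those in statistical libraries) may give slightly different values.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the middle value is excluded
    # from both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(upper)

data = [2, 4, 5, 5, 7, 9, 11, 15]
q1, q3 = quartiles_by_halves(data)
print(q1, q3)  # 4.5 10.0
```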
Variance and Standard Deviation
These dispersion measures describe how far the observations in a data set lie from its mean. Variance is the square of the standard deviation; equivalently, standard deviation is the square root of variance.
Formula
Standard Deviation = √( Σ (xi – m)^2 / n )
Where xi means the ith item value, n refers to the number of observations, m means the average or mean, and Σ denotes the sum over all items.
Example
Data series: 3, 5, 7, 13. Mean = 7. The standard deviation will be calculated as follows.
Item (xi) | xi – m | (xi – m)^2 |
3 | -4 | 16 |
5 | -2 | 4 |
7 | 0 | 0 |
13 | 6 | 36 |
Sum | | 56 |
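Variance = 56/4 = 14
Standard deviation = √14 ≈ 3.74
If the sum is instead divided by n – 1 = 3, as in the sample formula, the variance is 18.667 and the standard deviation is √18.667 ≈ 4.32.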
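The calculation for the series 3, 5, 7, 13 can be sketched in Python as follows. The sketch computes both the divide-by-n version and the divide-by-(n – 1) version, since the two give different figures for the same data. The variable names are illustrative; the standard-library functions `pvariance`, `pstdev`, `variance` and `stdev` are used only as cross-checks.

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

data = [3, 5, 7, 13]
n = len(data)
m = sum(data) / n  # mean = 7.0

# Sum of squared deviations from the mean: 16 + 4 + 0 + 36
total = sum((x - m) ** 2 for x in data)  # 56.0

# Dividing by n
variance_n = total / n     # 14.0
sd_n = sqrt(variance_n)    # about 3.74

# Dividing by n - 1 (sample formula)
variance_n1 = total / (n - 1)  # about 18.667
sd_n1 = sqrt(variance_n1)      # about 4.32

print(round(variance_n, 3), round(sd_n, 2))    # 14.0 3.74
print(round(variance_n1, 3), round(sd_n1, 2))  # 18.667 4.32

# Cross-check against the standard library
print(abs(pvariance(data) - variance_n) < 1e-9)  # True
print(abs(variance(data) - variance_n1) < 1e-9)  # True
```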
Conclusion
These statistical measures are integral to analyzing and interpreting a data set. From mean to standard deviation, every measure can give valuable insights to the audience.
The most crucial benefit of using these measures is that a person can find the most common value, centre value, spread of data, etc., using simple formulae. Solving everyday problems does not require advanced knowledge; understanding a few measures of central tendency and dispersion will work most of the time.