In probability theory, the central limit theorem (CLT) states that the distribution of the sample mean approaches a normal distribution (the bell curve) as the sample size grows, provided all samples are of equal size, regardless of the shape of the population distribution.
Put another way, given a sufficiently large sample drawn from a population with finite variance, the mean of the sampled variables will be approximately equal to the mean of the population as a whole.
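For readers who prefer notation, one standard way of writing the classical (Lindeberg–Lévy) form of the theorem is sketched below, assuming independent, identically distributed variables X₁, …, Xₙ with mean μ and finite variance σ²:

```latex
% Classical (Lindeberg-Levy) central limit theorem:
% X_1, ..., X_n are i.i.d. with mean \mu and finite variance \sigma^2.
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{\;d\;} \mathcal{N}(0,\,1)
\quad \text{as } n \to \infty.
```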
History of the central limit theorem
The earliest form of the central limit theorem was introduced by Abraham de Moivre, a French mathematician. In an article published in 1733, de Moivre used the normal distribution to approximate the number of heads resulting from many tosses of a coin. The idea received little attention at the time and was largely forgotten. In 1812, however, it was revived by Pierre-Simon Laplace, another French mathematician, in his work ‘Théorie Analytique des Probabilités’, in which he approximated the binomial distribution with the normal distribution. Laplace found that the average of independent random variables, as their number grows, tends to follow a normal distribution.
Central limit theorem’s properties
Let’s look more closely at the normality property of the central limit theorem. A normal distribution is described by two parameters: the mean and the standard deviation. As the sample size increases, the sampling distribution of the mean converges on a normal distribution whose mean equals the population mean and whose standard deviation equals σ/√n.
Here,
σ = the population standard deviation
n = the size of the sample
The standard deviation of the sampling distribution shrinks as the sample size increases, because the square root of the sample size appears in the denominator. In other words, the sampling distribution clusters more tightly around the population mean as the sample size grows.
Bringing all this together: as the sample size increases, the sampling distribution approaches a normal distribution more closely, and at the same time its spread becomes narrower. These properties have important implications in statistics.
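As a rough illustration, the sketch below (a NumPy simulation with an arbitrarily chosen skewed population; the sample sizes and number of repetitions are made up for the example) estimates the standard deviation of the sample mean empirically and compares it with σ/√n:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 1,000,000 values from a skewed (exponential) distribution.
population = rng.exponential(scale=2.0, size=1_000_000)
sigma = population.std()

for n in (10, 50, 200):
    # Draw 5,000 samples of size n and record each sample's mean.
    means = rng.choice(population, size=(5_000, n)).mean(axis=1)
    # The simulated spread of the sample means should be close to sigma / sqrt(n).
    print(f"n = {n:3d}: simulated SD of sample mean = {np.std(means):.4f}, "
          f"sigma/sqrt(n) = {sigma / np.sqrt(n):.4f}")
```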
How does the central limit theorem work?
This theorem underpins the sampling distribution of the mean. It makes it easier to understand how estimates of population parameters behave under repeated sampling. Plotted on a graph, the theorem describes the shape of the distribution formed by the means of repeated samples of the population.
As the sample size increases, the distribution of the means of these repeated samples tends toward a normal, bell-shaped distribution, and this holds regardless of the shape of the original population distribution.
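As a minimal sketch of this idea (assuming a deliberately skewed exponential population, chosen purely for illustration), the simulation below checks that the skewness of the sample means shrinks toward zero, the skewness of a normal distribution, as the sample size grows:

```python
import numpy as np

rng = np.random.default_rng(1)

# A clearly non-normal, right-skewed population.
population = rng.exponential(scale=3.0, size=500_000)

for n in (5, 30, 200):
    # Means of 10,000 repeated samples of size n.
    sample_means = rng.choice(population, size=(10_000, n)).mean(axis=1)
    # Sample skewness of the distribution of sample means; it should move toward 0 as n grows.
    centred = sample_means - sample_means.mean()
    skewness = (centred ** 3).mean() / centred.std() ** 3
    print(f"n = {n:3d}: skewness of the sample means = {skewness:.3f}")
```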
Central limit theorem examples
Suppose an investor wants to estimate the return of the XYZ stock market index, which contains 100,000 stocks. Because of the large size of the index, the investor cannot examine every stock individually and instead uses random sampling to approximate the overall return of the index. The investor draws random samples of stocks, each sample containing 30 stocks. The samples must be random, and any previously selected stocks should be replaced before subsequent samples are drawn, to avoid bias.
If the first sample produces an average return of 7.5 percent, the next sample might produce an average return of 7.8 percent. Because of the nature of random sampling, each sample will yield a different result. As more samples are drawn, the sample means begin to form a distribution of their own.
Furthermore, the distribution of these sample means approaches a normal distribution as the number of samples increases. The average return of the stocks in the sampled sets approximates the average return of the entire index of 100,000 stocks, and that average return is approximately normally distributed.
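A rough simulation along the lines of this example is sketched below. The index returns are made-up, right-skewed numbers standing in for the 100,000 constituents, not real market data, and the sample size of 30 follows the scenario above:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical, right-skewed returns (in percent) for 100,000 index constituents.
index_returns = rng.gamma(shape=2.0, scale=4.0, size=100_000)
true_index_return = index_returns.mean()

# Repeatedly draw random samples of 30 stocks and record each sample's mean return.
sample_means = rng.choice(index_returns, size=(2_000, 30)).mean(axis=1)

print(f"True index return:        {true_index_return:.2f}%")
print(f"Mean of the sample means: {sample_means.mean():.2f}%")
print(f"SD of the sample means:   {sample_means.std():.2f}%  "
      f"(theory: {index_returns.std() / np.sqrt(30):.2f}%)")
```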
Conclusion
The central limit theorem states that the mean of a data sample gets closer to the mean of the overall population in question as the sample size rises, regardless of the actual distribution of the data. In other words, the result holds whether the underlying distribution is skewed or normal. Practising questions based on the central limit theorem statement as much as possible is the best way to develop a solid understanding of it.