Central Limit Theorem: Proof by Simulation

CARivero

February 2, 2019

Central Limit Theorem

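The Central Limit Theorem states that, given samples of size \(n\) of independent and identically distributed random variables drawn from a population with mean \(\mu\) and finite standard deviation \(\sigma\), and given that \(n\) is sufficiently large, the sampling distribution of the sample mean will be approximately normal, with mean equal to the population mean \(\mu\) and standard deviation (the standard error of the mean, SEM) equal to \(\sigma/\sqrt{n}\).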

Interactive Proof!
The Shiny application lets the user demonstrate this theorem by simulating exponentials, with a user-provided rate \(\lambda\) and sample size \(n\). It returns a histogram showing the shape of the distribution of sample means, along with the theoretical and simulated values of the mean and standard error.
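The histogram described above can be approximated outside Shiny. The following is a minimal sketch, not the app's actual code: it uses the default inputs and the 1000 simulated samples from the background calculations later in this document, while the seed, the number of breaks, and the variable names `n_sims` and `sample_means` are assumptions made for illustration.

```r
# Sketch: histogram of simulated sample means with the CLT normal curve.
lambda <- 0.2   # rate of the exponential (the app's default)
n <- 40         # size of each sample (the app's default)
n_sims <- 1000  # number of simulated samples

set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
sample_means <- replicate(n_sims, mean(rexp(n, rate = lambda)))

# Density-scaled histogram so that a density curve can be overlaid
hist(sample_means, breaks = 30, freq = FALSE,
     main = "Distribution of sample means",
     xlab = "Sample mean")

# Normal density predicted by the CLT:
# mean 1/lambda, standard deviation (1/lambda)/sqrt(n)
curve(dnorm(x, mean = 1 / lambda, sd = (1 / lambda) / sqrt(n)),
      add = TRUE, lwd = 2)
```

If the theorem holds for these inputs, the bars should roughly follow the overlaid curve, centered near 5.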

Shiny App Interface

Given the default input values \(\lambda = 0.2\) and \(n = 40\), the Shiny app will return the following output, among others:

Statistics          Mean    SEM
Theoretical (CLT)   5       0.791
Simulated           5.011   0.815
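
As a worked check of the theoretical row, write \(\mu\) and \(\sigma\) for the population mean and standard deviation. An exponential distribution with rate \(\lambda\) has both equal to \(1/\lambda\), so with the default inputs:

\[
\mu = \frac{1}{\lambda} = \frac{1}{0.2} = 5, \qquad \mathrm{SEM} = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.791
\]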

Shiny App Input and Output

Background Calculations

## Define INPUT variables using defaults
lambda <- 0.2  # rate of the exponential distribution
n <- 40        # size of each sample

## Calculate OUTPUT variables (theoretical statistics) in sidebar
## An exponential with rate lambda has mean and standard deviation 1/lambda
Theoretical <- list()
Theoretical$pop.mean <- round(1 / lambda, 3)
Theoretical$pop.sd <- round(1 / lambda, 3)
Theoretical$CLT.mean <- Theoretical$pop.mean
Theoretical$CLT.SEM <- round(Theoretical$pop.sd / sqrt(n), 3)

## Calculate OUTPUT variables (simulated statistics) in tabs
set.seed(3049)
## 1000 samples of size n, one sample per row
## (named `samples` to avoid masking base::sample)
samples <- matrix(rexp(1000 * n, lambda), nrow = 1000, ncol = n)
means <- apply(samples, MARGIN = 1, FUN = mean)
Simulated <- list()
Simulated$samp.mean <- round(mean(samples), 3)
Simulated$samp.sd <- round(sd(samples), 3)
Simulated$sampmean.mean <- round(mean(means), 3)
Simulated$sampmean.SEM <- round(sd(means), 3)

unlist(Theoretical)
unlist(Simulated)
pop.mean   pop.sd CLT.mean  CLT.SEM 
   5.000    5.000    5.000    0.791 
    samp.mean       samp.sd sampmean.mean  sampmean.SEM 
        5.011         5.017         5.011         0.815
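
The theoretical SEM shrinks as \(1/\sqrt{n}\), which can be checked by repeating the simulation for several sample sizes. This is a rough sketch under stated assumptions: the sizes in `sizes`, the helper names `n_sims` and `sem_table`, and the use of `rowMeans()` in place of `apply()` are illustrative choices, not part of the app.

```r
# Sketch: compare simulated and theoretical SEM for several sample sizes.
lambda <- 0.2
n_sims <- 1000           # number of simulated samples per sample size
sizes <- c(5, 40, 200)   # arbitrary sample sizes chosen for illustration

set.seed(3049)
sem_table <- t(sapply(sizes, function(n) {
  # One simulated sample per row, n exponential draws per sample
  draws <- matrix(rexp(n_sims * n, rate = lambda), nrow = n_sims, ncol = n)
  c(n = n,
    theoretical = (1 / lambda) / sqrt(n),   # sigma / sqrt(n)
    simulated = sd(rowMeans(draws)))        # sd of the simulated sample means
}))

print(round(sem_table, 3))
```

Each row of `sem_table` holds one sample size, so the two SEM columns can be read side by side; the simulated values should track the theoretical ones more closely as the number of simulated samples grows.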