Central Limit Theorem

The Central Limit Theorem states that the sample mean is approximately normally distributed when the sample size is large, regardless of the distribution from which the sample is taken (provided that distribution has a finite variance).

Assume you are sampling from a population with mean \mu and standard deviation \sigma, and let \overline{X} be the sample mean of n independently drawn observations. Then:

The mean of the sampling distribution of the sample mean is equal to the population mean.

 \mu_{\overline{x}} = \mu

The standard deviation of the sampling distribution of \overline{X} is equal to:

 \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}}
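To see how this formula behaves, here is a minimal SAS sketch that tabulates the standard deviation of the sample mean for a few sample sizes. The value SIGMA = 10 and the data set name STD_ERROR are hypothetical, chosen only for illustration.

```sas
/* Sketch: standard deviation of the sample mean for several sample sizes. */
/* SIGMA = 10 is a hypothetical population standard deviation.             */
DATA STD_ERROR;
 SIGMA = 10;
 DO N = 1, 4, 25, 100, 400;
  SE = SIGMA / SQRT(N);   /* sigma / sqrt(n) */
  OUTPUT;
 END;
RUN;

PROC PRINT DATA=STD_ERROR NOOBS;
RUN;
```

Quadrupling the sample size halves the standard deviation of the sample mean, which is why larger samples give more precise estimates of \mu.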

Now, if the population is normally distributed, then \overline{X} is also normally distributed.

What if the population is not normally distributed?

The central limit theorem addresses this question.

The distribution of the sample mean tends toward a normal distribution as the sample size increases, regardless of the distribution from which we are sampling.
[Figure: Central Limit Theorem (Original Population Distribution)]
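The convergence toward a normal distribution can also be explored by simulation. The following is a rough sketch, not the method used in the linked example: it draws repeated samples from a skewed (standard exponential) population, which has mean 1 and standard deviation 1, and summarizes the resulting sample means. The seed, the sample size of 50, the 5000 repetitions, and the data set name SAMPLE_MEANS are all arbitrary assumptions.

```sas
/* Sketch: sample means drawn from a skewed (exponential) population. */
DATA SAMPLE_MEANS;
 CALL STREAMINIT(12345);             /* arbitrary seed for reproducibility */
 N = 50;                             /* observations per sample */
 DO SAMPLE = 1 TO 5000;              /* number of repeated samples */
  TOTAL = 0;
  DO I = 1 TO N;
   TOTAL = TOTAL + RAND('EXPONENTIAL');   /* population mean 1, sd 1 */
  END;
  XBAR = TOTAL / N;                  /* mean of this sample */
  OUTPUT;
 END;
 KEEP SAMPLE XBAR;
RUN;

/* The mean should be near 1 and the standard deviation near 1/sqrt(50) */
PROC MEANS DATA=SAMPLE_MEANS MEAN STD;
 VAR XBAR;
RUN;

/* Histogram of the sample means with a fitted normal curve */
PROC UNIVARIATE DATA=SAMPLE_MEANS NOPRINT;
 HISTOGRAM XBAR / NORMAL;
RUN;
```

Rerunning with a smaller N (for example 2 or 5) should show a visibly skewed histogram, while larger values of N should look progressively closer to the fitted normal curve.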

You may refer to an example of the Central Limit Theorem in the link below.

http://sasnrd.com/sas-central-limit-theorem/

Properties of Central Limit Theorem

  • The variable \frac{\overline{X} - \mu}{\sigma/\sqrt{n}} approximately follows a standard normal distribution, with mean = 0 and standard deviation = 1.
  • For a large sample (a common rule of thumb is n > 30), the sampling distribution of the sample mean approximately follows a normal distribution with mean equal to the population mean and standard deviation \frac{\sigma}{\sqrt{n}}.
  • The mean of the sample means is equal to the mean of the population: if you average the means of all possible samples of a given size, you get the population mean. Note that this holds for any sample size; only the spread of the sample means depends on n.
  • If the population is normally distributed, then the sample mean is normally distributed irrespective of the sample size.
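
The first property can be written a little more formally. This is a sketch in LaTeX notation, assuming independent, identically distributed observations with finite variance; the subscript n on \overline{X} and the symbol \Phi (the standard normal cumulative distribution function) are notation introduced here, not taken from the text above.

```latex
Z_n = \frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}},
\qquad
\lim_{n \to \infty} P(Z_n \le z) = \Phi(z) \quad \text{for every real } z.
```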

Central Limit Theorem for proportions

Example:

It is believed that college students spend an average of 65.5 minutes daily texting on their cell phones, with a standard deviation of 145 minutes. Data on the time spent texting were collected from a sample of 100 students.

Calculate the probability that the average time spent by this sample of students will exceed 90 minutes.

Solution:

Using the Central Limit Theorem, the mean of the sampling distribution is 65.5 and the corresponding standard deviation (the standard error) is calculated by the formula

 \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}}

Standard deviation of the sample mean = 145/\sqrt{100} = 14.5

Before calculating it, we can see that the Z score will lie between 1 and 2, because 90 falls between one standard deviation above the mean, (65.5 + 14.5) = 80, and two standard deviations above the mean, (65.5 + 2*(14.5)) = 94.5. (See the graph below)

Now, I will calculate the Z score.

The Z score is calculated using

 z = \frac{\overline{x} - \mu}{\sigma_{\overline{x}}}

 z = \frac{90 - 65.5}{14.5} = 1.69

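From the empirical rule (68%-95%-99.7%), the area above Z = 1 is about 16% (13.5% + 2.35% + 0.15%) and the area above Z = 2 is about 2.5% (2.35% + 0.15%), so we can expect the probability to be somewhere between 2.5% and 16%. (See the graph below)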
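
As a rough check on this kind of bracketing, the exact upper-tail areas at Z = 1 and Z = 2 can be computed with the SAS PROBNORM function. This is only a sketch; the data set and variable names are illustrative.

```sas
/* Sketch: exact upper-tail areas that bracket a Z score between 1 and 2. */
DATA TAIL_BOUNDS;
 TAIL_AT_1 = 1 - PROBNORM(1);   /* area above Z = 1, about 0.1587 */
 TAIL_AT_2 = 1 - PROBNORM(2);   /* area above Z = 2, about 0.0228 */
RUN;

PROC PRINT DATA=TAIL_BOUNDS NOOBS;
RUN;
```

The empirical rule rounds these values, so the bounds it gives are approximate, but the probability for any Z score between 1 and 2 must fall between the two exact areas.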

We can find the probability, or area under the curve, using a Z table, which lists the probability for Z values ranging from -3.49 to +3.49.

[Figure: Z table]

To find the probability, look up the row for 1.6 in the left column of the Z table and the column for 0.09 in the top row; the table gives 0.95449. Since this is the area to the left of the positive Z score, we have to subtract it from 1 to get the area to the right. So, the probability will be (1 - 0.95449) = 0.04551.

[Figure: Z table]

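You can also look up the Z score of -1.69 and read the probability directly; because the normal distribution is symmetric, the area to the left of -1.69 is exactly the same as the area to the right of +1.69.

So, the probability that the sample mean exceeds 90 minutes is about 4.55%.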

[Figure: Normal Distribution]

Calculating Probability in SAS

To calculate the z score and the probability, you can use the PROBNORM function in SAS as below.

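DATA NORMAL;
 MU=65.5;                       /* population mean (minutes) */
 SIGMA=14.5;                    /* standard deviation of the sample mean: 145/sqrt(100) */
 Y=90;                          /* threshold for the sample mean */
 Z=(Y-MU)/SIGMA;                /* z score */
 PROBABILITY=1- PROBNORM(Z);    /* area to the right of Z */
RUN;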
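
The DATA step stores its result in a data set without displaying it. Here is a minimal sketch of one way to view the values and cross-check them, assuming the same inputs as the example: it repeats the PROBNORM calculation, compares it with the CDF function for the normal distribution, and uses the symmetry of the normal curve. The data set name NORMAL_CHECK and the variable names are illustrative.

```sas
/* Sketch: three equivalent ways to get the upper-tail probability. */
DATA NORMAL_CHECK;
 MU = 65.5;
 SIGMA = 145 / SQRT(100);                    /* 14.5 */
 Y = 90;
 Z = (Y - MU) / SIGMA;
 P_PROBNORM = 1 - PROBNORM(Z);               /* standard normal, via the z score */
 P_CDF      = 1 - CDF('NORMAL', Y, MU, SIGMA);  /* normal CDF with mean and sd */
 P_SYMMETRY = PROBNORM(-Z);                  /* area to the left of -z */
RUN;

PROC PRINT DATA=NORMAL_CHECK NOOBS;
RUN;
```

All three probabilities should agree at about 0.0455. The result can differ slightly from the Z table answer in the later decimal places, because the table lookup rounds the z score to 1.69 while SAS uses the unrounded value.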

 


Subhro Kar is an Analyst with over five years of experience. As a programmer specializing in SAS (Statistical Analysis System), Subhro also offers tutorials and guides on how to approach the coding language. His website, 9to5sas, offers students and new programmers useful easy-to-grasp resources to help them understand the fundamentals of SAS. Through this website, he shares his passion for programming while giving back to up-and-coming programmers in the field. Subhro’s mission is to offer quality tips, tricks, and lessons that give SAS beginners the skills they need to succeed.
