ENGR3303 Engineering Probability and Statistics Project 3: Central Limit Theorem
In Project 1, we learned how to build a population of virtually “manufactured” objects in Excel. We will use this
information to build a population and draw multiple samples from it. Your results will be copied into a Word
document, and you will submit both the Word file and the Excel file into a dropbox on D2L. Exercise 1 – Build a Normal Population a)
Let’s begin by building a population of normally distributed 100 resistors on one of the pages in an Excel
workbook. Excel has four functions dealing with the normal distribution: norm.dist(x,mu,sigma,cumulative)
This produces the probability associated with a particular x value that is normally distributed with a mean mu
and standard deviation sigma. The last parameter is either TRUE for a cumulative distribution (cdf) or FALSE for a probability distribution (pdf). norm.inv
(probability,mu,sigma) This is the inverse normal function that produces an x value corresponding to a given probability for a
distribution with a mean mu and standard deviation sigma. norm.s.dist(z,cumulative)
This is the standard distribution, which produces the probability for a particular z value. The last parameter is
either TRUE for a cumulative distribution (cdf) or FALSE for a probability distribution (pdf). Recall that the
standard distribution has a mean of zero and a standard deviation of one, so we can also generate a standard distribution using norm.dist(x,0,1,cumulative).
norm.s.inv(probability) This is the standard inverse distribution, which produces a z value for a corresponding probability. It is equivalent to norm.inv
(probability,0,1). If X is our random variable representing the resistance, we want to generate a set of x values that are normally distributed around x=100
, and we will use a standard deviation of 6.00 to get a reasonable spread in our
data. So we want to use the norm.inv function to generate our x values from a random probability found from the rand
() function. Let’s make a population of 8000 on the first spreadsheet in your workbook. So that we can draw samples, we will index each resistor with a number.
As we found in Project 1, every time we make a change on the spreadsheet, these random numbers will
regenerate. We can avoid having the numbers change by copying and pasting the values of the randomly
generated population into the column next to the randomly generated values. Do this. Each student should have
a unique set of resistors in their population, so if you start out working with someone, you might want to copy two columns, one for you and one for your partner.
Rename the spreadsheet “Pop1”.
Find the mean, variance, and standard deviation (using Excel’s built‐in functions) of your population. Plot a histogram of your population. b)
Next, let’s grab some elements out of the population to create a sample (make sure you grab it out of the fixed
population and not the one that continues to be randomly generated). We’ll use the same approach that we
used in Project 1 using the RANDBETWEEN and VLOOKUP functions.
Let’s start with a sample of 35 elements. Put this in a column on the second tab of your spreadsheet (rename the
tab samp1) leaving 5 blank cells at the top. In the cells above the sample, compute the sample mean, variance, and standard deviation using Excel’s built‐
in functions. Plot a histogram. Comment on how the sample values and histogram compare to the corresponding population values. c)
Now construct 39 more samples (N=40 altogether). Place these in the columns to the right of the first sample.
Compute the means, variances, and standard deviations of these additional samples. You do not need to plot histograms of these additional samples. d)
We now have means, variations, and standard deviations for our collection of samples. The Central Limit
Theorem says that the set of sample means should form a normal distribution with a mean X X , variance 2 2 X X n and standard deviation X X n
. Let’s see if this is true. First, plot a histogram of the set of sample means. Next, compute the mean, variance, and standard deviation of
the set of 40 sample means. Compare and comment on how these compare to the values predicted by the Central Limit Theorem. e)
Now, by copying and pasting, go back and adjust N to see if you can get better agreement. (You can also play
around with n, but we generally want to keep this as small as possible for convenience, so don’t make it too big.) Comment on whether this improves the comparison.
Exercise 2 – Comparing more distributions In Exercise 1, our population was normally distributed and the sample means were normally distributed,
although with a different standard deviation. The Central Limit Theorem tells us that the sample means will be
normally distributed no matter what distribution the population has. Verify this with the distributions below.
You should be able to copy the spreadsheets that you used for the normal analysis and then copy and replace the appropriate changes in the formulas.
Note that the parameters in these formulas should point to cell addresses, just like in the normal distribution, and the probability will be generated by RAND
() just as it was in the normal case. Also, the parameters used in
Excel formulas (and those of other programs, tables, etc.) may be defined slightly differently from one source to another, so be careful when you interpret results.
a) The lognormal distribution: p=LOGNORM.DIST(x,mean,std,cumulative) x=LOGNORM.INV(probability,mean,std) Use a mean of 0.3 and a standard deviation of 0.1. b)
The exponential distribution: p=EXPON.DIST(x,lambda,cumulative)
There is no built in inverse to the exponential distribution, but we can create the formula that will compute it as follows: () x f x p e x p e
ln ln x pe ln p x 1 ln ln( / )/ p xp x=-1*ln(p/lambda)/lambda
Use a lambda of 4.000. c) The gamma distribution: p=GAMMA.DIST(x,alpha,beta,cumulative) x=GAMMAINV(probability,alpha,beta)
Use an alpha of 100.0 and a beta of 6.00. d) Comment on whether the Central Limit Theorem’s predictions are demonstrated in your results.
Submit your Excel file and a Word document to the drop box by the specified date. In your Word file, report your
results in the order requested. Any graphs that you were asked to make should be copied and pasted into the
document along with requested comments. You may copy computations as tables from your Excel file, as well.