Data Analysis & Activity
This activity is designed to help you understand the forces at
work behind confidence intervals. The instructions below are intended
to guide your explorations, but feel free to play with whatever aspects
of the applet call to you.
As mentioned in the Main Concepts section, the basic form of a
confidence interval is the point estimate plus or minus a constant
times the standard error, where the constant is determined by the
desired confidence level. From here you see that the point estimate
determines the center, while the standard error and confidence level
determine the width of the interval. We’ll begin by looking at how
changing the confidence level changes the size of the confidence
interval, and then look at the two parts of standard error that impact
the interval width.
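The form described above can be written out as a short calculation. What follows is a minimal Python sketch and is not part of the applet. The function name `z_interval` and the numbers passed to it are made up for illustration, and the sketch assumes the point estimate has an approximately normal sampling distribution, so the constant comes from the standard normal distribution (about 1.96 for 95% and about 2.58 for 99%).

```python
from statistics import NormalDist


def z_interval(point_estimate, standard_error, confidence=0.95):
    """Return (lower, upper) for estimate +/- multiplier * SE.

    Hypothetical helper for illustration; assumes a normal sampling
    distribution for the point estimate.
    """
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    multiplier = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = multiplier * standard_error
    return point_estimate - margin, point_estimate + margin


# Made-up numbers: a point estimate of 10.0 with standard error 2.0.
low95, high95 = z_interval(10.0, 2.0, confidence=0.95)
low99, high99 = z_interval(10.0, 2.0, confidence=0.99)
print(f"95%: ({low95:.2f}, {high95:.2f})")
print(f"99%: ({low99:.2f}, {high99:.2f})")
```

Both intervals share the same center, because the point estimate is unchanged; only the multiplier, and therefore the width, differs.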
Begin by opening www.stat.berkeley.edu/users/stark/Java/Html/Ci.htm in
Internet Explorer (or another Java-enabled browser). This page will
take samples from various populations and compute confidence intervals.
Initial Setup:
Select “Normal” from the “Sample From” pull-down menu in the top center
section of the window so we can generate samples from a normal
distribution. Make sure "Take sample" is done "with replacement".
Click on the “Use True SE” button near the bottom right of the
window.
This will enable us to see what the confidence intervals look like when
we compute them in the fictional world where knowing the true value of
a parameter is possible.
We will begin with a 95% confidence level, so type 1.96 in the
box near the bottom right where it says "Intervals: +/-".
In the “Sample size” box change the value from 2 to 5. Leave the
“Samples to take” value at 1.
Click on the “Take Sample” button. You should then see your first
confidence interval created along the bottom. Each time you click the
“Take Sample” button you will create a new interval that will be added
to your graph. On the left side of the window you should see a counter
letting you know how many confidence intervals you have created. At the
bottom right area of the window you should have a counter updating the
percentage of intervals that cover the population mean.
Notice that the vertical blue line represents the true population mean.
Confidence intervals that include this value are green, while intervals
that do not include the true parameter value are red.
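For readers who want to see what the applet is doing under these settings, here is a rough Python sketch of the same experiment. It is not the applet's code. It assumes a population with true mean 0 and true standard deviation 1 (the standard deviation is an assumption made for this sketch), and the function name `coverage_with_true_se` is invented for illustration.

```python
import math
import random


def coverage_with_true_se(n=5, multiplier=1.96, reps=10_000, seed=1):
    """Simulate intervals: sample mean +/- multiplier * sigma / sqrt(n).

    Samples come from a normal population with true mean 0 and an
    assumed true SD of 1, mirroring the "Normal" and "Use True SE"
    settings. Returns (fraction of intervals covering 0, interval width).
    """
    rng = random.Random(seed)
    true_mean, true_sd = 0.0, 1.0
    true_se = true_sd / math.sqrt(n)  # knowable only in a simulation
    margin = multiplier * true_se     # identical for every sample
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        center = sum(sample) / n
        if center - margin <= true_mean <= center + margin:
            hits += 1  # this interval would be drawn in green
    return hits / reps, 2 * margin


coverage, width = coverage_with_true_se()
print(f"coverage = {coverage:.3f}, width = {width:.3f}")
```

Because the true SE does not depend on the sample, only the center of each interval moves from sample to sample in this sketch.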
And now for the guided questions:
1. Continue clicking the “Take Sample” button until you have at least
25 intervals (more might be fun but don’t go so far that the graph
becomes difficult to see clearly). How wide are the intervals? Are they
all the same width? What percent of them capture the true mean of zero?
Now change the “#SE’s” to 2.58 so that we have 99% confidence
intervals. Once you change the number in the text box you can click
anywhere in the window for the new information to take effect on your
already created intervals. Now how wide are the intervals? Are they all
the same width? What percent capture the true mean?
2. Now for something not as obvious. Change the “#SE’s” back to 1.96
and set your sample size to 10, twice the size of the 5 we worked with
before. Create another 25 or so intervals in the applet. Now how wide
are your intervals? How does this width relate to the size of 95%
confidence intervals based on samples of size five? Change the sample
size to 20. Now what is the interval width? How do the three interval
widths compare?
3. Now change "True SE" to "Estimated SE" at the lower right corner.
What changed? Why? Are the results from this approach more or less
representative of what happens in practice? What does this mean for
teaching examples where each student takes their own sample and then
compares their interval to the intervals of their classmates?
4. Let’s change gears now and look at confidence intervals for the
Uniform distribution. Change back to using the "True SE".
Select “Uniform” from the “Sample from” pull-down menu.
Change your sample size to 5 and go ahead and take 50 samples all
at once this time (do this by changing “Samples to take” from 1 to 50).
Click on the “Take Sample” button. With your “#SE’s” still at 1.96,
what percent of your intervals do you expect to capture the true mean?
Was this achieved? Click the “Take Sample” button a few more times so
that you have a couple hundred confidence intervals. Did your coverage
level improve to what you expected? What’s going on here?
5. Now change the “Sample from” distribution to the “Box” option. This
allows you to enter your own data for sampling. Click inside the
tall box at the right side of the window and enter a list of thirty
one-digit numbers, eight two-digit numbers, and two three-digit numbers
of your choice. (Note: depending on your browser, this might take
some fiddling. Click on the "Hide Box" and "Show Box" button in the
upper right corner if you don't see the box.) Now select
50 samples of size five from your new
population of values. What do you notice? What causes the large
variability in interval widths? What causes the coverage rate to be
lower than 95%? Can this be fixed the same way the coverage problem was
fixed with the Uniform distribution? Why or why not?
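The experiments in the guided questions can also be repeated outside the applet. The Python sketch below is one way to do that, under stated assumptions. The function `simulate` and the contents of `box` are invented for illustration (the box follows the thirty, eight, two recipe from question 5, but the specific values are arbitrary), and the "estimated SE" here is taken to be the sample standard deviation divided by the square root of the sample size, which may differ in detail from what the applet computes.

```python
import math
import random
import statistics


def simulate(population, n, multiplier=1.96, use_true_se=True,
             reps=2_000, seed=1):
    """Draw `reps` samples of size n, with replacement, from a box.

    Builds sample mean +/- multiplier * SD / sqrt(n) for each sample,
    where SD is the box's own SD (true SE) or the sample SD (estimated
    SE). Returns (coverage, narrowest width, widest width).
    """
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    true_sd = statistics.pstdev(population)  # SD of the box itself
    hits, widths = 0, []
    for _ in range(reps):
        sample = rng.choices(population, k=n)  # sampling with replacement
        center = statistics.fmean(sample)
        sd = true_sd if use_true_se else statistics.stdev(sample)
        margin = multiplier * sd / math.sqrt(n)
        widths.append(2 * margin)
        hits += center - margin <= true_mean <= center + margin
    return hits / reps, min(widths), max(widths)


# Hypothetical box: thirty one-digit, eight two-digit, two three-digit values.
box = [1, 2, 3, 4, 5, 6] * 5 + [12, 15, 20, 25, 30, 40, 55, 70] + [300, 900]

for n in (5, 10, 20, 80):
    for use_true in (True, False):
        cov, w_min, w_max = simulate(box, n, use_true_se=use_true)
        label = "true SE" if use_true else "estimated SE"
        print(f"n={n:3d} {label:12s} coverage={cov:.3f} "
              f"widths {w_min:.1f} to {w_max:.1f}")
```

Swapping `box` for a list of evenly spaced values gives a rough stand-in for the Uniform case, and comparing the printed widths across sample sizes is a way to check your answers to question 2.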
