

Connections
• Exploratory Data Analysis
 Earlier we used informal examinations to judge whether a
distribution was Normal. The chi-square procedure provides a
formal test of whether the distribution of a categorical variable has a
specific shape. There are better ways of doing this, but by
treating the bins of a histogram as categories, the procedure could
provide a means for testing whether the data in a histogram follow a
Normal distribution.
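
The binning idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the null model is a fully specified Normal(0, 1), and the choice of five equal-probability bins is ours, not the text's. The names `hypothesized`, `edges`, and `p_value` are hypothetical.

```python
import math
import random
from statistics import NormalDist

random.seed(1)  # simulated data, for illustration only
hypothesized = NormalDist(mu=0.0, sigma=1.0)  # fully specified null model
data = [random.gauss(0.0, 1.0) for _ in range(200)]

k = 5  # number of bins; each has expected count n/k = 40 under the null
# Interior bin edges at the 20th, 40th, 60th, and 80th percentiles of the null model.
edges = [hypothesized.inv_cdf(i / k) for i in range(1, k)]

observed = [0] * k
for x in data:
    # The bin index of x is the number of edges that x exceeds.
    observed[sum(x > e for e in edges)] += 1

expected = len(data) / k
chi2 = sum((o - expected) ** 2 / expected for o in observed)
df = k - 1  # no parameters were estimated from the data

# Upper-tail chi-square probability; this closed form holds for df = 4 only.
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"observed counts = {observed}")
print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.3f}")
```

If the mean and standard deviation were estimated from the data instead of specified in advance, the degrees of freedom would need adjusting, which is one reason dedicated normality checks are usually preferred.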
• Comparing Groups
 The chi-square test provides a means for testing whether two
categorical variables are associated. Later you will see that we can
test whether two numerical variables are associated by testing whether
the slope in a linear regression is 0.
• Inference
 The chi-square statistic's sampling distribution is only
approximated by the chi-square distribution, and this approximation
can be poor, particularly for small sample sizes. In some cases,
"exact" tests, such as Fisher's Exact Test, can be used. These
procedures, deemed impractical before the advent of fast computing,
provide the exact p-value regardless of sample size.
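
To make the contrast concrete, here is a minimal sketch that computes a two-sided Fisher's Exact Test p-value and the chi-square approximation for the same small 2×2 table. The counts are made up for illustration, the function names are hypothetical, and the two-sided rule used (sum the probabilities of all tables no more likely than the observed one) is one common convention, assumed here.

```python
import math

def hypergeom_prob(x, row1, row2, col1):
    """P(top-left cell = x) when all table margins are held fixed."""
    return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(row1 + row2, col1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = hypergeom_prob(a, row1, row2, col1)
    total = 0.0
    for x in range(lo, hi + 1):
        p = hypergeom_prob(x, row1, row2, col1)
        # Count every table that is no more likely than the observed one.
        if p <= p_obs * (1 + 1e-9):
            total += p
    return total

def chi_square_p_2x2(a, b, c, d):
    """Chi-square statistic (no continuity correction) and its 1-df p-value."""
    n = a + b + c + d
    rows, cols = [a + b, c + d], [a + c, b + d]
    obs = [[a, b], [c, d]]
    stat = sum(
        (obs[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2) for j in range(2)
    )
    # For 1 degree of freedom, P(chi-square > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Hypothetical small table: every expected count is only 2.
a, b, c, d = 3, 1, 1, 3
p_exact = fisher_exact_two_sided(a, b, c, d)
chi2, p_approx = chi_square_p_2x2(a, b, c, d)
print(f"exact p = {p_exact:.4f}, chi-square p = {p_approx:.4f}")
```

With expected counts this small, the two p-values disagree noticeably (about 0.49 versus about 0.16), which is the situation in which an exact test is worth considering.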
 Unlike previous hypothesis tests, chi-square tests extend beyond
testing a particular value of a parameter.
Still, students should be in the habit of writing null and alternative
hypotheses and checking assumptions.
 As noted in the Teaching Tips, point out the connection between
the goodness-of-fit test and the one-proportion z-test and between the
test of homogeneity and the two-proportion z-test.
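
The first of these connections can be checked numerically. The sketch below assumes a variable with only two categories and uses made-up counts; in that case the goodness-of-fit chi-square statistic equals the square of the one-proportion z-statistic.

```python
import math

# Hypothetical data: 60 successes in 100 trials; null hypothesis p0 = 0.5.
n, successes, p0 = 100, 60, 0.5

# One-proportion z-statistic.
p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Goodness-of-fit chi-square with two categories (success, failure).
observed = [successes, n - successes]
expected = [n * p0, n * (1 - p0)]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"z = {z:.4f}, z^2 = {z**2:.4f}, chi-square = {chi2:.4f}")
```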
• Models
 The chi-square distribution with 1 degree of freedom is the
distribution you would get if you took a standard Normal random
variable and squared it. Imagine this experiment: you want to see if
eye color (dark/light) is related to whether students wear
corrective lenses. You can do a chi-square test of independence and you
will get a chi-square statistic. You can also do a z-test to compare
two proportions (using pooled proportions): is the proportion of
light-eyed students with corrective lenses the same as the proportion
of dark-eyed students with corrective lenses? Take the z-statistic and
square it, and you get the chi-square statistic. Both tests give the
same p-value (for a two-sided z-test). You can show algebraically
(should you wish) that the squared z-statistic equals the chi-square
statistic, but of course this works only for 2×2 tables. Want to see
an example?
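
One way to see this numerically is the short Python sketch below. The eye-color counts are invented for illustration, the function names are hypothetical, and the chi-square statistic is computed without a continuity correction.

```python
import math

# Hypothetical 2x2 counts (made up for illustration):
#                 lenses   no lenses
# light-eyed        30        20
# dark-eyed         45        55
a, b = 30, 20
c, d = 45, 55

def two_proportion_z(a, b, c, d):
    """Pooled two-proportion z-statistic comparing rows (a, b) and (c, d)."""
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    pooled = (a + c) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for independence (no continuity correction)."""
    observed = [[a, b], [c, d]]
    rows = [a + b, c + d]
    cols = [a + c, b + d]
    n = a + b + c + d
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat

z = two_proportion_z(a, b, c, d)
chi2 = chi_square_2x2(a, b, c, d)

# Two-sided Normal p-value and upper-tail chi-square p-value with 1 df.
p_z = math.erfc(abs(z) / math.sqrt(2))
p_chi2 = math.erfc(math.sqrt(chi2 / 2))

print(f"z = {z:.4f}, z^2 = {z**2:.4f}, chi-square = {chi2:.4f}")
print(f"p (two-sided z-test) = {p_z:.4f}, p (chi-square, 1 df) = {p_chi2:.4f}")
```

Changing the four counts and rerunning shows that the agreement does not depend on this particular table.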
• Probability
 As in the previous two units, remember that the p-value is
just a conditional probability: given that the null hypothesis is true,
what is the probability of getting the observed statistic (or something
more extreme)?
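
A simulation can make the conditional nature of the p-value visible. The sketch below is an illustration only: the die-roll counts are made up, the number of repetitions is an arbitrary choice, and the simulated p-value is an estimate, not an exact value. It generates many samples in a world where the null hypothesis (a fair die) is true and records how often the chi-square statistic is at least as large as the one observed.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical counts from 60 rolls of a die (made up for illustration).
observed = [5, 8, 9, 8, 10, 20]
n = sum(observed)
expected = n / 6  # null hypothesis: the die is fair

def chi_square(counts):
    """Goodness-of-fit chi-square statistic against equal expected counts."""
    return sum((o - expected) ** 2 / expected for o in counts)

observed_stat = chi_square(observed)

# Simulate many samples in a world where the null hypothesis is true.
reps = 5000
at_least_as_extreme = 0
for _ in range(reps):
    counts = [0] * 6
    for _ in range(n):
        counts[random.randrange(6)] += 1
    if chi_square(counts) >= observed_stat:
        at_least_as_extreme += 1

# Estimated P(statistic >= observed value | null hypothesis is true).
p_sim = at_least_as_extreme / reps
print(f"observed chi-square = {observed_stat:.2f}, simulated p-value = {p_sim:.4f}")
```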
