Unit 13: Chi-square Tests




Exploratory Data Analysis
  • Earlier we examined informally whether a distribution was Normal. The chi-square goodness-of-fit procedure provides a formal test of whether the distribution of a categorical variable has a specified shape. There are better ways of doing this, but treating the bins of a histogram as categories gives one way to test whether the underlying distribution is Normal.
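The idea of treating histogram bins as categories can be sketched in a few lines of Python. This is a minimal illustration, not a recommended normality test: the counts and cut points are hypothetical, the Normal mean and standard deviation are assumed to be specified in advance (if they were estimated from the data, the degrees of freedom would be reduced further), and the p-value helper uses a closed form that holds only for an even number of degrees of freedom.

```python
import math
from statistics import NormalDist

def normal_bin_gof(counts, edges, mu, sigma):
    """Chi-square goodness-of-fit statistic for binned counts vs. Normal(mu, sigma).

    `edges` are the interior cut points; the first and last bins are open-ended.
    """
    dist = NormalDist(mu, sigma)
    cdf = [0.0] + [dist.cdf(e) for e in edges] + [1.0]
    probs = [hi - lo for lo, hi in zip(cdf, cdf[1:])]   # Normal probability of each bin
    n = sum(counts)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    df = len(counts) - 1   # mu and sigma are given, not estimated from the data
    return stat, df, expected

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Hypothetical standardized data: 200 observations in 5 bins
counts = [14, 47, 78, 49, 12]
edges = [-1.5, -0.5, 0.5, 1.5]
stat, df, expected = normal_bin_gof(counts, edges, mu=0.0, sigma=1.0)
p_value = chi2_sf_even_df(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, p-value = {p_value:.3f}")
```

Five bins were chosen here so that the degrees of freedom are even and the p-value can be computed without a statistics library. As with any chi-square test, the expected counts should be checked before the result is trusted.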
Comparing Groups
  • The chi-square test provides a means of testing whether two categorical variables are associated. Later you will see that we can test whether two numerical variables are linearly associated by testing whether the slope in a linear regression is 0.
  • The sampling distribution of the chi-square statistic is only approximated by the chi-square distribution, and the approximation can be poor, particularly for small sample sizes. In some cases "exact" tests, such as Fisher's Exact Test, can be used. These procedures, deemed impractical before the advent of fast computing, provide an exact p-value regardless of sample size.
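To make the contrast between the approximate and exact p-values concrete, here is a small sketch for a 2×2 table with very small counts. The table is hypothetical. The two-sided exact p-value below uses one common convention (sum the hypergeometric probabilities of all tables no more probable than the observed one), and the chi-square p-value is computed without a continuity correction.

```python
import math

def chi2_stat_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]] (shortcut formula)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value with all margins held fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        return math.comb(row1, k) * math.comb(row2, col1 - k) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Hypothetical small table: 8 observations in total
a, b, c, d = 3, 1, 1, 3
stat = chi2_stat_2x2(a, b, c, d)
approx_p = chi2_sf_1df(stat)
exact_p = fisher_exact_two_sided(a, b, c, d)
print(f"chi-square = {stat:.3f}, approximate p = {approx_p:.3f}, exact p = {exact_p:.3f}")
```

With every expected count equal to 2, the usual conditions for the chi-square approximation are not met, and the two p-values differ noticeably.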

  • Unlike the previous hypothesis tests, chi-square tests extend beyond testing a particular value of a parameter. Still, students should be in the habit of writing null and alternative hypotheses and checking assumptions.

  • As noted in the Teaching Tips, the goodness-of-fit test is connected to the one-proportion z-test, and the test of homogeneity is connected to the two-proportion z-test.
  • The chi-square distribution with 1 degree of freedom is the distribution you would get if you took a standard Normal random variable and squared it. Imagine this study: you want to see whether eye color (dark/light) is related to whether students wear corrective lenses. You can do a chi-square test of independence and get a chi-square statistic. You can also do a two-sided z-test comparing two proportions (using the pooled proportion): is the proportion of light-eyed students with corrective lenses the same as the proportion of dark-eyed students with corrective lenses? Square the z-statistic and you get the chi-square statistic, and the two tests give the same p-value. You can show algebraically (should you wish) that the two test statistics are equivalent, but of course this only works for 2×2 tables.
  • As in the previous two units, remember that the p-value is just a conditional probability: given that the null hypothesis is true, what is the probability of getting the observed statistic or something more extreme?
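The relationship between the pooled two-proportion z-test and the chi-square test for a 2×2 table, described above for the eye color and corrective lenses example, can be checked numerically. The counts below are hypothetical, made up only to illustrate the arithmetic, and the p-values are computed from the Normal tail using the standard library.

```python
import math

# Hypothetical counts: rows are eye color, columns are corrective lenses (yes, no)
table = [[30, 20],   # light-eyed students
         [45, 55]]   # dark-eyed students

def pooled_z(x1, n1, x2, n2):
    """Two-proportion z-statistic using the pooled proportion in the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi2_stat(tbl):
    """Chi-square statistic: sum of (observed - expected)^2 / expected over all cells."""
    row_tot = [sum(row) for row in tbl]
    col_tot = [sum(col) for col in zip(*tbl)]
    n = sum(row_tot)
    total = 0.0
    for i, row in enumerate(tbl):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            total += (observed - expected) ** 2 / expected
    return total

z = pooled_z(table[0][0], sum(table[0]), table[1][0], sum(table[1]))
chi2 = chi2_stat(table)

p_z = math.erfc(abs(z) / math.sqrt(2))    # two-sided Normal p-value
p_chi2 = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df

print(f"z = {z:.4f}, z squared = {z * z:.4f}, chi-square = {chi2:.4f}")
print(f"p-value from z-test = {p_z:.4f}, p-value from chi-square test = {p_chi2:.4f}")
```

The agreement depends on using the pooled proportion in the z-test and a two-sided alternative. A one-sided z-test has no chi-square counterpart, because squaring discards the sign of z.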