

Connections
• Models
 The tests for slopes and intercepts use the t-distribution
and have a very similar structure to tests for means. Note that,
with some algebra, one can show that the estimates for the slope and
the intercept are linear combinations of the
y-observations, and therefore the CLT applies. This is why the
t-distribution provides a good approximation for the sampling
distribution.
 This might be the most explicit example of a statistical
model in the course. The regression model provides both
a deterministic component (y = a + bx) and a random component
(errors are N(0,sigma)). The two added together explain how an
observation is "generated".
 Students who continue with regression will learn about
multiple regression, in which there is more than one predictor of the
same response variable. By including additional predictors,
scientists can provide what is called a "statistical control" for
variables that might affect the response variable, and potentially
isolate the effect of a single predictor while holding all other
predictors fixed.
• Inference
 Clearly, this builds on regression in Units 2 & 3.
In those units, regression is treated purely as a descriptive
tool. In this unit, we acknowledge that the observations vary
from sample to sample, and therefore the slope and intercept estimates
vary as well, so we use hypothesis tests to test whether slopes and
intercepts are nonzero.
 The hypothesis that the slope is 0 is used to test whether
two numerical variables are independent (the null hypothesis) or
associated. Strictly speaking, this test detects only linear
association.
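
The ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed method for the course: the function names (simulate, slope_test), the parameter values (a = 2, b = 0.5, sigma = 1, n = 50) and the choice to draw x uniformly on [0, 10] are all made up for the example. The sketch generates observations from the model (deterministic part plus N(0, sigma) errors), computes the slope estimate as a weighted sum of the y-observations, and forms the t-statistic for the null hypothesis that the slope is 0.

```python
import math
import random


def simulate(n, a, b, sigma, seed=0):
    """Generate data from y = a + b*x + error, with error ~ N(0, sigma).

    The x-values are drawn uniformly on [0, 10]; that choice is an
    assumption made only for this illustration.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [a + b * x + rng.gauss(0, sigma) for x in xs]
    return xs, ys


def slope_test(xs, ys):
    """Least-squares fit and t-statistic for H0: slope = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)

    # The slope estimate is a linear combination of the y-observations:
    # b_hat = sum(w_i * y_i), where the weights depend only on the x-values.
    weights = [(x - x_bar) / sxx for x in xs]
    b_hat = sum(w * y for w, y in zip(weights, ys))
    a_hat = y_bar - b_hat * x_bar

    # Residual standard deviation uses n - 2 degrees of freedom
    # because two parameters (slope and intercept) were estimated.
    df = n - 2
    residuals = [y - (a_hat + b_hat * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(r ** 2 for r in residuals) / df)

    se_b = s / math.sqrt(sxx)   # standard error of the slope
    t_stat = b_hat / se_b       # compare to a t-distribution with df = n - 2
    return {"slope": b_hat, "intercept": a_hat,
            "se_slope": se_b, "t": t_stat, "df": df}


xs, ys = simulate(n=50, a=2.0, b=0.5, sigma=1.0)
result = slope_test(xs, ys)
print(result)
```

The t-statistic would then be compared with a t-distribution on n - 2 degrees of freedom (from a table or statistical software) to obtain a P-value. Setting b = 0 in simulate gives data for which the null hypothesis is true, which may be useful for showing students how often the test rejects by chance.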
