• Exploratory Data Analysis
- Like confidence intervals, hypothesis tests help students
use data to answer the simple investigative question they may have
developed about a particular population. This should refer back to the
first unit of the course.
- To evaluate whether assumptions hold, students need to look
at the shape, center, and spread of the sample distribution.
- Like confidence intervals, hypothesis tests help us make
inferences about a population using a single sample. As in the
previous units, it is important to reiterate that we need only one
sample to make an inference.
- Again like confidence intervals, hypothesis tests rely on
sampling distributions and the Central Limit Theorem (if assumptions
are met), so it is important to show students how these concepts all
fit together.
- For assumptions to be met, we need to follow good data
collection practices to obtain a random sample.
- If the Central Limit Theorem applies, students will need to
recall normal models and z-scores. Remember: their sample may not be
normal, but the sampling distribution of their sample mean or
proportion is approximately normal under the Central Limit Theorem.
- It is very important to draw students back to the informal
inference done earlier using simulations; show them the same example
using both methods so they can see that both achieve the same goal; the
formal method is just a shortcut/approximation that may be faster to do.
- If students can understand the intuition behind
simulation-based inference, then formal hypothesis testing is just a
shortcut we can use when assumptions are met that does the same thing
with less work.
- A p-value is how likely we are to get the observed
statistic, or something more extreme, if we assume a certain
model/hypothesis is true; this is the case whether we use the normal
model (and look at the shaded region at or beyond the observed value)
or use a model built from many simulations (and find the proportion of
simulated statistics at or beyond the observed value).
- In this course, we assume the sampling distribution follows a
model with some hypothesized mean.
- But no matter what, we assume some “chance” model that is
plausible for our estimate.
- It is plausible that our estimate comes from this model,
but how plausible given the mean of that model? If it’s not very
plausible, then we should look for a new model.
- The p-value is just a conditional probability; given the
null hypothesis, what is the probability of getting the observed
statistic (or something more extreme)?
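To show students that the simulation-based and normal-model approaches answer the same question, a small demonstration along the following lines may help. This is only a sketch: the data (60 successes in 100 trials), the null proportion of 0.5, the one-sided "at least as large" direction, and the function names `simulated_p_value` and `normal_p_value` are all made up for illustration and are not taken from the course materials.

```python
import math
import random
from statistics import NormalDist


def simulated_p_value(observed_count, n, p_null, reps=10_000, seed=0):
    """One-sided p-value from simulation: the share of samples generated
    under the null whose success count is at least the observed count."""
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(reps):
        # Simulate one sample of size n, assuming the null proportion is true.
        successes = sum(rng.random() < p_null for _ in range(n))
        if successes >= observed_count:
            at_least_as_extreme += 1
    return at_least_as_extreme / reps


def normal_p_value(observed_count, n, p_null):
    """One-sided p-value from the normal model for a sample proportion."""
    p_hat = observed_count / n
    standard_error = math.sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / standard_error
    # Shaded area under the standard normal curve to the right of z.
    return 1 - NormalDist().cdf(z)


# Hypothetical data: 60 successes in 100 trials, null proportion 0.5.
p_sim = simulated_p_value(60, 100, 0.5)
p_norm = normal_p_value(60, 100, 0.5)
print(f"simulation: {p_sim:.3f}   normal model: {p_norm:.3f}")
```

The two p-values should come out close but not identical, since the normal model is an approximation to the chance model that the simulation builds directly. Both are conditional on the null hypothesis: every simulated sample, and the normal curve itself, is constructed assuming the hypothesized proportion is true.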