Unit 9: Sampling Distributions


 Main Concepts

• In order to draw conclusions about a population based on a sample and understand how our estimates of certain parameters might differ from the true values, we need to know how our estimators might vary from sample to sample. Sampling distributions tell us how statistics vary from sample to sample.

• There is a crucial distinction to be made between statistics (functions of data) and parameters (which characterize probability distributions). Statistics are observable (because we get them from our data). Parameters are often unknowable, because they "belong" to populations. We will use the statistics of the sample to estimate the parameters of the population.

• Because statistics are computed from data, and data are ideally generated by some sort of random process, statistics vary from sample to sample. This means that statistics are random variables and therefore have their own probability distributions. This is a Big Idea and a Very Difficult Concept. The probability distribution of a statistic is called its sampling distribution.

• Sampling distributions are abstract, which is partly why they are difficult. What makes them abstract is that they describe variability not within a single sample but from one sample to the next.

• The standard deviation of a sampling distribution is more specifically called the standard error.
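
To make the standard error concrete, here is a minimal Python sketch using only the standard library. It assumes, purely for illustration, a Uniform(0, 1) population, samples of size 25, and 5,000 repetitions; none of these choices come from the unit itself. It simulates the sampling distribution of the sample mean and compares its standard deviation with the familiar value σ/√n.

```python
import random
import statistics

def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size `n` from Uniform(0, 1); return each sample's mean."""
    return [statistics.fmean([rng.random() for _ in range(n)]) for _ in range(reps)]

rng = random.Random(9)   # fixed seed so the run is repeatable (arbitrary choice)
n, reps = 25, 5000       # illustrative sample size and number of samples

means = simulate_sample_means(n, reps, rng)

# Standard deviation of the simulated sampling distribution
simulated_se = statistics.stdev(means)
# sigma / sqrt(n), where sigma^2 = 1/12 for a Uniform(0, 1) population
theoretical_se = (1 / 12) ** 0.5 / n ** 0.5

print(f"simulated standard error:   {simulated_se:.4f}")
print(f"theoretical standard error: {theoretical_se:.4f}")
```

The two numbers should agree closely. Note that the spread being measured is the spread of the 5,000 means, not the spread of the observations inside any one sample.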

• One of the Big (and Beautiful) Ideas of Statistics is the Central Limit Theorem. Loosely, if many samples of size n are drawn randomly and independently from a population with finite variance, then the histogram of the means of those samples will be approximately normal, provided n is large enough. The approximation improves as n grows.
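
The Central Limit Theorem can be watched in action with a short simulation. The sketch below assumes a strongly right-skewed Exponential(1) population and uses sample skewness as a rough yardstick for "how far from normal" (a normal distribution has skewness 0). The population, the sample sizes, and the skewness helper are our own illustrative choices.

```python
import random
import statistics

def skewness(xs):
    """Sample skewness: the mean cubed z-score (about 0 for a symmetric shape)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

def sample_means(n, reps, rng):
    """Means of `reps` independent samples of size n from an Exponential(1) population."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

rng = random.Random(9)  # fixed seed so the run is repeatable (arbitrary choice)

# Skewness of the simulated sampling distribution for several sample sizes
skew_by_n = {n: skewness(sample_means(n, 4000, rng)) for n in (1, 5, 50)}

for n, sk in skew_by_n.items():
    print(f"n = {n:>2}: skewness of the sample means = {sk:.2f}")
```

With n = 1 the "means" are just single observations, so their skewness matches the population's. As n grows the skewness shrinks toward 0, which is the histogram of sample means becoming more symmetric and bell-shaped.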

• In theory, the sampling distribution of a statistic can often be derived mathematically from probability theory. In practice, this can be quite difficult or impossible for some statistics. The Central Limit Theorem provides an approximate sampling distribution for some statistics, but it is not universally applicable. Simulating a sampling distribution can also be helpful.

• The Central Limit Theorem is one of the major concepts of Statistics and is seductively useful. However, it is not a panacea. There are some statistics for which the sampling distribution is not approximately normal, no matter how large the sample size. And there are some populations that are so non-normal that astronomically large sample sizes are required before the sampling distribution is approximately normal.
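
One way to see that the theorem is not a panacea is to try it on a population that violates its conditions. The sketch below uses a standard Cauchy population, an example of our own choosing that the unit does not name. The Cauchy has no finite mean or variance, and the mean of n independent standard Cauchy draws is itself standard Cauchy, so averaging never tightens the sampling distribution. We track the interquartile range (IQR) of the simulated sample means, which is 2 in theory for every n.

```python
import math
import random
import statistics

def cauchy_draw(rng):
    """One draw from a standard Cauchy via the inverse-CDF transform."""
    return math.tan(math.pi * (rng.random() - 0.5))

def iqr_of_sample_means(n, reps, rng):
    """IQR of `reps` simulated sample means, each from a sample of size n."""
    means = [statistics.fmean([cauchy_draw(rng) for _ in range(n)])
             for _ in range(reps)]
    q1, _, q3 = statistics.quantiles(means, n=4)  # the three quartile cut points
    return q3 - q1

rng = random.Random(9)  # fixed seed so the run is repeatable (arbitrary choice)

iqr_by_n = {n: iqr_of_sample_means(n, 2000, rng) for n in (1, 20, 400)}

for n, iqr in iqr_by_n.items():
    print(f"n = {n:>3}: IQR of the sample means = {iqr:.2f}")
```

For a well-behaved population the IQR of the sample means would shrink roughly like 1/√n. Here it stays near 2 however large n becomes.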

• Be aware that although the books dwell on sampling distributions of sample means and sample proportions, there are as many sampling distributions as there are statistics. We feel it is important for both you and your students to look at sampling distributions that the Central Limit Theorem does not cover. It's not that we think the sampling distribution of, say, the sample maximum is useful information, but the activity of examining it helps us learn about the more general concept of sampling distributions.
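
As a small illustration of a sampling distribution that has nothing to do with the Central Limit Theorem, the sketch below simulates the sample maximum. It assumes a Uniform(0, 1) population and samples of size 10, both arbitrary choices made for this example. For that setup, probability theory gives P(max ≤ x) = x^n, so the simulated results can be checked against exact values.

```python
import random
import statistics

rng = random.Random(9)   # fixed seed so the run is repeatable (arbitrary choice)
n, reps = 10, 5000       # illustrative sample size and number of samples

# One maximum per simulated sample: these values trace out the sampling distribution
maxima = [max(rng.random() for _ in range(n)) for _ in range(reps)]

mean_of_maxima = statistics.fmean(maxima)
median_of_maxima = statistics.median(maxima)
share_above_09 = sum(m > 0.9 for m in maxima) / reps

print(f"mean of maxima:   {mean_of_maxima:.3f}  (theory: {n / (n + 1):.3f})")
print(f"median of maxima: {median_of_maxima:.3f}  (theory: {0.5 ** (1 / n):.3f})")
print(f"share above 0.9:  {share_above_09:.3f}  (theory: {1 - 0.9 ** n:.3f})")
```

The maxima pile up just below 1 with a long tail to the left, so the mean falls below the median. The shape is clearly not normal, yet it is a perfectly good sampling distribution.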