Unit 4: Collecting Data
Main Concepts
This is a difficult unit. Not only is it filled with vocabulary, but it has some subtle concepts. Our approach -- which we recommend -- is to focus on experiments vs. observational studies, and discuss the basic forms of surveys. In Unit 15, after studying probability and getting experience with estimation, we'll return and discuss some of the more subtle refinements of experimental design, such as blocking and stratifying.
• Designing an experiment is complex and probably won't be understood until after completing the unit. Mastery will come only once later sections have provided a more complete context. We will return to this topic later in the course.
• Our primary goal is to determine the effect of a treatment variable on a response variable. We want the treatment groups to be as similar as possible to one another in all respects except for the values of the treatment variable. Only if this is achieved can we conclude that differences in the response variable between the groups are due to the differences in the treatment variable.
• When comparing groups we are usually looking for differences in their "typical" response, and this is complicated by the variation in the response variable within each group. Variation has various sources, and blocking -- which divides the units into homogeneous groups -- reduces the variation due to one particular source. (Look back to the Demonstration in Unit 1, in which children were paired, a form of blocking.)
• The only way to make inferences about a population based on a sample taken from that population is to make sure your sample is representative of the population. The only way to do this is to take a random sample from the population. There are various techniques for taking a random sample. The most basic is the simple random sample (SRS), in which items are selected from the population at random and without replacement, so that every possible sample of the chosen size is equally likely. A refinement of this technique is stratification, in which the population is divided into strata and an SRS is taken from within each stratum.
• The Big 3: Randomization, Repetition, and Control!
• Researchers try to control for every conceivable confounding variable, but in practice there is no way to guarantee that every existing confounding variable has been considered and controlled.
• Simple random samples are very difficult to achieve in all but the simplest situations.
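The sampling and assignment ideas in the list above can be made concrete with a short simulation. What follows is only a minimal sketch in Python, not part of the course materials: the roster of 60 students, the three grade levels, and the function names simple_random_sample, stratified_sample, and randomly_assign are all hypothetical, chosen for illustration. It shows an SRS drawn without replacement, a stratified sample (an SRS within each stratum), and random assignment of the sampled units to two treatment groups.

```python
import random


def simple_random_sample(population, n, rng):
    """Draw n distinct units without replacement.

    Every possible subset of size n is equally likely to be chosen.
    """
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Divide the population into strata, then take an SRS within each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {label: rng.sample(units, n_per_stratum)
            for label, units in strata.items()}


def randomly_assign(units, n_groups, rng):
    """Shuffle the units, then deal them into groups of near-equal size."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Fixed seed so the illustration is repeatable (an assumption for teaching use).
rng = random.Random(4)

# Hypothetical population: 60 students spread evenly over grades 9, 10, and 11.
students = [(f"student{i:02d}", 9 + i % 3) for i in range(60)]

# SRS of 12 students from the whole roster.
srs = simple_random_sample(students, 12, rng)

# Stratified sample: 4 students from each grade level.
by_grade = stratified_sample(students, lambda s: s[1], 4, rng)

# Random assignment of the SRS to a treatment group and a control group.
treatment, control = randomly_assign(srs, 2, rng)
```

Note that the SRS may, by chance, over-represent one grade, whereas the stratified sample always contains exactly four students per grade. Random assignment, by contrast, does not choose who is in the study at all; it only decides which treatment each selected unit receives.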