Too long again since the last post.
Here we have something interesting that’s outside the narrative thread. On the AP Stat listserv, Chris Talone asked this question:
Is there a way to set up a Fathom simulation to illustrate how the slope of a line of best fit will vary when choosing ordered pairs from a population of ordered pairs? My students are having a hard time understanding the purpose of the linear regression t-interval and the linreg t-test. I would like for them to see how the slope can vary depending on the sample of points chosen. Ideally, I’d like to set up a population of ordered pairs, graph a scatterplot and find the line of best fit for the population, then have Fathom randomly select 2, 5, or 7 of those ordered pairs, graph a scatterplot of the sample chosen, find the line of best fit for the sample chosen, and also plot the sample slope on a dot plot, and then repeat many many times….
I posted a response there, but the list can’t carry illustrations. Here we can! This is where we’re heading:
How do we do this in Fathom? Read on…
Step By Step
1. Set up your source collection, the ordered pairs.
3. In the Sample collection, set up your measures:
- Make one called n (for the sample size); its formula is count( )
- Make one for the slope you want to calculate, call it slope if you wish; formula: linRegrSlope( predictor, response ), where predictor and response are the names of your attributes.
4. Collect measures! (You now have three collections: your source; the sample, which is redrawn each time a measure is collected; and the measures collection.)
5. Make a dot plot of slope. This is the simulated sampling distribution of slopes for a sample size of 2.
6. Change the sample size (in the inspector for the sample collection) and collect more. But you want to separate them by sample size….
7. Drag n to the “other” axis of the dot plot, holding down shift. This will split the plot categorically by sample size, so you can see how the spread of the sample slope depends on n.
- If you sample without replacement and set the sample size equal to the population size, every sample is the whole population, so students can see that the sample slopes are all the same, i.e., the population slope.
- If you sample with replacement (default, shown) you have a bootstrap distribution for slope. If you find (for example) the 5th and 95th percentiles of these values, you have a 90% bootstrap interval for the population slope.
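If you don’t have Fathom handy, the same idea can be sketched in plain Python. This is only a rough analogue of the steps above, not what Fathom does internally: the population (20 points scattered around y = 2x + 1), the function names, and the number of repetitions are all made up for illustration. Samples where every x is the same (which happens at n = 2 when sampling with replacement) have no defined slope and are skipped here.

```python
import random
import statistics


def slope(pairs):
    """Least-squares slope of y on x; None if all x values are equal."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return None  # vertical line: slope undefined
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / sxx


def sample_slopes(population, n, reps, replace=True, rng=random):
    """Draw `reps` samples of size n and return the defined sample slopes."""
    slopes = []
    for _ in range(reps):
        if replace:
            sample = rng.choices(population, k=n)
        else:
            sample = rng.sample(population, n)
        b = slope(sample)
        if b is not None:
            slopes.append(b)
    return slopes


def percentile_interval(values, lo=5, hi=95):
    """Return the lo-th and hi-th percentiles (5th and 95th by default)."""
    cuts = statistics.quantiles(values, n=100)  # 99 cut points
    return cuts[lo - 1], cuts[hi - 1]


# Hypothetical population: 20 ordered pairs near the line y = 2x + 1.
rng = random.Random(1)
population = [(x, 2.0 * x + 1.0 + rng.gauss(0, 3)) for x in range(20)]
pop_slope = slope(population)
print("population slope:", round(pop_slope, 3))

# Spread of the sample slope for each sample size (with replacement).
for n in (2, 5, 7):
    s = sample_slopes(population, n, 1000, replace=True, rng=rng)
    print("n =", n, "SD of slopes:", round(statistics.stdev(s), 3))

# Without replacement at full population size: every slope is the same.
full = sample_slopes(population, len(population), 50, replace=False, rng=rng)

# With replacement at full size: percentile interval for the slope.
boot = sample_slopes(population, len(population), 1000, replace=True, rng=rng)
interval = percentile_interval(boot)
print("5th and 95th percentiles:", interval)
```

Printing the standard deviation for each n is the numeric version of splitting the dot plot by sample size: you would expect the spread to shrink as n grows.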