I don’t quite know how Beth does it! We’re using Beth Chance and Allan Rossman’s ISCAM text, and on Thursday we got to Investigation 1.6, which is a cool introduction to power. (You were a .250 hitter last season; but after working hard all winter, you’re now a .333 hitter. A huge improvement. You go to the GM asking for more money, but the GM says, I need proof. They offer you 20 at-bats to convince them you’ve improved beyond .250. You discover, through the applets, that you have only a 20% chance of rejecting their null, namely, that you’re still a .250 hitter.)
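If you want that 20% without the applets, the whole calculation fits in a few lines; here is a sketch of my own in R (not the ISCAM applet itself):

```r
# Sketch of the Investigation 1.6 power calculation, from scratch.
set.seed(1)
n.ab   <- 20      # at-bats the GM offers
p.null <- 0.250   # the GM's null: you're still a .250 hitter
p.alt  <- 1/3     # the truth: you're now a .333 hitter

# Rejection region under the null at one-sided alpha = .05:
# the smallest hit count k with P(X >= k | p = .250) <= .05.
crit <- qbinom(0.95, n.ab, p.null) + 1   # works out to 9 hits

# Power: how often a true .333 hitter clears that bar.
hits <- rbinom(10000, n.ab, p.alt)
mean(hits >= crit)   # about 0.19: the dispiriting 20%
```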
I even went to SLO to watch Beth Herself run this very activity. It seemed to go fine.
But for my class, it was not a happy experience for the students. There was a great deal of confusion about what exactly was going on, coupled with some disgruntlement that we were moving so slowly.
A number of things may be going on here:
- Almost all of the numerical distributions we’ve seen so far are sampling distributions. Since these are by their nature unreal—you never actually see a sampling distribution for real, you only simulate or imagine them—I wonder if that’s getting to the students. (For the whole sample-measure-collect-repeat pipeline in miniature, see the sketch after this list.)
- I’m uneasy about 1.7, which introduces the Normal. I’m trying to avoid the Normal, especially if students’ first exposure to spread is $\sqrt{n\pi(1-\pi)}$. This inner conflict may be transmitting itself to the class as I scramble, internally, to decide whether to follow Beth-as-written, skip around carefully, or insert some of my own stuff.
- We’re using Fathom, whereas they use RStudio with a carefully constructed workspace. Fathom definitely does everything R does up to this point, but there may be some trouble because (a) my students are constructing all of their simulations from scratch (see my post on the topic over on my other blog); and (b) Fathom is gradually becoming more crippled as technology advances and it has no support. We can only hope it gets some.
- Beth and Allan have a great deal of experience, and deep knowledge about exactly why they present things in the order they do. I could easily be leading students astray, unconsciously, by giving some ideas the wrong emphasis—either because I really want to, for example, talk about sample size earlier, or because I don’t actually have the stats background they do.
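About that first bullet: here is the sample-measure-collect-repeat pipeline, sketched in R (in Fathom it is the same logic, just built with collections and measures):

```r
# Building a sampling distribution from scratch: draw a sample,
# compute a measure, collect the measure, repeat many times.
set.seed(2)
one.phat <- function() mean(rbinom(20, 1, 0.25))   # one sample of 20 at-bats
phats <- replicate(5000, one.phat())               # the "unreal" distribution
hist(phats)   # you never see this for real; you only simulate or imagine it
sd(phats)     # close to sqrt(0.25 * 0.75 / 20), about 0.097
```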
Anyhow, here’s my thinking right now (early Sunday afternoon; next class is Tuesday morning):
I want to introduce the Normal before I go on to 1.7. See below for a discussion. But I want to do so with a little more background. We need SD as a measure of spread. Then they need to see that the SD of a sampling distribution is smaller than the data SD. In fact, I want them to see how the SD of a sampling distribution decreases as $1/\sqrt{n}$. For that to make sense, it won’t hurt to connect it to the binomial through a random walk, seeing how variance increases linearly with $n$.
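The random-walk connection is easy to demonstrate; here is a minimal sketch (mine, not anything from ISCAM):

```r
# Variance of a +/-1 random walk grows linearly with the number of
# steps n, which is why the SD of a mean shrinks like 1/sqrt(n).
set.seed(3)
n.steps <- c(10, 40, 160, 640)
end.var <- sapply(n.steps, function(n) {
  var(replicate(4000, sum(sample(c(-1, 1), n, replace = TRUE))))
})
round(end.var / n.steps, 2)   # all near 1: variance proportional to n
```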
Only then will I feel OK about saying, take 1.96 (or 2) SDs for your confidence interval.
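For the record, that magic number falls straight out of the Normal:

```r
qnorm(0.975)            # 1.959964...: where the 1.96 comes from
diff(pnorm(c(-2, 2)))   # 0.9545: a round 2 SDs covers about 95%
```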
To get there, it might help to take a session or two and devote it to “lab”: in this case, using Fathom to make simulations of various things. We will end up generating Normal and non-Normal distributions, beefing up the probability end of things, and learning some of the procedures that connect data to sampling distributions, simulation to reality, samples to measures, and so forth.
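To be concrete about what one lab exercise might look like (the particular distributions and numbers here are placeholders of my own):

```r
# Generate Normal and decidedly non-Normal data, then look at the
# sampling distribution of the mean for each.
set.seed(4)
norm.means <- replicate(2000, mean(rnorm(25, mean = 10, sd = 2)))
skew.means <- replicate(2000, mean(rexp(25, rate = 1/10)))
sd(norm.means)    # close to 2 / sqrt(25) = 0.4
hist(skew.means)  # skewed data, yet the means already look roughly Normal
```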
About the Normal
I want to delay the Normal as long as possible; my intuition screams that once you introduce the Normal distribution, it becomes the one way to do everything easily, and calculations become more rote and rely less and less on understanding.
When I asked Beth, she said they did it because, among other things, “percentiles are too hard.” There I think Fathom has the edge; you can do percentiles in Fathom, and I’ve seen Lick regular stats kids use them effectively.
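For comparison, here is the percentile route written out (my sketch of what Fathom’s percentile tools do behind the menu):

```r
# A 95% interval straight from a simulated sampling distribution:
# no Normal table, no z, just the middle 95% of the simulated values.
set.seed(5)
phat <- rbinom(5000, 20, 0.25) / 20        # simulated sample proportions
quantile(phat, probs = c(0.025, 0.975))    # percentile interval, done
```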
But really, here, the issue is not Normal vs nonparametric; it’s percentile vs $\pm 2$ SD. With the recent lambasting of the bootstrap, and given that I’ve kind of vowed not to do my own thing this semester (in order to save my sanity), I do have to include the Normal distribution in this course. Besides, it arises naturally from the CLT: in sampling distributions of the mean, and less convincingly as an approximation for the binomial.
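On that “less convincingly”: with small n, the Normal approximation to the binomial is visibly off, which takes two lines to check (using the GM problem’s numbers):

```r
# Exact binomial tail vs. its Normal approximation (continuity-corrected).
n <- 20; p <- 0.25
exact  <- 1 - pbinom(8, n, p)                           # P(X >= 9), exactly
approx <- 1 - pnorm(8.5, n * p, sqrt(n * p * (1 - p)))  # Normal stand-in
c(exact = exact, approx = approx)   # about 0.041 vs 0.035: not so convincing
```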