If we replace the old Normal-distribution-based paradigm with a new one featuring randomization, we happily shed some peculiar ideas. For example, students will never have to remember rules of thumb about how to assess Normality or the applicability of Normal approximations.
This is wonderful news, but it’s not free. Doing things with randomization imposes its own underlying requirements; we just don’t yet know what they all are. So we should stay alert and try to identify them.
Last year, one of them became obvious: stability.
(The idea also appeared recently in something I read. I hope I find it again, or that somebody will tell me where, so I don’t go around thinking I was the first to think of it; I never am.)
What do I mean by stability? When you’re using some random process to create a distribution, you have to repeat it enough times—get enough cases in the distribution—so that the distribution is stable, that is, its shape won’t change much if you continue. And when the distribution is stable, you know as much as you ever will, so you can stop collecting data.
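You can see this stabilization happen in a quick simulation. The sketch below (mine, not part of any lesson) builds the empirical distribution of two-dice sums at several sample sizes and measures how much the shape still changes when you keep going; the `shape_change` measure is total variation distance, which I’m using just as one reasonable way to quantify “the shape won’t change much.”

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the sketch is reproducible

def dice_sum_distribution(n_rolls):
    """Empirical distribution of the sum of two dice over n_rolls rolls."""
    counts = Counter(random.randint(1, 6) + random.randint(1, 6)
                     for _ in range(n_rolls))
    return {s: counts[s] / n_rolls for s in range(2, 13)}

def shape_change(p, q):
    """Total variation distance: how different two distributions' shapes are."""
    return 0.5 * sum(abs(p[s] - q[s]) for s in range(2, 13))

# Compare the distribution after n rolls to one after ten times as many.
# As n grows, continuing changes the shape less and less: stability.
for n in (50, 500, 5000):
    change = shape_change(dice_sum_distribution(n), dice_sum_distribution(10 * n))
    print(f"n = {n:5d}: shape still changes by about {change:.3f}")
```

With 50 rolls the shape is still quite wobbly; by a few thousand it has settled down, which is the signal that you know about as much as you ever will.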
Here’s where it arose first:
It was early in probability, so I passed out the dice. We discussed possible outcomes from rolling two dice and adding. The homework was going to be to roll the dice 50 times, record the results, and make a graph showing what happened. (We did not, however, do the theoretical thing. I wrote up the previous year’s incarnation of this activity here, and my approach to the theory here.)
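If you want to see what a typical homework result looks like without rolling physical dice, here is a sketch of one simulated run of the assignment: 50 rolls of two dice, tallied as a crude text histogram, alongside the theoretical expectation (sum s can be made (6 − |s − 7|) ways out of 36). This is my illustration, not something we did in class.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the run is reproducible

# Simulate the homework: roll two dice 50 times and record the sums.
rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(50)]
counts = Counter(rolls)

# Show observed tallies next to theoretical counts out of 50.
for s in range(2, 13):
    ways = 6 - abs(s - 7)          # ways to make sum s with two dice
    expected = 50 * ways / 36      # expected count in 50 rolls
    print(f"{s:2d}: {'#' * counts[s]:<14} (observed {counts[s]:2d}, expected {expected:4.1f})")
```

With only 50 rolls, the observed graph is recognizably triangle-ish but bumpy, which is exactly what makes the students’ predictions in the next class interesting to compare against.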
But we had been doing Sonatas for Data and Brain, so I asked them, before class ended, to draw what they thought the graph would be, in as much detail as possible, and turn it in. We would compare their prediction to reality next class.