Time for a curriculum discovery! This may be old hat to others, but hey, it was new to me and I was very pleased with the idea. I’ll explain it, tell what is happening with it in the classroom, and muse briefly on the philosophical consequences. Onward!
You know those sand timers that you get in games? The ones that go one minute, or two, or three and you use them to time your turn? Our assistant head got a box of them, and has been offering them to us faculty for a while in case we wanted to use them in team-building meetings to help regulate turn-taking, i.e., keep us from running off at the mouth about our own precious problems.
She still had a lot left, and I realized they could be a great solution to a problem that has been in the back of my mind: how to teach about variability.
The nub: there are many sources of variability. Here are two:
- If you measure the same thing repeatedly, you get different results.
- If you measure different things, even if they’re supposed to be the same, you get different results.
The first has to do with the process of measurement, and the inevitable inaccuracies that result. The second has to do with genuine variation. It would be great to have a touchstone activity we could refer back to throughout the course when we want to make that distinction.
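For my own amusement (and decidedly not part of the class materials), here's a tiny Python sketch that fakes the two sources of variability; every number in it is invented for illustration, not measured:

```python
import numpy as np

rng = np.random.default_rng(0)

# Source 1: measure the SAME thing repeatedly.
# One true value; all of the spread comes from measurement error.
true_value = 90.0                                  # hypothetical duration, seconds
same_thing = true_value + rng.normal(0, 0.3, 10)   # 0.3 s of measurement noise

# Source 2: measure DIFFERENT things that are "supposed to be the same".
# Each item has its own true value, plus the same measurement noise.
true_values = rng.normal(90.0, 3.0, 10)            # genuine item-to-item variation
different_things = true_values + rng.normal(0, 0.3, 10)

print("spread (SD), same thing:      ", round(same_thing.std(), 2))
print("spread (SD), different things:", round(different_things.std(), 2))
```

The second spread comes out much bigger because it carries the made-up 3-second item-to-item variation on top of the 0.3-second measurement noise.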
The trick is to find something that has the right characteristics. I had tried to have them measure the distance from our class to the wood shop under trying and variation-inducing conditions, but those data, as interesting as they were, didn’t quite fit the bill. Enter the sand timers, with the added bonus that we often measure distance—so measuring time is a treat.

Here’s the plan:
- each group gets a sand timer and a stopwatch
- they time the sand timer repeatedly—at least 7 times—so they can make a decent box plot of the times they get.
From there on out the plan was vague, but I hoped that the variation in measurements within the group, i.e., with the same timer, would be less than the between-group variation. Then I could ask, for example: Consider using your timer; what range of times will it plausibly produce? Now consider putting all the timers in a box and pulling one out: what range of times would that process plausibly produce?
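For readers who would rather check that picture with code than in Fathom, here's a minimal matplotlib sketch of the comparison; the timer labels and times are hypothetical placeholders, not anything we measured:

```python
import matplotlib.pyplot as plt

# Hypothetical measurements (placeholders, not real data):
# each group times its own timer at least 7 times.
times = {
    "timer A": [88.2, 88.5, 88.1, 88.4, 88.3, 88.6, 88.2],
    "timer B": [92.0, 91.8, 92.3, 92.1, 91.9, 92.2, 92.0],
    "timer C": [85.6, 85.9, 85.7, 85.5, 85.8, 85.6, 85.9],
}

# One box per timer shows the within-timer spread; a pooled box
# shows the "pull a random timer out of the box" spread.
pooled = [t for ts in times.values() for t in ts]
data = list(times.values()) + [pooled]
labels = list(times.keys()) + ["all timers"]

fig, ax = plt.subplots()
ax.boxplot(data)
ax.set_xticklabels(labels)
ax.set_ylabel("measured time (seconds)")
plt.show()
```

If the vision holds, each timer's box is skinny and the pooled box is wide.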
That’s the beginning of a lesson plan. Not being a total noob, I realize that I have to collect data ahead of time to see if this vision of the data is really true. Don’t want to be surprised, after all.
So I get a timer and a stopwatch, and of course the instant that happens a whole lot of other ideas occur: How do you tell when the last grain falls? What about the variation in turning the thing over and starting? What about the variable of who is pressing the button? (All of these are part of measurement error, but still: they lead us to think about experimental design. I anticipate this being part of the discussion.) And of course, when the sand is getting near the end, what does tapping the timer do? Why, that could be an experiment on its own!

Anyway, I start collecting data and plotting it in Fathom. The first 4 points fall in two groups of two, which is not unusual; when you roll two dice, sometimes you don't get any 7s for a while. I take a few more points, expecting the middle to fill in. It doesn't. After 7 points, there are still two groups; it's kinda bimodal. See the graph at right.
This pattern persists until I realize, crikey, the pattern is too persistent. So I plot the values as a function of the run number (in Fathomspeak, the case index). And the pattern is clear: It alternates. (The aberration between index 5 and 6 is from when I missed the timing and did not record a value.) Here you go:

That is—and isn’t this delicious?—this timer is about a second and a half faster going one direction than the other.
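If you want to reproduce that diagnostic outside Fathom, plotting value against run number takes only a few lines; the values below are invented to mimic the alternating pattern, not my actual runs:

```python
import matplotlib.pyplot as plt

# Invented values mimicking the pattern: odd-numbered runs are the timer
# flipped one way, even-numbered runs the other way, about 1.5 s apart.
runs  = [1, 2, 3, 4, 5, 6, 7, 8]
times = [88.1, 89.6, 88.3, 89.5, 88.2, 89.7, 88.0, 89.6]

fig, ax = plt.subplots()
ax.plot(runs, times, "o-")   # connecting the dots makes the zigzag obvious
ax.set_xlabel("run number (case index)")
ax.set_ylabel("measured time (seconds)")
plt.show()
```

In a dot plot the two clumps just sit there looking bimodal; against the case index, the zigzag gives the game away immediately.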
Now we have to check out a different timer, hoping against hope that it’s “more different.” And it is, hooray:

Not only that, the new timer is even more dramatically lopsided: about 4 seconds' difference between one direction and the other. So not only do these apparently-identical timers vary by more than 10 seconds (in a minute and a half) from one timer to the next: they vary by a few seconds from one direction to the other, and by less than a second if you pay attention and go the same direction every time. What's more, it's clear that even with students, the button-pushing error will be much less than a second (although we can test that in class too, and we did), so we can make a case that even though there's variability galore, we can use the data to draw reliable conclusions about the nature of these timers. (And get an advantage the next time you play Perquackey…)
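To put numbers on "less than a second if you go the same direction every time," here's a sketch that splits one timer's runs by flip direction and compares medians and IQRs; again, every value is an invented stand-in:

```python
import numpy as np

# Hypothetical runs for one timer, with the flip direction recorded.
# (Invented numbers chosen to echo the ~1.5 s direction effect.)
times      = np.array([88.1, 89.6, 88.3, 89.5, 88.2, 89.7, 88.0, 89.6])
directions = np.array(["up", "down"] * 4)

def iqr(x):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return np.percentile(x, 75) - np.percentile(x, 25)

for d in ("up", "down"):
    group = times[directions == d]
    print(f"{d:>4}: median {np.median(group):.1f} s, IQR {iqr(group):.2f} s")
print(f" all: median {np.median(times):.1f} s, IQR {iqr(times):.2f} s")
```

With these stand-in numbers, the within-direction IQRs come out tiny next to the pooled IQR, which is exactly the case we want to make to the class.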
I am, needless to say, thrilled. But does all this richness disqualify the activity as an introduction to variability? I mean, if they don’t understand variability to start with, and I’m looking for a way to show them the usefulness of measures of spread such as IQR, should I look for something simpler and cleaner?
I decide: no way. This is too cool to postpone. (At this point, astute readers should be cataloguing all the things that can go wrong.) Furthermore, I realize, stopping would be falling prey to what I have called the “false prerequisite,” like when you say that you can’t investigate something or other because you don’t understand exponential growth. The kids may well return to this activity and its data when they study more advanced topics, but I ought to be able to make this make sense to these kids who have just learned about box plots and IQR.
Regarding the alternation, I wonder: will they even notice? If not, I'll leave it out and save it for discovery on a later day.
What actually happened in class? Stay tuned. But before we go, what did I learn (or re-confirm)?
- Trying stuff out yourself always yields insights.
- Real data always comes through with something interesting.