I’m working on a curriculum project associated with the Core Standards. In the high-school section on “Interpreting Categorical and Quantitative Data,” it occurred to me—as it had before when I was designing physics materials—that we really care about relationships instead of simple answers.
In physics, it was all about functions. You’re rolling a ball down a ramp. How long does it take? It depends. On what? Well, the steepness of the ramp, how far it has to roll, the moment of inertia of the ball, and so forth. We would rather have students construct the function (how the time depends on all these quantities) than simply answer the question (12.2 seconds) for some specific setup.
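To make "construct the function" concrete, here is a minimal sketch of that time function for the idealized textbook case: a body starting from rest and rolling without slipping, with no air resistance. The function name, parameter names, and the example numbers are illustrative choices, not taken from any particular curriculum.

```python
import math

def rolling_time(length_m, angle_deg, inertia_factor, g=9.81):
    """Time in seconds to roll from rest down a ramp without slipping.

    inertia_factor is I / (m * r**2): 2/5 for a solid sphere,
    2/3 for a thin hollow sphere, 0 for frictionless sliding.
    """
    # Rolling reduces the acceleration along the ramp by 1 / (1 + I/(m r^2)).
    accel = g * math.sin(math.radians(angle_deg)) / (1 + inertia_factor)
    # Constant acceleration from rest: length = accel * t**2 / 2.
    return math.sqrt(2 * length_m / accel)

# Illustrative setup (assumed values): a 2 m ramp tilted at 10 degrees.
solid = rolling_time(2.0, 10.0, 2 / 5)
hollow = rolling_time(2.0, 10.0, 2 / 3)
print(f"solid sphere: {solid:.2f} s, hollow sphere: {hollow:.2f} s")
```

The single answer for one setup is just one point on this function; the relationship is what tells you the hollow ball always loses the race.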
When I was first working on materials for Fathom, I studied what stats education looked like. As you can imagine, very early on, we look at one variable (everybody repeat, “shape, center, spread!”) and learn about box plots and mean and histograms and mean absolute deviation and all that before we look at more than one variable. That only makes sense, right?
Well. We started playing with Fathom. One of the first things I did was get access to U.S. Census microdata. (It’s much easier now, since it’s in the File/Import menu, but back then I had to work really hard to get the data.) Of course we made a histogram of age to see if it worked.
And … my … eyes … started … to … close. Shape, center, and spread. Really? Who cares? But then, the Epiphany Happened. We made a second graph—marital status, to test out a categorical variable—and selected the “never married” bar.
When we took this to a class, students immediately understood what was going on, and were obviously more interested in what the graph was saying. Part of this was because it was more about them: look! there are people between 15 and 20—that’s us!—who are married!
But more deeply (I conjecture here) it was more interesting because it was showing a relationship between variables. Age is related to marital status, and that’s more interesting than either variable on its own.
So, back to the Core Standards. Of course they begin with univariate stuff like measures of center. But why do we care about measures of center? So we can compare: one group to another, this year to last year, whatever. So when we’re learning about median and making box plots, it’s more interesting to see two of them than one (500-case sample, 2000 Census):
Now there is a story. And there are all kinds of questions you can ask about that graph. What percentage of the women earn less than the median man? How much money is that? What does that horrifying statistic actually mean?
And this, again, is about the relationship between two variables, in this case the categorical sex and the numerical income. In an ongoing stats class, we would eventually do inference on this situation, but really, looking at this bivariate relationship is the most interesting way to learn about the univariate statistics.
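The first of those questions can be answered with a few lines of code. The sketch below uses a tiny made-up sample (not the Census data behind the graph), and the function and variable names are hypothetical; it only shows the shape of the computation.

```python
from statistics import median

# Made-up incomes in dollars, for illustration only (NOT Census data).
incomes = [
    ("F", 12000), ("F", 18000), ("F", 25000), ("F", 31000), ("F", 52000),
    ("M", 15000), ("M", 28000), ("M", 36000), ("M", 47000), ("M", 80000),
]

def share_below_other_median(rows, group, reference):
    """Fraction of `group` whose value is below the median of `reference`.

    Returns (fraction, reference median).
    """
    ref_median = median(v for g, v in rows if g == reference)
    values = [v for g, v in rows if g == group]
    return sum(v < ref_median for v in values) / len(values), ref_median

share, male_median = share_below_other_median(incomes, "F", "M")
print(f"{share:.0%} of the women in this toy sample earn less than ${male_median:,}")
```

Note that the computation needs both variables at once: the categorical one to split the cases, and the numerical one to find the median and count.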
The standard goes on to specify looking at relationships between categoricals (in two-way tables) and between numericals (in scatter plots). So all three possible combinations of two variable types appear. Great!
It leaves me with some observations and questions:
- Isn’t it odd that when they talk about parallel box plots, they don’t recognize that as showing a relationship between a categorical and a numerical? They always talk about that as “comparing two groups” or “comparing two data sets.” Would explicitly calling it one of three possible relationships help us organize our thinking better?
- It’s clear that a scatter plot is the best (or at least the standard) numerical-numerical graph. Parallel box plots are almost but not quite standard for numerical-categorical, but it’s not as clear. And for categorical-categorical, I think what Fathom calls the ribbon chart is the champion, with the stacked percentage bar chart coming close. But we don’t even agree what these graphs should be called! Why don’t we have a standard? (One conjecture: it’s so easy to represent cat-cat as a two-way table that it has become the default representation. Alas, people have a hard time understanding it. We need a good graph!)
- Our three settings have associated inference procedures, but they’re by no means equal in importance in intro stats:
  - Num-Cat: difference of means t, randomization/scrambling
  - Cat-Cat: chi-square, scrambling
  - Num-Num: inference for slope? Not as common
- Similarly, we have measures of strength of association (as opposed to significance):
  - Num-Cat: effect size (difference of center ÷ spread)
  - Num-Num: correlation coefficient
  - Cat-Cat: difference of proportions; relative risk; odds ratio; and a bevy of other plausible measures. Why isn’t there a more standard approach to learning about this combination?
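To put the three settings side by side, here is a rough sketch of one measure of association for each, plus a scrambling test for the Num-Cat case. The function names are hypothetical, and "spread" is taken here to be a pooled standard deviation, which is only one of several reasonable choices.

```python
import random
from statistics import mean, stdev

def effect_size(a, b):
    """Num-Cat: difference of means divided by a pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5

def pearson_r(x, y):
    """Num-Num: correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

def two_by_two(a, b, c, d):
    """Cat-Cat: group 1 has a 'yes' and b 'no'; group 2 has c 'yes' and d 'no'."""
    p1, p2 = a / (a + b), c / (c + d)
    return {
        "difference_of_proportions": p1 - p2,
        "relative_risk": p1 / p2,
        "odds_ratio": (a * d) / (b * c),
    }

def scramble_p_value(a, b, trials=2000, seed=0):
    """Scrambling: shuffle the group labels and count how often the shuffled
    difference of means is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        hits += diff >= observed
    return hits / trials

# Made-up numbers, for illustration only.
print(effect_size([2, 4, 6], [1, 3, 5]))
print(pearson_r([1, 2, 3, 4], [2, 3, 5, 9]))
print(two_by_two(30, 70, 15, 85))
print(scramble_p_value([3, 5, 6, 8, 9], [1, 2, 4, 4, 5]))
```

Even in this small sketch the Cat-Cat function has to return three different numbers, which is the lack of a standard measure in miniature.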
And finally, the real reason I wrote this:
- Is it really true that we could usefully organize our curricular thinking around relationships? Or are we missing something important by skipping a purely univariate treatment?
Note that in my vision, you still need to learn univariate techniques—you just do so in a context where you’re comparing them to other centers and spreads.
In a different project, another writer gave me a counterexample in this problem situation:
Natalie’s mom discovers that Natalie texts 200 times per month. She’s astonished and horrified, and demands that her daughter slow down. Natalie tells her mom that 200 texts a month is not really a big deal, that she’s not unusual. They collect data from other kids and find the purely univariate distribution. (Here the problem supplies the data.) They learn that Natalie is in the second quartile after all.
Is this an unusual exception to the “rule” that everything interesting is about a relationship? Can we ignore it? Or is it the tip of an iceberg I just haven’t been looking at?
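For reference, the computation in the Natalie problem is short. The data the problem supplies is not reproduced in this post, so the sketch below uses hypothetical counts, and `quartile_of` is an illustrative helper name. It relies on `statistics.quantiles` from the Python standard library with its default settings; other quartile conventions can give slightly different cut points.

```python
from statistics import quantiles

# Hypothetical monthly text counts for other kids (NOT the problem's data).
texts_per_month = [40, 75, 120, 150, 180, 220, 260, 310, 400, 520, 650, 900]

def quartile_of(value, data):
    """Return 1, 2, 3, or 4: the quartile of `data` that `value` falls in."""
    q1, q2, q3 = quantiles(data, n=4)  # the three quartile cut points
    # Each cut point the value exceeds moves it up one quartile.
    return 1 + sum(value > cut for cut in (q1, q2, q3))

print("Natalie's quartile:", quartile_of(200, texts_per_month))
```

Here there is only one variable and one case of interest, which is what makes the example a candidate counterexample.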