or, What I should have emphasized more at NCTM
I’m just back from NCTM 2018 in Washington DC, where I gave a brief workshop introducing ideas in data science education and the use of CODAP. The group was very nice. The room, well, NCTM and the Marriott Marquis were doing their best, but we really need a different way of doing technology at these big conferences.
Anyway: at the end of a fairly wide-ranging presentation in which my main goal was for participants to get their hands dirty—get into the data, get a feel for the tools, have data science on their radar—it was inevitable that I would feel:
- that I talked too much; and
- that there were important things I should have said.
Sigh. Let’s address the latter. Here is a take-away I wish I had set up better:
In data science, things are often too complicated. So one step is to simplify things; and some data moves, by their nature, simplify.
Complication is related to being awash in data (see this post…). It can come from the sheer quantity of data, from the data being multivariate, or from the dataset simply containing a lot of stuff we’re not interested in right now.
To cut through that complication, we often filter or summarize, and to do those, we often group. To give some examples, I will look again at the data that appeared in the cards metaphor post, but with a different slant.
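If it helps to see those three moves outside of a point-and-click tool, here is a minimal sketch in plain Python. The workshop used CODAP, not code, so treat this as an illustration of the ideas rather than what we actually did. The records are made up for the example (they are not NHANES values), and the names `filter_rows`, `group_rows`, and `summarize` are my own hypothetical helpers, not part of any library.

```python
from collections import defaultdict
from statistics import mean

# Made-up records for illustration only; these are NOT NHANES values.
people = [
    {"age": 10, "sex": "F", "height_cm": 140.0},
    {"age": 10, "sex": "M", "height_cm": 138.0},
    {"age": 10, "sex": "F", "height_cm": 142.0},
    {"age": 16, "sex": "F", "height_cm": 162.0},
    {"age": 16, "sex": "M", "height_cm": 172.0},
    {"age": 16, "sex": "M", "height_cm": 176.0},
    {"age": 40, "sex": "M", "height_cm": 175.0},
]


def filter_rows(rows, min_age, max_age):
    """Filter: keep only the cases we care about right now."""
    return [r for r in rows if min_age <= r["age"] <= max_age]


def group_rows(rows, keys):
    """Group: put cases that share the same values of `keys` together."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[k] for k in keys)].append(r)
    return dict(groups)


def summarize(groups, column):
    """Summarize: replace each group with a single number, its mean."""
    return {
        key: mean(r[column] for r in members)
        for key, members in groups.items()
    }


young = filter_rows(people, 5, 19)               # drop the adults
by_age_sex = group_rows(young, ["age", "sex"])   # one group per (age, sex)
mean_heights = summarize(by_age_sex, "height_cm")

for (age, sex), height in sorted(mean_heights.items()):
    print(age, sex, height)
```

Notice the order: the filter throws away cases, the grouping organizes what is left, and the summary collapses each group to one value. Each step leaves less to look at than the one before, which is the sense in which these moves simplify.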
Here we go: NHANES data on height, age, and sex. At the end of the process, we will see this graph:
And the graph tells a compelling story: boys and girls are roughly the same height—OK, girls are a little taller at ages 10–12—but starting at about age 13, girls’ heights level off, while the boys continue growing for about two more years.
We arrived at this after a bunch of analysis. But how did we start?