Many data-oriented lessons in math and science ask students to come up with the question they want to investigate. The idea behind this is great. It ought to promote buy-in: students will want to use data to explore something they’re interested in—so we get them to come up with a question they want an answer for.
I have no trouble with that.
But I recently saw a lesson which was, essentially,
- Look at this table of data
- Come up with a question about the data
- Make a prediction
- Make a scatter plot
- Put a line on it
- Answer your question
The teacher materials even had a section for what to do if students have trouble coming up with a question, with suggestions such as asking, “How do you think the variables might be related?” Fine prompt. But how does it actually elicit a question about a table of data the student has only just seen?
Since this is in a tech-rich environment, it might be better to approach (numerical) data like this:
- Plot two variables against each other
- Look at the patterns that emerge
- See if they make sense in context
- Look for surprises
- Repeat until you find something cool
- Analyze the relationship by making lines and all that
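The browse-first loop above can be sketched in a few lines of code. Everything here is an assumption for illustration: the column names and the tiny inline table stand in for a real stats file, and the “look for surprises” step is crudely proxied by printing a correlation for every pair of variables and eyeballing the results.

```python
from itertools import combinations

# Tiny made-up table standing in for a real batting-stats file.
# Column names and values are illustrative, not actual MLB data.
players = [
    {"games": 150, "at_bats": 580, "hits": 170, "homers": 25},
    {"games": 140, "at_bats": 530, "hits": 150, "homers": 18},
    {"games": 88,  "at_bats": 0,   "hits": 0,   "homers": 0},  # an outlier, pitcher-like
    {"games": 120, "at_bats": 450, "hits": 120, "homers": 10},
    {"games": 60,  "at_bats": 200, "hits": 50,  "homers": 5},
]

def pearson(xs, ys):
    """Plain Pearson correlation; no libraries needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Browse every pair of variables and just look at what comes out.
cols = list(players[0])
for a, b in combinations(cols, 2):
    r = pearson([p[a] for p in players], [p[b] for p in players])
    print(f"{a} vs {b}: r = {r:.2f}")
```

In a real classroom you’d use scatter plots rather than correlation printouts, but the structure is the same: look at all the pairings first, then chase whatever looks interesting.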
At this point, you could retrospectively come up with a question, but chances are good that the question—since it really wasn’t a question you were interested in to begin with—will be lame.
I have batting records for MLB hitters from 2009. The dataset has games, at-bats, hits, doubles, triples, homers, all that stuff. What questions do you have about relationships among the variables?
Neither do I. I have some interest (and let’s assume that you do too), and specific questions (e.g., “Where is Bengie Molina?”), but no relationship questions. Instead, I just want to look and see what I find.
So I make a bunch of scatter plots. I make up stories about why they’re shaped the way they are, without having any questions in my head. Here’s one of the plots:
I’ve even put a relevant line on it. (What does the slope mean? That’s an interesting question—but not the kind of question the lesson is talking about…)
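Putting that line on the plot is ordinary least squares, and the slope comes straight out of the closed form. A minimal sketch, with made-up (games, at-bats) pairs standing in for the real 2009 values, where the slope reads as “extra at-bats per extra game played”:

```python
def least_squares(xs, ys):
    """Closed-form simple linear regression: ys ~ slope * xs + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical (games, at_bats) pairs, not the actual 2009 stats.
games   = [150, 140, 120, 60, 88]
at_bats = [580, 530, 450, 200, 0]

slope, intercept = least_squares(games, at_bats)
print(f"about {slope:.1f} extra at-bats per extra game played")
```

That interpretation of the slope is exactly the kind of question that emerges only after you see the plot.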
Instead, the school lesson wants a kid to write, “How is the number of at-bats for an MLB player associated with the number of games they play in?”
Do you feel the lameness? This question arises only because we have to come up with a question for a school assignment. We’re interested in the relationship, sure, once we see it, but it wasn’t a question we had before we saw the data.
This is why I like the idea of having kids mess with data and then make a claim they can buttress with data. I describe that here.
As to questions, looking at the scatter plot, the big question for me—which I would never have come up with beforehand, only after messing with the data—is, “who are those players at the bottom with a bunch of games and very few at-bats?”
Because you don’t have the data in front of you, I’ll at least identify the guy at the far right of that plume of points: “Perpetual” Pedro Feliciano, Mets, 88 games, no at-bats.
and we’re back…
So there are at least two kinds of legitimate questions:
- Sometimes you have a genuine question even before you get data. That should propel you to get data and answer it.
- In messing around with data, you get curious about the data you see and wonder about the shape or specific features. That creates questions: Who are those guys? What does the slope mean?
These are very different kinds of questions. I think we can expect students to have the second, even insist on it. But it will be more unusual for them to have the first. And it will help if we don’t force them to ask a kind of question they don’t really have.
Using “claims” is one approach that doesn’t seem to have that problem; I wonder what others we could find.
Questioning Research Questions
On a related subject, grant proposals and dissertations often seem to require “research questions,” which are like the first kind. I think many of them are asked, well, not retrospectively but justificationally, like this:
We want support to create cool educational solution S. We have some evidence that S works already. We need more evidence to get support, though, so we look around for a measurable effect M (ideally with a rich literature) that S might cause. But to get approved, we need a question. So we ask, “Does S result in M?”
This is not what we actually care about. And this effect M is probably not what we think matters most. But it’s measurable, and we can dig up plenty of citations.
Makes me wonder if there’s an alternative system.