## Trust the Data: A good idea?

When we last left our hero, he was wringing his hands about teaching stats and being behind; we saw a combination of atavistic coverage-worship and stick-it-to-the-man, can-do support for authenticity in math education. The gaping hole in the story was what was actually happening in the classroom. The plan in this post is to describe an arc of lessons we’ve been doing, tell what I like about it, and tell what I’m still worried about. Along the way we’ll talk about trusting the data. Ready? Good.

You know how students are exposed to proportional reasoning in Grade 5 or earlier, and they spend most of their middle-school years cementing this essential understanding? And how, despite all this, a lot of high-school students (and college students, and adults) seem not to have exactly mastered proportional reasoning?

I figured this was likely to be the case in my class, so when someone showed me the Kaiser State Health Facts site, I jumped right in, and pulled the class in with me. In it, you find all kinds of stats, for example, this snip from a page about New Mexico:

When you see something like this, you can’t make sense of it until you know more. For example, what does the 96 mean? You have to look more carefully at the page to discover that it’s “per 100,000 population.” And nowhere do you see that it’s also “per year.”

But once you decode it, you can answer some questions. An obvious one is, “how many teenagers died in New Mexico that year?” Before we jump into proportions, though, let’s point out that this is probably not a very interesting question unless you live in New Mexico, and maybe not even then.
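For the record, the arithmetic behind that question is a single proportion: the rate per 100,000, times the population, divided by 100,000. Here is a minimal sketch in Python. The function name is my own, and the population figure is a made-up placeholder for illustration, not a number from the site.

```python
def count_from_rate(rate_per_100k, population):
    """Expected number of events in one year, given a yearly rate per 100,000 population."""
    # Multiply first, then divide, so whole-number inputs give a clean result.
    return rate_per_100k * population / 100_000


# HYPOTHETICAL population, for illustration only (not taken from the site).
teen_population = 150_000

# 96 is the "per 100,000 population" (and per year) rate from the page.
print(count_from_rate(96, teen_population))  # prints 144.0
```

The point of the exercise is the decoding, not the number: you cannot run this calculation until you know that 96 means “per 100,000 population, per year.”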

So I just did one quick example in front of the kids, and then the assignment was to spend at least 15 minutes on the site, find some rate of any interest at all, decode it, and report one calculation you can make. We started in class. Kids found things that interested or horrified them. Abortion, pregnancy, and STD rates figured prominently.

For example:

## Shakespeare, Cervantes, and Bush: Can you tell them apart statistically?

There’s a great activity at the beginning of Workshop Statistics where kids write a couple of sentences about why they’re taking the course, and then construct the distribution of word lengths. The main point is to ask, “are all word lengths the same?” Answer: no, duh. Right: they vary. It’s not that an individual word changes its length, but that the quantity “word length” varies from word to word. So it’s a variable, in a way that’s a little different from the variables they’re used to from algebra.

But what does the distribution look like? Rather than look it up, I found Hamlet’s “to be or not to be” soliloquy online, pasted it into my favorite text processor (TextMate) and did a bunch of global substitutions so that every word was on its own line. (I also stripped out hyphens and apostrophes and other punctuation, which may not always be appropriate, but never mind. But I think of ’tis as a three-, not a four-letter word.) Then a quick dump into a Fathom collection, and a new attribute (or variable) with a formula like stringLength(WORDS) and you’re all set. This process takes enough fluency that it’s an inappropriate activity for the kids in my class, at least, but the results are interesting enough to share, as in the illustration at right.
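If you don’t have TextMate and Fathom handy, the same steps can be roughed out in a few lines of Python. This is only a sketch under my own assumptions: apostrophes and hyphens are deleted outright (so ’tis counts as three letters, and a hyphenated word becomes one long word), and every remaining run of letters counts as a word. The function name and the one-line sample text are placeholders, not the full soliloquy.

```python
import re
from collections import Counter


def word_lengths(text):
    """Return the length of each word, after deleting apostrophes and hyphens."""
    # Delete straight and curly apostrophes and hyphens (assumption: delete, not split).
    stripped = re.sub(r"['’‘\-]", "", text)
    # Treat every remaining run of letters as one word.
    words = re.findall(r"[A-Za-z]+", stripped)
    return [len(word) for word in words]


# Short sample only; paste in the whole soliloquy to get the real distribution.
sample = "To be, or not to be, that is the question"
distribution = Counter(word_lengths(sample))

# Crude text histogram: one row per word length, one mark per word.
for length in sorted(distribution):
    print(length, "#" * distribution[length])
```

The list comprehension plays the role of the stringLength(WORDS) formula: one length per word, ready to be tallied.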

## Survived Day One

Whew.

I have been agonizing (all my friends and family will corroborate) about what to say and do on the first day. A lot of the worry has been about how to set the right tone. If we’re going to try to play the whole game (Perkins; Buell), I want the kids to know right away the game we will be playing. Which meant I had to decide what that was: what do I think is most important for them to learn?

Well. I survived the first class. Some things went better than others. But I want to acknowledge here two good decisions I made.

Background: the whole school is going through a re-writing of the mission statement. There is even a mission statement task force, on which I thankfully do not serve. But despite how horrendously dreary and time-wasting mission statements can be, I was surprised that I actually like the new mission statement a lot. It accomplishes its purpose well. With that in mind, could I capture what’s important in a couple of sentences? Make a stat class mission statement?

Here’s what I came up with after thinking about a lot of different (and probably equivalent) ways to carve up the territory:

In this class, you will

• Learn to make effective and valid arguments using data
• Become a critical consumer of data

## Clap Speed Follow-Up

In this post, we saw Kent “Toast” French, the world’s fastest clapper, clap at a rate he claimed to be 14 claps per second. I said I thought I could use WireTap Studio to look at the data. Sure enough, it works; here is a screen shot of part of the audio. I get more like 13 cps, or maybe a little less. I have not looked through the whole sequence to see if he ever hit 14.

It would be lovely to use something like Fathom for the whole clip so we could calculate each interval and see how that changes over time.
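Until then, once you have read the clap times off the waveform by hand, the interval calculation is simple enough to sketch in Python. The timestamps below are invented for illustration; they are not measurements from the video, and the function names are my own.

```python
def intervals(times):
    """Gaps, in seconds, between successive claps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def claps_per_second(times):
    """Average rate over the whole run: n claps span n - 1 intervals."""
    return (len(times) - 1) / (times[-1] - times[0])


# HYPOTHETICAL clap times in seconds (made up, not read from the actual audio).
clap_times = [0.000, 0.075, 0.152, 0.230, 0.305]

print(intervals(clap_times))
print(claps_per_second(clap_times))
```

Note the fencepost issue: five claps make four intervals, so dividing the number of claps by the elapsed time would overstate the rate.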

## Reaction Shot: Learning the Rules to a Game

Riley Lark just posted the first “Flunecy” episode; the last paragraph reminds me of game rules. He asks:

how do you decide where to draw the line?  When do you say “this is fundamental and we need to understand it before we move on,” and when do you say, “you can sort of see how this works from this picture; now let’s move on?”

I wonder how parallel this is to learning to play a board game or a card game?

Usually somebody is there who knows the rules, and you hear the basic idea and a few tips, and everybody agrees that we’ll all start playing and learn as we go. When is that sufficient, and when do you actually have to read the rules?

I think that reading and internalizing rules is an interesting skill. Does that skill help with mathematics—or is it just something (like number theory, Riley might agree) that gives you formal underpinnings but is not really essential to becoming mathematically powerful? Don’t know.

I made some curriculum having to do with this. Like many math teachers, I like NIM games, but I’ve gotten tired of explaining the rules. So I made NIM problems where groups also have to learn the rules without prior explanation.

This is one of those cooperative-learning deals where each group gets 4–6 cards that they deal out; each member can look only at their own cards; they can share the information; the group has a problem to solve. You’re probably familiar with the format.

In this case, the problem is to learn to play the game and figure out how to win. The image at right links to a pdf. Print it out, cut it up, and pass out the cards. Seems to work pretty well.

This is from United We Solve; I learned this NIM variant in Winning Ways.

## WCYDWT: World’s Fastest Clapper

You may have seen this, but I look at it as:

• A chance to see if I can embed video
• An homage to dy/dan
• A way to thank the original poster, Simon Job, of Oz, who won the Annual Report contest for 2008.
• A way to poke the Fathomistas to see if we can ever get audio from MP3s or on-board mikes to act like a Vernier microphone so we can collect the data!
• A really interesting measurement task. And I love measurement.

One thing I love about this is that it immediately evokes Dan Meyer’s perplexity: did he get the rate he claims? Another is that it’s a clear win for technology. And another is that you just know that kids will (a) be impressed and (b) want to try it themselves.

Here is Simon’s original post with this video, which includes a link to an actual lesson and instructions for using the technology he used.

Me, I’m gonna try WireTap Studio, my audio-gleaner of choice. I’d look for something free, but this I already have.

## Outliers in the NYT: Reflections on normality

I need a good system to deal with those moments when you’re reading the news or listening to NPR and they bring up something that could fit into an actual lesson, connecting math to everyday life. This probably happens more when thinking about teaching stats than with other areas of math. Of course I have thought of clipping the article, and I have several folders on my computer, but I can never find them. Here is another attempt: blog about them! And we get a new category, Data in the News.

Onward! Yesterday’s NYT prints what appears online as a blog post by Carl Richards. It makes the point that we often assume erroneously that everything is normally distributed (yay!) and that this affects our expectations about, for example, investing. The outliers, he says, are much more salient than we think they would be. And then we get this delicious passage:

If you take the daily returns of the Dow from 1900 to 2008 and you subtract the 10 best days, you end up with about 60 percent less money than if you had stayed invested the entire time. I know that story has been told by the buy-and-hold crowd for years, but what you don’t hear very often is what happens if you were to miss the worst 10 days. Keep in mind that we are talking about 10 days out of 29,694. If you remove the worst 10 days from history, you would have ended up with three times more money.

This is interesting in itself, but in terms of my desire for kids to get data goggles and to look at claims and cry, “evidence please!” this is perfect. Because we can do just that: go online, get the data he’s talking about, load it into Fathom, and see if this claim is correct.
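Outside Fathom, the check might look something like the Python sketch below. It assumes you already have the daily returns as fractions (0.01 for a 1 percent gain), and it reads “missing” a day as earning nothing that day, which is my interpretation of the claim rather than anything the article spells out. The returns listed are made up so the code runs; they are not Dow data.

```python
import math


def growth(returns, drop_best=0, drop_worst=0):
    """Growth factor of $1 after compounding daily returns, skipping the extreme days."""
    # Compounding is a product, so the order of the days does not matter
    # and we can sort to find the best and worst ones.
    ordered = sorted(returns)
    kept = ordered[drop_worst:len(ordered) - drop_best]
    return math.prod(1 + r for r in kept)


# MADE-UP daily returns for illustration only; replace with the real series.
daily_returns = [0.010, -0.020, 0.004, 0.050, -0.070, 0.002, -0.001, 0.030]

stayed_in = growth(daily_returns)
missed_best = growth(daily_returns, drop_best=2)
missed_worst = growth(daily_returns, drop_worst=2)

print(stayed_in, missed_best, missed_worst)
print("missed best / stayed in:", missed_best / stayed_in)
print("missed worst / stayed in:", missed_worst / stayed_in)
```

With the real data you would set both counts to 10 and compare the two ratios to the “about 60 percent less” and “three times more” in the quote.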

## Measurement: invented, inexact, and indirect

This is another topic I want to write about. I did speak about it a few years ago at Asilomar, but it’s still kinda half-baked and is worth revisiting here because of how it fits with what I want to do in class this year. It will be interesting to look back in May and see what its role was. So, here we go:

Like many math educators, I used to dismiss the “measurement” strand. I thought of it as the weak sister of the NCTM content standards, nowhere near the importance of geometry or algebra, or even the late lamented discrete math. But I have seen the light, and now it’s one of my favorites. Not for how NCTM represents it, but for the juicy stuff that got left out.

I like the “rule of three” slogan of the title: Invented, inexact, and indirect. Oddly, I have trouble remembering it, and I created it. This suggests that something is wrong—but for now, let’s proceed as if it were perfect.

## Tyranny of the Center

Tyranny of the Center: a favorite phrase of mine that I keep threatening to write about. Here is a first and brief stab, inspired by my having recently used it in a comment on ThinkThankThunk.

In elementary statistics, you learn about measures of center, especially mean, median, and mode. These are important values; they stand in for the whole set of data and make it easier to deal with, especially when we make comparisons. Are we heavier now than we were 30 years ago? You bet: the average (i.e., mean) weight has gone up. Would you rather live in Shady Glen than Vulture Gulch? Sure, but the median home price is a lot higher.

We often forget, however, that the mean or median, although useful in many ways, does not necessarily reflect individual cases. You could very well find a cheap home in Shady Glen or a skinny person in 2010. Nevertheless, it is true that on average we’re fatter now—so when we picture the situation, we tend to think that everyone is fatter.
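A tiny made-up example shows how both things can be true at once. In this Python sketch the two lists of weights are invented for illustration (they are not real data), and the helper name is my own. The later group has the higher mean, yet several of its members are lighter than the earlier group’s average.

```python
from statistics import mean


def values_below(values, threshold):
    """Individual values that fall below a threshold."""
    return [v for v in values if v < threshold]


# INVENTED weights in kg, for illustration only.
weights_then = [55, 62, 68, 70, 74, 79, 85, 91]
weights_now = [57, 64, 71, 75, 80, 86, 93, 102]

print(mean(weights_then), mean(weights_now))

# People in the "now" group who are lighter than the old average.
print(values_below(weights_now, mean(weights_then)))
```

The center moved, but the distributions still overlap a great deal, which is exactly what the single summary number hides.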

One of my goals is to immunize my students against this tendency to assume that all the individuals in a data set are just like some center value; I think it is a good habit of mind to try to look at the whole distribution whenever possible. Let’s look at a couple of situations so you can see why I care so much.