So what happened in class? First, you want to see the data, right?
The basic story so far is that maybe a week ago, I let the students take the measurements, uploading the data—so we could all get everyone’s measurements—using Fathom Surveys. That worked great, but there was of course not enough time to do the analysis, so that got postponed.
And we still haven’t quite gotten through it (though they have had a couple of dollops of homework to make progress), at least partly because I’m not sure of the best path to take. Next class, on Thursday, I finally have enough time allocated to do more and get to the bottom of something about variation; the next step in this thread is to do The Case of the Steady Hand.
So what actually happened and why am I a little at sea when the data are so interesting?
Yesterday’s APstat listserv had a question about Fathom:
How do I create a simulation to run over and over to pick 10 employees. 2/3 of the employees are male
Since my reply to the listserv had to be in plain old text, I thought I’d reproduce it here with a couple of illustrations…
There are at least two basic strategies. I’ll address just one; this is the quick one and uses random number-ish functions. The others mostly use sampling. If you use TPS, I think it’s Chapter 8 of the Fathom guide that explains them in excruciating detail 🙂
Okay: we’re going to pick 10 employees over and over from a (large) population in which 2/3 of the employees are male.
(Why large? To avoid “without-replacement” issues. If you were simulating layoffs of 10 employees from an 18-employee company, 12 of whom were male, you would need to use sampling and make sure you were sampling without replacement.)
(1) Make a collection with one attribute, sex, and 10 cases
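For readers who want to try the same random-number idea outside Fathom, here is a minimal Python sketch. It is not the Fathom recipe itself; the function names (`pick_employees`, `simulate`, `layoff_sample`) are made up for illustration. The first two functions assume the large-population case, where each pick is independently male with probability 2/3; the last one covers the 18-employee layoff case by sampling without replacement.

```python
import random
from collections import Counter

def pick_employees(n=10, p_male=2/3, rng=random):
    """Pick n employees from a large population.

    Each pick is independently 'male' with probability p_male,
    which is reasonable only when the population is large.
    """
    return ["male" if rng.random() < p_male else "female" for _ in range(n)]

def simulate(trials=1000, n=10, p_male=2/3, seed=None):
    """Repeat the pick many times; tally how many males each pick contained."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        sample = pick_employees(n, p_male, rng)
        counts[sample.count("male")] += 1
    return counts

def layoff_sample(n=10, males=12, females=6, rng=random):
    """Small-company case: draw n people WITHOUT replacement."""
    staff = ["male"] * males + ["female"] * females
    return rng.sample(staff, n)

if __name__ == "__main__":
    results = simulate(trials=1000, seed=1)
    for k in sorted(results):
        print(k, "males:", results[k])
```

Running `simulate` plays the role of collecting measures over and over: each trial contributes one number, the count of males among the 10 picked.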
These are something like the entire instructions for a mini-investigation that has taken much of the second and third of our class meetings:
Mess around with U.S. Census data in Fathom until you notice some pattern or relationship. Then make a claim: a statement that must be either true or false. Then create a visualization (in this case, a graph) that speaks to your claim. Then make one or two sentences of commentary. These go onto one or two PowerPoint slides.
The purpose is severalfold:
You get a chance to play with the data
You learn more Fathom features, largely by induction or osmosis or something; in any case, you learn them when you need them
You get to direct your own investigation
You get practice communicating in writing—or at least slideSpeak
I get to see how you do on all these things
We all get to try out the online assignment drop-box
In fact, it has gone pretty well. We started on Wednesday (the second class) with my demonstrating how to get anything other than the default variables. I modeled the make-a-claim and make-a-graph part by showing how to compare incomes between men and women.
There’s a great activity at the beginning of Workshop Statistics where kids write a couple of sentences about why they’re taking the course, and then construct the distribution of word lengths. The main point is to ask, “Are all word lengths the same?” Answer: no, duh. Right: they vary. It’s not that an individual word changes its length, but that the quantity “word length” varies from word to word. So it’s a variable, in a way that’s a little different from the variables they’re used to from algebra.
But what does the distribution look like? Rather than look it up, I found Hamlet’s “to be or not to be” soliloquy online, pasted it into my favorite text processor (TextMate) and did a bunch of global substitutions so that every word was on its own line. (I also stripped out hyphens and apostrophes and other punctuation, which may not always be appropriate, but never mind. But I think of ’tis as a three-, not a four-letter word.) Then a quick dump into a Fathom collection, and a new attribute (or variable) with a formula like stringLength(WORDS) and you’re all set. This process takes enough fluency that it’s an inappropriate activity for the kids in my class, at least, but the results are interesting enough to share, as in the illustration at right.
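If you would rather skip the text-editor substitutions, the same cleanup can be sketched in a few lines of Python. This is only an approximation of the steps above, not what I actually did: the function names are made up, and the rule of deleting apostrophes and hyphens before splitting is my assumption about how to make ’tis come out as three letters.

```python
import re
from collections import Counter

def word_lengths(text):
    """Return the length of each word in text, one entry per word."""
    # Delete apostrophes (straight and curly) and hyphens first,
    # so that "'tis" counts as a three-letter word.
    cleaned = re.sub(r"[’'\-]", "", text)
    # Everything else that is not a letter acts as a word separator.
    words = re.findall(r"[A-Za-z]+", cleaned)
    return [len(w) for w in words]

def length_distribution(text):
    """Tally how many words there are of each length."""
    return Counter(word_lengths(text))

def print_histogram(text):
    """Print a crude text histogram, one row per word length."""
    dist = length_distribution(text)
    for length in sorted(dist):
        print(f"{length:2d} {'#' * dist[length]}")

if __name__ == "__main__":
    print_histogram("To be, or not to be, that is the question")
```

The list returned by `word_lengths` corresponds to the new attribute computed with stringLength(WORDS): one value per case, ready to plot.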
In this post, we saw Kent “Toast” French, the world’s fastest clapper, clap at a rate he claimed to be 14 claps per second. I said I thought I could use WireTap Studio to look at the data. Sure enough, it works; here is a screen shot of part of the audio. I get more like 13 cps, or maybe a little less. I have not looked through the whole sequence to see if he ever hit 14.
It would be lovely to use something like Fathom for the whole clip so we could calculate each interval and see how that changes over time.
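The calculation itself is simple once you have the time of each clap. Here is a hedged Python sketch; the function names are invented, and the times in the example are made-up round numbers, not measurements from the clip.

```python
def clap_intervals(times):
    """Return the gap in seconds between each clap and the next."""
    return [later - earlier for earlier, later in zip(times, times[1:])]

def claps_per_second(times):
    """Average rate over the whole sequence.

    n claps span only n - 1 intervals, so divide the number of
    intervals (not the number of claps) by the elapsed time.
    """
    if len(times) < 2:
        raise ValueError("need at least two clap times")
    return (len(times) - 1) / (times[-1] - times[0])

if __name__ == "__main__":
    # Hypothetical clap times in seconds, for illustration only.
    example_times = [0.0, 0.5, 1.0, 1.5]
    print(clap_intervals(example_times))
    print(claps_per_second(example_times))
```

Plotting the intervals against clap number would show whether the pace speeds up or sags over the course of the clip.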