Capture/Recapture Part Two

Trying to get yesterday’s post out quickly, I touched only lightly on how to set up the various simulations. So consider them exercises for the intermediate-level simulation maker. I find it interesting how, right after a semester of teaching this stuff, I still have to stop and think how it needs to work. What am I varying? What distribution am I looking at? What does it represent?

Seeing how the two approaches fit together, yet are so different, helps illuminate why confidence intervals can be so tricky.

Anyway, I promised a Very Compelling Real-Life Application of This Technique. I had thought about talking to fisheries people, but even though capture/recapture somehow is nearly always introduced in a fish context, of course it doesn’t have to be. Here we go:

Human Rights and Capture/Recapture

I’ve just recently been introduced to an outfit called the Human Rights Data Analysis Group. Can’t beat them for statistics that matter, and I really have to say, a lot of the explanation and writing on their site is excellent. If you’re looking for Post-AP ideas, as well as caveats about data for everyone, this is a great place to go.

One of the things they do is try to figure out how many people get killed in various trouble areas and in particular events. You get one estimate from some left-leaning NGO. You get another from the Catholics. Information is hard to get, and lists of the dead are incomplete. So it’s not surprising that different groups get different estimates. Whom do you believe?
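The underlying trick is the same one as with the fish: treat one group’s list as the “tagged” sample and see how much of it shows up on the other group’s list. Here’s a minimal sketch with made-up numbers (all three figures below are invented for illustration, not real casualty data):

```python
# Hypothetical two-list example -- the numbers are invented.
# Suppose an NGO documents 300 victims, a church group documents 200,
# and 60 names appear on both lists. Treat the NGO list as the "tagged
# fish"; the Lincoln-Petersen estimate of the total death toll is:
ngo, church, overlap = 300, 200, 60
estimated_total = ngo * church / overlap
print(estimated_total)  # 1000.0
```

The real work, of course, is in matching names across messy lists and in handling more than two lists, but the arithmetic at the core is exactly capture/recapture.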

Continue reading

Posted in content, curriculum development, philosophy, simulation | Tagged , , | 1 Comment

Capture/Recapture Part One

Kids doing capture/recapture. From Dan Meyer.

If you’ve been awake and paying attention to stats education, you must have come across capture/recapture and associated classroom activities.

The idea is that you catch 20 fish in a lake and tag them. The next day, you catch 25 fish and note that 5 are tagged. The question is, how many fish are in the lake? The canonical answer is 100: having 5 tagged in the 25 suggests that 1/5 of all fish are tagged; if 20 fish are tagged, then the total number must be 100. Right?

Sort of. After all, we’ve made a lot of assumptions, such as that the fish instantly and perfectly mix, and that when you fish you catch a random sample of the fish in the lake. Not likely. But even supposing that were true, there must be sampling variability: if there were 20 tagged out of 100, and you catch 25, you will not always catch 5 tagged fish; and then, looking at it the twisted, Bayesian-smelling other way, if you did catch 5, there are lots of other plausible numbers of fish that might be in the lake.

Let’s do those simulations.
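A minimal sketch of the first simulation, in Python rather than Fathom: fix the lake at 100 fish with 20 tagged, catch 25 over and over, and look at how the number of tagged fish varies.

```python
import random

def recapture_counts(pop=100, tagged=20, caught=25, trials=10000):
    """Catch `caught` fish at random (without replacement) from a lake
    of `pop` fish, `tagged` of them tagged; count tags in each catch."""
    lake = [1] * tagged + [0] * (pop - tagged)
    return [sum(random.sample(lake, caught)) for _ in range(trials)]

counts = recapture_counts()
# 5 tagged is the most likely catch, but the counts spread well beyond
# it -- and each different count gives a different Lincoln-Petersen
# estimate, N-hat = tagged * caught / recaptured.
```

The second, other-way-around simulation varies the lake size instead, asking which population sizes plausibly produce exactly 5 tagged in a catch of 25.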

Continue reading

Posted in content, curriculum development, philosophy, Randomization | Tagged , | 2 Comments

Talking is so not enough

We’re careening toward the end of the semester in calculus, and I know I’m mostly posting about stats, but this just happened in calc and it applies everywhere.

We’ve been doing related rate problems, and had one of those classic calculus-book problems that involves a cone. Sand is being added to a pile, and we’re given that the radius of the pile is increasing at 3 inches per minute. The current radius is 3 feet; the height is 4/3 the radius; at what rate is sand being added to the pile?

Never mind that no pile of sand is shaped like that—on Earth, anyway. I gave them a sheet of questions about the pile to introduce the angle of repose, etc. I think it’s interesting and useful to be explicitly critical of problems and use that to provoke additional calculation and figuring stuff out. But I digress.
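For the record, the calculation itself is short once you use the height-radius relationship to get volume in terms of r alone. A quick check of the arithmetic (assuming, as the book intends, that you convert the 3-foot radius to inches to match the 3-inches-per-minute rate):

```python
from math import pi

# Cone with h = (4/3) r has V = (1/3) pi r^2 h = (4/9) pi r^3,
# so dV/dt = (4/3) pi r^2 * (dr/dt).
r = 36.0       # current radius in inches (3 feet)
dr_dt = 3.0    # inches per minute
dV_dt = (4 / 3) * pi * r**2 * dr_dt
print(round(dV_dt))   # 16286 cubic inches per minute (exactly 5184*pi)
```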

Continue reading

Posted in class reflection, content, philosophy, self-flagellation, Uncategorized | Tagged , | 1 Comment

Coming (Back) to Our Census

Reflecting on the continuing, unexpected, and frustrating malaise that is Math 102, Probability and Statistics, one of my ongoing problems has been the deterioration of Fathom. It shouldn’t matter that much that we can’t get Census data any more, but I find that I miss it a great deal; and I think that it was a big part of what made stats so engaging at Lick.

So I’ve tried to make it accessible in kinda the same way I did the NHANES data years ago.

This time we have Census data instead of health data. At this page here, you specify what variables you want to download, then you see a 10-case preview of the data to see if it’s what you want, and then you can get up to 1000 cases. I’m drawing them from a 21,000-case extract from the 2013 American Community Survey, all from California. (There are a lot more cases in the file I downloaded; I just took the first 21,000 or so, so we could get an idea of what’s going on.)

Continue reading

Posted in class reflection, curriculum development, philosophy, self-flagellation, technology | Leave a comment

Blood in the Aisles

I don’t quite know how Beth does it! We’re using Beth Chance and Allan Rossman’s ISCAM text, and on Thursday we got to Investigation 1.6, which is a cool introduction to power. (You were a .250 hitter last season; but after working hard all winter, you’re now a .333 hitter. A huge improvement. You go to the GM asking for more money, but the GM says, I need proof. They offer you 20 at-bats to convince them you’ve improved beyond .250. You discover, through the applets, that you have only a 20% chance of rejecting their null, namely, that you’re still a .250 hitter.)
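That 20% figure checks out in a quick simulation outside the applets. A sketch (the 5% significance cutoff below is my assumption for what would convince the GM; it isn’t stated in the excerpt above):

```python
import random
from math import comb

def binom_tail(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

n, null_p, true_p, alpha = 20, 0.25, 1 / 3, 0.05
# Smallest hit count that a .250 hitter reaches less than 5% of the time:
crit = next(c for c in range(n + 1) if binom_tail(c, n, null_p) <= alpha)

trials = 20000
at_bats = lambda: sum(random.random() < true_p for _ in range(n))
power = sum(at_bats() >= crit for _ in range(trials)) / trials
# crit comes out to 9 hits, and power lands near 0.19 -- the
# "only a 20% chance" of the investigation.
```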

I even went to SLO to watch Beth Herself run this very activity. It seemed to go fine.

But for my class, it was not a happy experience for the students. There was a great deal of confusion about what exactly was going on, coupled with some disgruntlement that we were moving so slowly.

A number of things may be going on here: Continue reading

Posted in class reflection, content, philosophy, self-flagellation | Tagged | Leave a comment

Quick Check-in

Okay: one class down, 27 to go. The big problem right now is scheduling “lab” time, an extra hour a week that will make up the rest of the time we need to get through the material and learn the stuff that’s not in the ISCAM text, such as EDA and more probability.

I do not yet have a sense of how fast we can get through some of the investigations; I have hopes that once we get the hang of it, some can be slower and more thoughtful, while others can be more practice- and application-y.

I did start with good old Aunt Belinda, for comfort’s sake. It’s odd; I may go more slowly—too slowly—when I’m more familiar with the approach.

I’ll know a lot more next week.

Posted in Uncategorized | Leave a comment

Teaching Stats Again

It’s Sunday. On Thursday, Math 102—Statistics and Probability—has its first meeting at Mills College, and I am allegedly in charge. This is a one-semester, college-level course with calculus required, in contrast to the year-long, non-AP, high-school classes I taught a few years ago.

So we will have to move pretty fast, but the students have more experience, which I hope will mostly be a good thing.

I’ve just come back from a few days at Cal Poly, watching Beth Chance and Allan Rossman actually teaching their courses, to see what the masters look like in action. It was inspiring and daunting. One thing Beth said that made me grimace was how important it was to take a few minutes to reflect on what worked. So here I am, gonna try again. I have hopes but make no promises, as this semester will be packed: I’m also teaching Calculus I and Multivariable, two more courses I’ve never taught before. I took them in college, and did well, though; OTOH, it’s been a long time since Green’s Theorem: my 40th reunion is this spring.

So for any of you watching, some early remarks:

  • We’ll be using Beth and Allan’s newest offering, the “ISCAM” text.
  • I will of course be using a simulation-based approach to inference. ISCAM starts that way but quickly (I think) brings in Normal-based inference and t procedures. I’m re-ordering some of their investigations to bring the Normal in later.
  • Students get Fathom for free, still, so we’ll be using that; I’ll write Fathom-based instructions to replace the ones ISCAM uses for R. It will mostly be fine; I think I saw one thing in the R code that I didn’t know how to do in Fathom.
  • At the same time, Fathom has trouble right now: under Mavericks data import from Census or the Web is broken. That was so great in the past, but now many of my handouts from before will no longer work. Arrgh.
  • Simulation-based inference is a big enough deal now that some of the Big Dogs of the movement have a blog.
  • I hope to get a link to have my students do the CAOS test so we can compare. It will also give me a nice pre-assessment so I have a clue what they know about simple stuff.
Posted in Uncategorized | Leave a comment

Bayes is Baaack


Screen shot from Fathom showing prior (left) and posterior (right) distributions for a situation where you flip a coin 8 times and heads comes up once. Theta is the imagined probability of heads for the coin.

Actually teaching every day again has seriously cut into my already-sporadic posting. So let me be brief, and hope I can get back soon with the many insights that are rattling around and beg to be written down so I don’t lose them.

Here’s what I just posted on the apstat listserv; refer to the illustration above:

I’ve been trying to understand Bayesian inference, and have been blogging about my early attempts both to understand the basics and to assess how teachable it might be. In the course of that (extremely sporadic) work, I just got beyond simple discrete situations, gritted my teeth, and decided to tackle how you take a prior distribution of a parameter (e.g., a probability) and update it with data to get a posterior distribution. I was thinking I’d do it in Python, but decided to try it in Fathom first.

It worked really well. I made a Fathom doc in which you repeatedly flip a coin of unknown fairness, that is, P( heads ) is somewhere between 0 and 1. You can choose between two priors (or make your own) and see how the posterior changes as you increase the number of flips or change the number of heads.

Since it’s Fathom, it updates dynamically…
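For anyone who does want the Python version: the same update is only a few lines with a grid approximation. A sketch using a flat prior and the coin from the screen shot (8 flips, 1 head); the Fathom document’s actual priors may differ.

```python
flips, heads = 8, 1
thetas = [i / 1000 for i in range(1001)]        # grid of candidate P(heads)
prior = [1.0] * len(thetas)                     # flat prior; swap in your own
likelihood = [t**heads * (1 - t)**(flips - heads) for t in thetas]
posterior = [p * l for p, l in zip(prior, likelihood)]
total = sum(posterior)
posterior = [p / total for p in posterior]      # normalize to sum to 1
mode = thetas[posterior.index(max(posterior))]  # 0.125, i.e. heads/flips
```

Change `flips` and `heads` and re-run to watch the posterior pile up; that’s the by-hand version of Fathom’s dynamic update.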

Not an AP topic. But should it be?

Here’s a link to the post, from which you can get the file. I hope you can get access without being a member. Let me know if you can’t and I’ll just email it to you.

Posted in Bayesian, content, Fathom tips, modeling, technology | Leave a comment

Science and Bayes

Right now, I’m pedaling really hard as I’m teaching a super-compressed (3 hours per day) math class for secondary-credential students. That’s my excuse for the slow-down in Bayes posts. The other being the ongoing problem that it takes me hours to write one of those; how Real Bloggers (the ones with more than about 6 readers) manage it I still do not understand.

So yesterday I dropped into my students’ morning class (physics) and heard the instructor (Dave Keeports) discuss the nature of science. Right up my alley, given my heartbreaking NSF project on teaching about the nature (and practice) of science. Also, the underlying logic of (Frequentist) statistical inference is a lot like the underlying logic of science (I’ve even written about it, e.g., here).

Anyway: Dave emphasized how you can never prove that a hypothesis is true, but that you can prove it false. Then he went on a little riff: suppose you have a hypothesis and you perform an experiment, and the results are just what your hypothesis predicts. Does that prove the hypothesis is true? (“No!” respond the students) Okay, so you do another experiment. Now do you know? (“No!”) But now you do a few dozen experiments, coming at the problem from different angles. Now do you know it’s true? (“No!”) 

But wait—don’t you eventually get convinced that it’s probably true? He went on to talk about how, when we have a great body of evidence and general agreement, hypotheses can become “Laws,” and somewhere in there, we have coherent collections of hypotheses and data that warrant calling something a “theory,” at least in common parlance.

He didn’t stress this, but it was really interesting to see how he slid from firm logic to the introduction of opinion. After all, what constitutes enough evidence to consider a hypothesis accepted? It’s subjective. And it’s just like Bayesian inference, really just like our hypothesis about the coin: each additional head further cements our belief that the coin is double-headed, but it’s always possible that it was a fair coin.
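You can even put numbers on that “cements our belief.” A sketch, assuming a 50/50 prior between a fair coin and a double-headed one (my numbers, not Dave’s):

```python
def p_double_headed(n_heads, prior=0.5):
    """Posterior probability the coin is two-headed after seeing
    n_heads heads in a row, starting from a 50/50 prior."""
    like_double = 1.0            # a two-headed coin always shows heads
    like_fair = 0.5 ** n_heads   # a fair coin does so with prob (1/2)^n
    return prior * like_double / (prior * like_double + (1 - prior) * like_fair)

beliefs = [round(p_double_headed(n), 3) for n in range(5)]
# [0.5, 0.667, 0.8, 0.889, 0.941] -- climbing toward 1, never reaching it
```

Which is exactly the physics situation: the posterior approaches certainty but never gets there, and deciding when it’s “close enough” is the subjective part.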

Philosophers of science must have applied Bayesian reasoning to this issue. 

Posted in Bayesian, content, philosophy | Tagged , , | Leave a comment

A Bayesian Example: Two coins, three heads.

As laid out (apparently not too effectively) here, I’m on a quest, not only finally to learn about Bayesian inference, but also to assess how teachable it is. Of course I knew the basic basics, but anything in stats is notoriously easy to get wrong, and hard to teach well. So you can think of this in two complementary ways:

  • I’m trying to ground my understanding and explanations in basic principles rather than leaping to higher-falutin’ solutions, however elegant; and
  • I’m watching my own wrestling with the issues, seeing where I might go off-track. You can think of this as trying to develop pedagogical content knowledge through introspection. Though that sounds pretty high-falutin’.

To that end, having looked critically at some examples of Bayesian inference from the first chapters of textbooks, I’m looking for a prototypical example I might use if I were teaching this stuff.  I liked the M&Ms example in the previous post, but here is one that’s simpler—yet one which we can still extend.

There are two coins. One is fair. The other is two-headed. You pick one at random and flip it. Of course, it comes up heads. What’s the probability that you picked the fair coin?
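(Bayes’ theorem says 1/3, not the 1/2 most people guess: P(fair | heads) = (1/2)(1/2) / [(1/2)(1/2) + (1/2)(1)]. A quick simulation sketch for the skeptical:)

```python
import random

# Pick fair or two-headed at random, flip once; among the trials that
# come up heads, how often was the coin the fair one?
trials, heads_total, fair_given_heads = 100000, 0, 0
for _ in range(trials):
    fair = random.random() < 0.5               # which coin did we grab?
    heads = random.random() < 0.5 if fair else True
    if heads:
        heads_total += 1
        fair_given_heads += fair
estimate = fair_given_heads / heads_total      # converges to 1/3
```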

Continue reading

Posted in Bayesian, content, curriculum development, philosophy | Tagged , , , | 1 Comment