Reflection on 538, Trump, and Bayes

Was the run-up to the recent election an example of failed statistics? Pundits have been saying how bad the polling was. Sure, there might have been some things pollsters could have done better, but consider: FiveThirtyEight, on the morning of the election, gave Trump a 28.6% chance of winning.

And things with a probability of 1 in 4 (or, in this case, 2 in 7) happen all the time.

Prediction by FiveThirtyEight on the morning of election day.

This post is not about what the pollsters could have done better, but rather, how should we communicate uncertainty to the public? We humans seem to want certainty that isn’t there, so stats gives us ways of telling the consumer how much certainty there is.

In a traditional stats class, we learn about confidence intervals: a poll does not tell us the true population proportion, but we can calculate a range of plausible values for that unknown parameter.  We attach that range to poll results as a margin of error: Hillary is leading 51–49, but there’s a 4% margin of error.
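
For a rough sense of where a figure like that 4% comes from, here is a small Python sketch. The sample size is hypothetical (not from any actual 2016 poll), and it uses the usual normal-approximation rule of roughly two standard errors:

```python
import math

# Hypothetical poll: 600 respondents, 51% supporting the leading candidate.
n = 600
p_hat = 0.51

# Standard error of a sample proportion, and the usual 95% margin of error
# (about 1.96 standard errors under the normal approximation).
se = math.sqrt(p_hat * (1 - p_hat) / n)
moe = 1.96 * se

print(f"standard error: {se:.3f}")                 # about 0.020
print(f"margin of error: +/- {moe*100:.1f} points")  # about +/- 4 points
```

With those made-up numbers, the 2-point lead is well inside the margin of error.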

(Pundits say it’s a “statistical dead heat,” but that is somehow unsatisfying. As a member of the public, I still think, “but she is still ahead, right?”)

Bayesians might say that the 28.6% figure (a posterior probability, based on the evidence in the polls) represents what people really want to know, closer to human understanding than a confidence interval or P-value.

My “d’oh!” epiphany of a couple days ago was that the Bayesian percentage and the idea of a margin of error are both ways of expressing uncertainty in the prediction. They mean somewhat different things, but they serve that same purpose.

Yet which is better? Which way of expressing uncertainty is more likely to give a member of the public (or me) the wrong idea, and lead me to be more surprised than I should be? My gut feeling is that the probability formulation is less misleading, but that it is not enough: we still need to learn to interpret results of uncertain events and get a better intuition for what that probability means.

Okay, Ph.D. students. That’s a good nugget for a dissertation.

Meanwhile, consider: we read predictions for rain, which always come in the form of probabilities. Suppose they say there’s a 50% (or whatever) chance of rain this afternoon. Two questions:

  • Do you take an umbrella?
  • If it doesn’t rain, do you think, “the prediction was wrong?”

Bayes is Baaack

Screen shot from Fathom showing prior (left) and posterior (right) distributions for a situation where you flip a coin 8 times and heads comes up once. Theta is the imagined probability of heads for the coin.

Actually teaching every day again has seriously cut into my already-sporadic posting. So let me be brief, and hope I can get back soon with the many insights that are rattling around and beg to be written down so I don’t lose them.

Here’s what I just posted on the apstat listserv; refer to the illustration above:

I’ve been trying to understand Bayesian inference, and have been blogging about my early attempts both to understand the basics and to assess how teachable it might be. In the course of that (extremely sporadic) work, I just got beyond simple discrete situations, gritted my teeth, and decided to tackle how you take a prior distribution of a parameter (e.g., a probability) and update it with data to get a posterior distribution. I was thinking I’d do it in Python, but decided to try it in Fathom first.

It worked really well. I made a Fathom doc in which you repeatedly flip a coin of unknown fairness, that is, P( heads ) is somewhere between 0 and 1. You can choose between two priors (or make your own) and see how the posterior changes as you increase the number of flips or change the number of heads.

Since it’s Fathom, it updates dynamically…

Not an AP topic. But should it be?

Here’s a link to the post, from which you can get the file. I hope you can get access without being a member. Let me know if you can’t and I’ll just email it to you.
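
For anyone without Fathom handy, here is a minimal Python sketch of the same kind of update. The grid size and the flat prior are my own illustrative choices, not necessarily what the Fathom document uses:

```python
import numpy as np

# Grid of candidate values for theta = P(heads).
theta = np.linspace(0, 1, 201)

# Prior over theta; a flat prior here, but any prior on the grid works the same way.
prior = np.ones_like(theta)
prior /= prior.sum()

# Data: 8 flips, 1 head, as in the screen shot above.
flips, heads = 8, 1

# Likelihood of the data at each candidate theta (the binomial coefficient
# is a constant, so it drops out when we normalize).
likelihood = theta**heads * (1 - theta)**(flips - heads)

# Posterior: prior times likelihood, renormalized.
posterior = prior * likelihood
posterior /= posterior.sum()

print("posterior mean of theta:", round(float((theta * posterior).sum()), 3))
```

Changing flips or heads and re-running plays the same role as dragging the sliders in the Fathom document, just without the dynamic updating.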

Science and Bayes

Right now, I’m pedaling really hard as I’m teaching a super-compressed (3 hours per day) math class for secondary-credential students. That’s one excuse for the slow-down in Bayes posts. The other is the ongoing problem that it takes me hours to write one of those; how Real Bloggers (the ones with more than about 6 readers) manage it, I still do not understand.

So yesterday I dropped into my students’ morning class (physics) and heard the instructor (Dave Keeports) discuss the nature of science. Right up my alley, given my heartbreaking NSF project on teaching about the nature (and practice) of science. Also, the underlying logic of (Frequentist) statistical inference is a lot like the underlying logic of science (I’ve even written about it, e.g., here).

Anyway: Dave emphasized how you can never prove that a hypothesis is true, but that you can prove it false. Then he went on a little riff: suppose you have a hypothesis and you perform an experiment, and the results are just what your hypothesis predicts. Does that prove the hypothesis is true? (“No!” respond the students) Okay, so you do another experiment. Now do you know? (“No!”) But now you do a few dozen experiments, coming at the problem from different angles. Now do you know it’s true? (“No!”) 

But wait—don’t you eventually get convinced that it’s probably true? He went on to talk about how, when we have a great body of evidence and general agreement, hypotheses can become “Laws,” and somewhere in there, we have coherent collections of hypotheses and data that warrant calling something a “theory,” at least in common parlance.

He didn’t stress this, but it was really interesting to see how he slid from firm logic to the introduction of opinion. After all, what constitutes enough evidence to consider a hypothesis accepted? It’s subjective. And it’s just like Bayesian inference, really just like our hypothesis about the coin: each additional head further cements our belief that the coin is double-headed, but it’s always possible that it was a fair coin.
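
To put one number on that “always possible”: in the two-coin setup (one fair, one two-headed, picked with equal probability, as in the two-coin post below), Bayes’ theorem gives

P(\text{fair} \mid n \text{ heads}) = \frac{(1/2)^n \cdot \tfrac12}{(1/2)^n \cdot \tfrac12 + 1 \cdot \tfrac12} = \frac{1}{2^n + 1}

which shrinks quickly as the heads pile up but never reaches zero.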

Philosophers of science must have applied Bayesian reasoning to this issue. 

A Bayesian Example: Two coins, three heads.

As laid out (apparently not too effectively) here, I’m on a quest, not only finally to learn about Bayesian inference, but also to assess how teachable it is. Of course I knew the basic basics, but anything in stats is notoriously easy to get wrong, and hard to teach well. So you can think of this in two complementary ways:

  • I’m trying to ground my understanding and explanations in basic principles rather than leaping to higher-falutin’ solutions, however elegant; and
  • I’m watching my own wrestling with the issues, seeing where I might go off-track. You can think of this as trying to develop pedagogical content knowledge through introspection. Though that sounds pretty high-falutin’.

To that end, having looked critically at some examples of Bayesian inference from the first chapters of textbooks, I’m looking for a prototypical example I might use if I were teaching this stuff.  I liked the M&Ms example in the previous post, but here is one that’s simpler—yet one which we can still extend.

There are two coins. One is fair. The other is two-headed. You pick one at random and flip it. Of course, it comes up heads. What’s the probability that you picked the fair coin?
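
Here is a small simulation sketch for checking an answer, including the three-heads version in the title; it is my own illustration, not the worked solution from the full post:

```python
import random

def pick_and_flip(flips):
    """Pick one of the two coins at random and flip it `flips` times.
    Returns (picked_fair, all_flips_were_heads)."""
    fair = random.random() < 0.5
    p_heads = 0.5 if fair else 1.0
    all_heads = all(random.random() < p_heads for _ in range(flips))
    return fair, all_heads

def p_fair_given_all_heads(flips, trials=200_000):
    """Estimate P(picked the fair coin | every flip came up heads)."""
    fair_and_heads = all_heads_count = 0
    for _ in range(trials):
        fair, all_heads = pick_and_flip(flips)
        if all_heads:
            all_heads_count += 1
            fair_and_heads += fair
    return fair_and_heads / all_heads_count

print(p_fair_given_all_heads(1))  # close to 1/3
print(p_fair_given_all_heads(3))  # close to 1/9
```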

Continue reading A Bayesian Example: Two coins, three heads.

The Search for a Great Bayesian Example

When we teach about the Pythagorean Theorem, we almost always, at some point, use a 3-4-5 triangle. The numbers are friendly, and they work. We don’t usually make this explicit, but I bet that many of us also carry that triangle around in our own heads as an internal prototype for how right triangles work—and we hope our students will, too. (The sine-cosine-1 triangle is another such prototype that develops later.)

In teaching about (frequentist) hypothesis testing, I use the Aunt Belinda problem as a prototype for testing a proportion (against 0.5). It’s specific to me—not as universal as 3-4-5.

Part of this Bayesian quest, I realized, is to find a great example or two that really make Bayesian inference clear: some context and calculation that we can return to, to disconfuse ourselves when we need it.

The Paper Cup Example

Here’s the one I was thinking about. I’ll describe it here; later I’ll explain what I think is wrong with it.

I like including empirical probability alongside the theoretical. Suppose you toss a paper cup ten times, and 8 of those times it lands on its side. At that point, from an empirical perspective, P( side ) = 8/10. It’s the best information we have about the cup. Now we toss it again and it lands on its side. Now the empirical probability changes to 9/11.

How can we use a Bayesian mechanism, with 8/10 as the prior, to generate the posterior probability of 9/11?
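
As a side note (my own aside, not the argument of the full post), here is what a straightforward grid update gives if you start from a flat prior over P( side ) and simply feed in the tosses; neither posterior mean lands on 8/10 or 9/11, which hints at why this turns out to be less natural than it looks:

```python
import numpy as np

# Grid of candidate values for p = P(cup lands on its side).
p = np.linspace(0, 1, 201)

def update(prior, sides, tosses):
    """Multiply a prior on the grid by a binomial likelihood and renormalize."""
    likelihood = p**sides * (1 - p)**(tosses - sides)
    posterior = prior * likelihood
    return posterior / posterior.sum()

flat = np.ones_like(p) / len(p)

after_10 = update(flat, sides=8, tosses=10)      # 8 sides in the first 10 tosses
after_11 = update(after_10, sides=1, tosses=1)   # then one more side landing

print(round(float((p * after_10).sum()), 3))  # about 0.75, not 0.80
print(round(float((p * after_11).sum()), 3))  # about 0.77, not 9/11 = 0.818...
```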

It seemed to me (wrongly) that this was a natural. Continue reading The Search for a Great Bayesian Example

Early Bump in the Bayesian Road: a Search for Intuition

Last time, I introduced a quest—it’s time I learned more about Bayesian inference—and admitted how hard some of it is. I wrote,

The minute I take it out of context, or even very far from the ability to look at the picture, I get amazingly flummoxed by the abstraction. I mean,

P(A \mid B) = \frac{P(A)P(B \mid A)}{P(B)}

just doesn’t roll off the tongue. I have to look it up in a way that I never have to with Pythagoras, or the quadratic formula, or rules of logs (except for changing bases, which feels exactly like this), or equations in kinematics.

Which prompted this comment from gasstationwithoutpumps:

I find it easiest just to keep coming back to the definition of conditional probability P(A|B) = P(A & B) / P(B). There is no complexity here…(and more)

Which is true, of course. But for this post I’d like to focus on the intuition, not the math. That is, I’m a mathy-sciencey person learning something new, trying to record myself in the act of learning it. And here’s this bump in the road: What’s up with my having so much trouble with a pretty simple formula? (And what can I learn about what my own students are going through?) Continue reading Early Bump in the Bayesian Road: a Search for Intuition

A Closet Bayesian

At least that’s how I’ve described myself, but it’s a weak sort of Bayesianism because I’ve never really learned how to do Bayesian inference.

It’s time that chapter came to a close. So, with luck, this is the first in a series of posts (all tagged with “Bayes”) in which I finally try to learn how to do Bayesian inference—and report on what happens, especially, on what is confusing.

Bayesian rumors I’ve heard

Let’s begin with what I know—or think I know—already. Continue reading A Closet Bayesian