Strom’s Credibility Criterion

Long ago, way back when Dark Matter had not yet been inferred, I attended UC Berkeley. One day, in a (possibly beer-fueled) conversation, a fellow astronomy grad student mentioned Strom’s Credibility Criterion, attributed to astronomer Stephen Strom, who was at U Mass Amherst at the time.

It went something like this:

Don’t believe any set of data unless at least one data point deviates dramatically from the model.

The principle has stuck with me over the years, bubbling in the background. It rose to the surface on a recent trip to the mysterious east (Maine) to visit a physics classroom field-testing materials from a project I advise called InquirySpace.


There is a great deal to say about this trip, including some really great experiences using vague questions and prediction in the classroom, but this incident is about precision, data, and habits of mind. To get there, you need some background.

Students were investigating the motion of hanging springs with weights attached. (Vague question: how fast will the spring go up and down? Answer: it depends. Follow-up: depends on what? Answers: many, including weight and ‘how far you pull it,’ i.e., amplitude.)

So we make better definitions and better questions, get the equipment, and measure. In one phase of this multi-day investigation, students studied how the amplitude affected the period of this vertical spring thing.

If you remember your high-school physics, you may recall that amplitude has no (first-order) effect on the period (just as weight has no effect on the period of a pendulum). So it was interesting to have students make a pre-measurement prediction (often, that the relationship would be linear and increasing) and then turn them loose to discover that there is no effect and to try to explain why.
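
For reference, here are the standard textbook results behind that claim. They assume an ideal (massless, Hooke's-law) spring and a simple pendulum making small swings. The symbols m (hanging mass), k (spring stiffness), ℓ (pendulum length), and g (gravitational acceleration) are introduced here for illustration and do not come from the classroom materials.

```latex
\begin{align*}
T_{\text{spring}}   &= 2\pi\sqrt{\frac{m}{k}}
  && \text{(no amplitude appears)} \\
T_{\text{pendulum}} &\approx 2\pi\sqrt{\frac{\ell}{g}}
  && \text{(no mass appears)}
\end{align*}
```

Real springs and larger swings add small corrections, which is why the effect is only absent to first order.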

Enter Strom, after a fashion

Let us leave the issue of how the students measured period for another post. But one very capable and conscientious group found the following periods, in seconds, for four different amplitudes:

0.8, 0.8, 0.8, 0.8

Many of my colleagues in the project were happy with this result. The students found out—and commented—that their prediction had been wrong. So the main point of the lesson was achieved. But as a data guy, I heard the echo of Stephen Strom.

I didn’t expect a dramatic deviation, but I wanted to see some variability. It seemed to me just good policy to wonder if those 0.8’s were as identical as they looked. Are they all 0.81, for example? All 0.813? Or are they really 0.79, 0.84, 0.80, and 0.78?
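
A quick sketch of why the question matters. The values below are the hypothetical ones from the paragraph above, not the students' actual measurements, and the variable names are made up for illustration.

```python
# Hypothetical unrounded periods in seconds (the example values from the text,
# NOT the students' real data).
hypothetical_periods = [0.79, 0.84, 0.80, 0.78]

# Round each value to one decimal place, as in the reported results.
reported = [round(t, 1) for t in hypothetical_periods]

print(reported)  # [0.8, 0.8, 0.8, 0.8]
```

Four different measurements collapse into four identical numbers, so the reported data alone cannot tell us which situation we are in.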

I asked the students. One girl said that they could be different, but whatever they were, they would certainly all round to 0.8, so that’s what she should report. I pushed her on this. She made two points:

  • It’s bad policy to report too many decimal places.
  • Any variability would be due to things they could not control such as wind or the way they released the spring. So any deviation was irrelevant to the main point, namely, that amplitude has no effect.

Isn’t this interesting? The student is right in so many ways. She has thought about uncontrolled variables and recognizes that they introduce variability. Her sense of the physics, and of the model, is really good.

But without measuring the variability, she can’t say how much variability there might be. She can’t detect fine deviations from her new, no-effect model.
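
To make "how much variability" concrete, here is a minimal sketch using the same hypothetical unrounded periods floated earlier (again, not the students' real data). It uses only Python's standard `statistics` module. Range and sample standard deviation are just two possible summaries, chosen here as an assumption.

```python
import statistics

# Hypothetical unrounded periods in seconds (illustrative values only).
hypothetical_periods = [0.79, 0.84, 0.80, 0.78]

# Two simple summaries of spread: the range and the sample standard deviation.
value_range = max(hypothetical_periods) - min(hypothetical_periods)
sample_sd = statistics.stdev(hypothetical_periods)

print(f"range = {value_range:.2f} s, sample SD = {sample_sd:.3f} s")
# range = 0.06 s, sample SD = 0.026 s
```

With numbers like these in hand, a student could at least say that any amplitude effect is smaller than a few hundredths of a second. The rounded 0.8's cannot support a claim like that.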

At the high-school level, though, should we be asking students to dig deeper to find the variability in data, especially in physics as opposed to stats?

I think so, but not so they can find higher-order effects or calculate some measure of spread. My gut reaction to the data was that I should see variability (like Strom) because data that’s too “good” looks faked (e.g., Mendel) or lazy. Laziness might be too harsh—but data without variability might at least indicate a lack of curiosity: the researcher stopped before they found out what was really going on.

Also, a lack of variability might result from a mistake: a problem with the procedure so that it artificially gives the same value regardless of the situation, the data-analytic equivalent of a stuck gauge. So you should check it out.
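
One informal way to express that check in code is sketched below. The helper `looks_stuck` and its `min_count` threshold are hypothetical, invented for this illustration. A `True` result does not prove anything is wrong. It is only a prompt to look more closely.

```python
def looks_stuck(readings, min_count=3):
    """Return True if there are at least min_count readings and all are identical.

    Hypothetical helper: identical readings are a red flag worth checking,
    not evidence of an error.
    """
    return len(readings) >= min_count and len(set(readings)) == 1


print(looks_stuck([0.8, 0.8, 0.8, 0.8]))      # True
print(looks_stuck([0.79, 0.84, 0.80, 0.78]))  # False
```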

Not that these students were lazy, or faked their data, or had a stuck gauge. They were hard-working and smart. But there is a data practice here, a habit of mind they didn’t have. They should see a red flag when data are “too good.” Data that fit too well should propel students at the high-school level to measure more precisely so they can find the inevitable variability.

Once they see that variability, they can try to explain it or discount it. This will be informal at first, but it will gradually become more quantitative and analytical. At the very least, they can acknowledge it in their write-up.

Core Standards Note

I was hoping I could make a great connection to the CCSS Standards for Mathematical Practice at #6, “Attend to Precision.” One could make that case, but reading the text, I think it’s a stretch: that standard is more concerned with precision in language, description, and argumentation than this kind of thing. The phrase “express numerical answers with a degree of precision appropriate for the problem context” appears, but I suspect they’re not thinking about these issues but more like “not giving too many decimal places.”

Author: Tim Erickson

Math-science ed freelancer and sometime math and science teacher. Currently working on various projects.

5 thoughts on “Strom’s Credibility Criterion”

  1. Great post and how true! However, being suspicious of very good data may be a sign of a maturity these students have not yet had a chance to acquire. What was more surprising to me was that there was a school where the students had the luxury of spending a few days on a project. Often we are rushed so that we get to cover all the topics the kids may see on the state exam. Good for this school!

    1. I think Tim has a good point here. Drawing conclusions in the real world is always messy and inexact. It’s too easy to “see” the answer we expect and ignore the detail. It’s important to know that messy data with a strong trend is usually the best we are going to get. Anything we can do to change the perception that the answers we learn in school are absolute is in the student’s best interest.

  2. My experience, doing physics experiments while home-schooling my son, is that outlier points are really common, and the measurements that involve them need to be repeated, often multiple times. Sometimes they are just clerical error, sometimes they are limitations of the instrumentation, and sometimes they are real, unmodeled phenomena (like ceramic capacitors having a voltage-dependent capacitance).
