Ping Pong Ball Bounce Redux

Long ago (2007) Bryan Cooley and I wrote a set of physics labs; in one of them we had students bounce a ping-pong ball. You know the sound; it’s like this:

Ping-pong ball bouncing on my kitchen counter

For the lab, we had students record the sound at 1000 points per second using a Vernier microphone. Using the resulting data, students could identify the times of the “pocks” and then see how the times between the pocks — the “interpock intervals” — decreased exponentially. This is a cool take on the old Algebra 2/Precalculus activity about bouncing balls where you measure drop heights; using sound and the technology, you can get more bounces and more accuracy.
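Why exponential? If each bounce keeps a fixed fraction of the ball's speed, then each flight lasts that same fraction of the one before. Here is a minimal Python sketch of the interval analysis, not the lab's actual procedure. The pock times are made up (first interval 0.5 s, each interval 0.9 times the previous one), and the function names are mine.

```python
import math

def interpock_intervals(pock_times):
    """Times between successive pocks."""
    return [later - earlier for earlier, later in zip(pock_times, pock_times[1:])]

def fit_exponential(intervals):
    """Least-squares line through (n, ln(interval)).

    Returns (first_interval, ratio) for the model interval_n = first_interval * ratio**n.
    """
    n = len(intervals)
    xs = list(range(n))
    ys = [math.log(t) for t in intervals]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), math.exp(slope)

# Made-up pock times in seconds (an assumption, not measured data).
pock_times = [0.0]
interval = 0.5
for _ in range(10):
    pock_times.append(pock_times[-1] + interval)
    interval *= 0.9

first_interval, ratio = fit_exponential(interpock_intervals(pock_times))
print(first_interval, ratio)
```

With real microphone data you would first have to find the pock times, for example by looking for samples where the sound level jumps above some threshold.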

A typical graph of the sound looks like this:

Graph of the sound from the audio above. In CODAP. Time in milliseconds.

And a graph of the interpock intervals looks like this:


Data Moves with CO2

The concentration of CO2 in the atmosphere is rising, and we have good data on that from, among other sources, atmospheric measurements that have been taken near the summit of Mauna Loa, in Hawaii, for decades.

Here is a link to monthly data through September 2018, as a CODAP document. There’s a clear upward trend.

CO2 concentration (mole fraction, parts per million) as a function of time, here represented as a “decimal year.”

Each of the 726 dots in the graph represents the average value for one month of data.

What do we have to do—what data moves can we make—to make better sense of the data? One thing that any beginning stats person might do is to fit a line to the data. I won’t do that here, but you can imagine what happens: the data curve upward, so the line is a poor model, but the positive slope of the line (about 1.5, which is in ppm per year) is a useful average rate of increase over the interval we’re looking at. You could consider fitting a curve, or a sequence of line segments, but we won’t do that either.
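For the curious, here is roughly what that line fit looks like in Python. This is only a sketch: the data are synthetic (a made-up quadratic, not the Mauna Loa record), chosen simply because they curve upward, and the function and variable names are mine.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Synthetic stand-in for the monthly data: decimal years and a made-up
# upward-curving concentration. These coefficients are assumptions.
years = [1958 + i / 12 for i in range(721)]
ppm = [315 + 0.8 * (t - 1958) + 0.012 * (t - 1958) ** 2 for t in years]

slope, intercept = fit_line(years, ppm)
residuals = [y - (slope * t + intercept) for t, y in zip(years, ppm)]

# The slope is an average rate of increase in ppm per year. The residuals
# are positive at both ends and negative in the middle, which is the
# signature of a line fitted to data that curve upward.
print(slope, residuals[0], residuals[360], residuals[-1])
```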

Instead, let’s point out that the swath of points is wide. There are lots of overlapping points. We should zoom in and see if there is a pattern.


Data Moves and Simplification

or, What I should have emphasized more at NCTM

I’m just back from NCTM 2018 in Washington DC where I gave a brief workshop that introduced ideas in data science education and the use of CODAP to a very nice group in a room that—well, NCTM and the Marriott Marquis were doing their best, but we really need a different way of doing technology at these big conferences.

Anyway: at the end of a fairly wide-ranging presentation in which my main goal was for participants to get their hands dirty—get into the data, get a feel for the tools, have data science on their radar—it was inevitable that I would feel:

  • that I talked too much; and
  • that there were important things I should have said.

Sigh. Let’s address the latter. Here is a take-away I wish I had set up better:

In data science, things are often too complicated. So one step is to simplify things; and some data moves, by their nature, simplify.

Complication is related to being awash in data (see this post…); it can come from the sheer quantity of data as well as things like being multivariate or otherwise just containing a lot of stuff we’re not interested in right now.

To cut through that complication, we often filter or summarize, and to do those, we often group. To give some examples, I will look again at the data that appeared in the cards metaphor post, but with a different slant.
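To make filter, group, and summarize concrete outside of CODAP, here is a minimal Python sketch. The eight records are invented for illustration (they are not NHANES values), and the field and function names are my own.

```python
from collections import defaultdict

# Invented records, for illustration only.
people = [
    {"sex": "F", "age": 12, "height_cm": 150.0},
    {"sex": "F", "age": 12, "height_cm": 152.0},
    {"sex": "M", "age": 12, "height_cm": 148.0},
    {"sex": "M", "age": 12, "height_cm": 150.0},
    {"sex": "M", "age": 16, "height_cm": 172.0},
    {"sex": "M", "age": 16, "height_cm": 176.0},
    {"sex": "F", "age": 16, "height_cm": 162.0},
    {"sex": "F", "age": 40, "height_cm": 165.0},
]

def mean_height_by_group(records, min_age, max_age):
    """Filter by age, group by (sex, age), summarize each group with its mean."""
    groups = defaultdict(list)
    for record in records:
        if min_age <= record["age"] <= max_age:              # filter
            key = (record["sex"], record["age"])
            groups[key].append(record["height_cm"])          # group
    return {key: sum(heights) / len(heights)                 # summarize
            for key, heights in groups.items()}

means = mean_height_by_group(people, 5, 19)
for (sex, age), mean_height in sorted(means.items()):
    print(sex, age, mean_height)
```

Each move throws information away on purpose: the filter drops the adult, and the summary replaces every group of cases with a single number.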

Here we go: NHANES data on height, age, and sex. At the end of the process, we will see this graph:

Mean height by sex and age; 800 cases aged 5–19. NHANES, 2003.

And the graph tells a compelling story: boys and girls are roughly the same height—OK, girls are a little taller at ages 10–12—but starting at about age 13, girls’ heights level off, while the boys continue growing for about two more years.

We arrived at this after a bunch of analysis. But how did we start?


Trees. And. Diagnosis. (Part two)

Last time we introduced decision trees and a tool we’ve made to explore them. With that tool, embedded in a simple game (Arbor), you can generate data from alien creatures with a simulated malady, figure out its predictors, and make a decision tree that will let you automate its diagnosis. (Here is the link to that not-quite-game.)

Your job was to get through the diseases ague and botulosis. Today I want to reflect on those two scenarios.

Ague

Ague is ridiculously simple, and with that ridiculous simplicity, the user is supposed to be able to learn the basics of the game, that is, how to “drive” the tools. One way to figure out the disease is to sort the table by health and see what matches health. Here is what the sorted table looks like:

[Image: the ague table, sorted by health]

Just scanning the various columns, you can see that health is associated with hair color.  Pink means sick, blue means well. With that insight, you can go on to diagnose individual creatures and then make a simple tree, which looks like this:

[Image: the ague decision tree]

Although there is a lot of information in the tree, users can generally figure it out. If they (or you) have trouble, they can get additional information by hovering over the boxes or the links.
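If you like to see such things as code, here is a small Python sketch of the same reasoning: scan the columns for one that matches health, then turn it into a one-question tree. This is not how Arbor works internally. The creatures are made up, the "eyes" attribute is hypothetical, and the function names are mine; only the pink-means-sick rule comes from the scenario above.

```python
# Made-up creatures; "eyes" is a hypothetical attribute added for contrast.
creatures = [
    {"hair": "pink", "eyes": "orange", "health": "sick"},
    {"hair": "blue", "eyes": "orange", "health": "well"},
    {"hair": "pink", "eyes": "purple", "health": "sick"},
    {"hair": "blue", "eyes": "purple", "health": "well"},
    {"hair": "blue", "eyes": "orange", "health": "well"},
]

def perfect_predictors(records, target):
    """Attributes whose every value always goes with a single target value."""
    found = []
    for attribute in records[0]:
        if attribute == target:
            continue
        outcome_for_value = {}
        consistent = True
        for record in records:
            # Remember the first outcome seen for this value; any later
            # disagreement means the attribute does not settle the target.
            expected = outcome_for_value.setdefault(record[attribute], record[target])
            if expected != record[target]:
                consistent = False
                break
        if consistent:
            found.append(attribute)
    return found

def diagnose(creature):
    """The one-question tree: pink hair means sick, otherwise well."""
    return "sick" if creature["hair"] == "pink" else "well"

print(perfect_predictors(creatures, "health"))
print(all(diagnose(c) == c["health"] for c in creatures))
```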


Smelling Like Data Science

(Adapted from an after-dinner panel talk in the opening session of DSET 2017)

Nobody knows what data science is, but it permeates our lives, and it’s increasingly clear that understanding data science, and its powers and limitations, is key to good citizenship. It’s how the 21st century finds its way. Also, there are lots of jobs—good jobs—where “data scientist” is the title.

So there ought to be data science education. But what should we teach, and how should we teach it?

Let me address the second question first. There are at least three approaches to take:

  • students use data tools (i.e., pre-data-science)
  • students use data science data products 
  • students do data science

I think all three are important, but let’s focus on the third choice. It has a problem: students in school aren’t ready to do “real” data science. At least not in 2017. So I will make this claim:

We can design lessons and activities in which regular high-school students can do what amounts to proto-data-science. The situations and data might be simplified, and they might not require coding expertise, but students can actually do what they will later see as parts of sophisticated data science investigation.

That’s still pretty vague. What does this “data science lite” consist of? What “parts” can students do? To clarify this, let me admit that I have made any number of activities involving data and technology that, however good they may be—and I don’t know a better way to say this—do not smell like data science.

You know what I mean. Some things reek of data science. Google searches. Recommendation engines. The way a map app routes your car. Or dynamic visualizations like these:

Modeling Hexnut Mass

Let me encourage you to go to your hardware store and get some hexnuts. You won’t regret it. Now let’s see if I can write a post about it in under, like, four hours.

(Also, get a micrometer on eBay and a sweet 0.1 gram food scale. They’re about $15 now.)

Long ago, I wrote about coins and said I would write about hexnuts. I wrote a book chapter, but never did the post. So here we go. What prompted me was thinking about different kinds of models.

I have been focusing on using functions to model data plotted on a Cartesian plane, so let’s start there. Suppose you go to the hardware store and buy hexnuts in different sizes. Now you weigh them. How will the size of the nut be related to the weight?

A super-advanced, from-the-hip answer we’d like high-schoolers to give is, “probably more or less cubic, but we should check.” The more-or-less cubic part (which less-experienced high-schoolers will not offer) comes from several assumptions we make, which it would be great to force advanced students to acknowledge: namely, that the hexnuts are geometrically similar, and that they’re made from the same material, so they’ll have the same density.
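Here is one hedged way to do the "we should check" part in Python: if mass is proportional to some power of size, then a plot of log(mass) against log(size) is a line whose slope is that power. The sizes and masses below are invented (an exact cubic nudged by a few percent), not my measurements, and the function name is mine.

```python
import math

def power_law_exponent(sizes, masses):
    """Slope of the least-squares line through (ln size, ln mass)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(m) for m in masses]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))

# Invented data: sizes in mm, masses in grams. The wobble factors stand in
# for measurement error; none of these numbers come from real hexnuts.
sizes = [6, 8, 10, 13, 17]
wobble = [1.03, 0.98, 1.01, 0.99, 1.02]
masses = [0.004 * s ** 3 * w for s, w in zip(sizes, wobble)]

exponent = power_law_exponent(sizes, masses)
print(exponent)  # near 3 if the cubic model is reasonable
```

An exponent noticeably different from 3 would suggest that one of the assumptions (similar shapes, same density) does not hold.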

Model Shop! One volume done!

[Cover image: The Model Shop, Volume 1]

Hooray, I have finally finished what used to be called EGADs and is now the first volume of The Model Shop. Calling it the first volume is, of course, a treacherous decision.

So. This is a book of 42 activities that connect geometry to functions through data. There are a lot of different ways to describe it, and in the course of finishing the book, the emotional roller-coaster took me from great pride in what a great idea this was to despair over how incredibly stupid I’ve been.

I’m obviously too close to the project.

For an idea of what drove some of the book, check out the posts on the “Chord Star.”

But you can also see the basic idea in the book cover. See the spiral made of triangles? Imagine measuring the hypotenuses of those triangles, and plotting the lengths as a function of “triangle number.” That’s the graph you see. What’s a good function for modeling that data?

If we’re experienced in these things, we say, oh, it’s exponential, and the base of the exponential is the square root of 2. But if we’re less experienced, there are a lot of connections to be made.

We might think it looks exponential, and use sliders to fit a curve (for example, in Desmos or Fathom; here is a Desmos document with the data you can play with!) and discover that the base is close to 1.4. Why should it be 1.4? Maybe we notice that if we skip a triangle, the size seems to double. And that might lead us to think that 2 is involved, and gradually work it out that root 2 will help.

Or we might start geometrically, and reason about similar triangles. And from there gradually come to realize that the a/b = c/d trope we’ve used for years, in this situation, leads to an exponential function, which doesn’t look at all like setting up a proportion.
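To check both lines of reasoning numerically, here is a small Python sketch. It assumes the spiral is built from isosceles right triangles, with each hypotenuse serving as a leg of the next triangle. That construction is my assumption for this example, as are the function and variable names.

```python
import math

def spiral_hypotenuses(count, first_leg=1.0):
    """Hypotenuses of a chain of isosceles right triangles.

    Each triangle's hypotenuse becomes the leg of the next one.
    """
    hypotenuses = []
    leg = first_leg
    for _ in range(count):
        hypotenuse = math.hypot(leg, leg)
        hypotenuses.append(hypotenuse)
        leg = hypotenuse
    return hypotenuses

hyps = spiral_hypotenuses(8)

# Ratio from one triangle to the next: the base of the exponential.
step_ratios = [b / a for a, b in zip(hyps, hyps[1:])]

# Ratio when we skip a triangle: the "size seems to double" observation.
skip_ratios = [b / a for a, b in zip(hyps, hyps[2:])]

print(step_ratios[0], skip_ratios[0])
```

The constant step ratio is the similar-triangles proportion in disguise: the same ratio applied over and over is exactly what an exponential function is.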

In either case, we get to make new connections about parts of math we’ve been learning about, and we get to see that (a) you can find functions that fit data and (b) often, there’s a good, underlying, understandable reason why that function is the one that works.

I will gradually enhance the pages on the eeps site to give more examples. And of course you can buy the book on Amazon! Just click the cover image above.