## Data Moves with CO2

The concentration of CO2 in the atmosphere is rising, and we have good data on that from, among other sources, atmospheric measurements that have been taken near the summit of Mauna Loa, in Hawaii, for decades.

Here is a link to monthly data through September 2018, as a CODAP document. The graph shows CO2 concentration (mole fraction, in parts per million) as a function of time, represented here as a “decimal year,” and there’s a clear upward trend.

Each of the 726 dots in the graph represents the average value for one month of data.

What do we have to do—what data moves can we make—to make better sense of the data? One thing any beginning stats person might do is fit a line to the data. I won’t do that here, but you can imagine what happens: the data curve upward, so the line is a poor model, but its positive slope (about 1.5 ppm per year) is a useful average rate of increase over the interval we’re looking at. You could consider fitting a curve, or a sequence of line segments, but we won’t do that either.
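To make “about 1.5 ppm per year” concrete, here is a minimal least-squares sketch. The handful of (decimal year, ppm) pairs are made-up-but-plausible stand-ins for the 726 monthly values, and the helper name is my own:

```python
# Hypothetical sample of (decimal_year, ppm) pairs standing in for the
# 726 monthly Mauna Loa averages (real values come from NOAA/Scripps).
years = [1959.0, 1975.0, 1990.0, 2005.0, 2018.7]
ppm   = [ 315.8,  331.0,  354.2,  379.7,  405.5]

def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

slope = least_squares_slope(years, ppm)
print(f"average rate of increase: about {slope:.2f} ppm per year")
```

Even though the line is a poor model for curving data, this single number is a handy summary of the whole interval.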

Instead, let’s point out that the swath of points is wide. There are lots of overlapping points. We should zoom in and see if there is a pattern.


## Fidelity versus Clarity

Thinking about yesterday’s post, I was struck by an idea that may be obvious to many readers, and has doubtless been well explored, but it was new to me (or I had forgotten it), so here I go, writing to help me think and remember:

The post touched on the notion that communication is an important part of data science, and that simplicity aids in communication. Furthermore, simplification is part of modelmaking.

That is, we look at unruly data with a purpose: to understand some phenomenon or to answer a question. And often, the next step is to communicate that understanding or answer to a client, be it the person who is paying us or just ourselves. “Communicating the understanding” means, essentially, encapsulating what we have found out so that we don’t have to go through the entire process again.

[Figure: mean height by sex and age; 800 cases aged 5–19. NHANES, 2003.]

So we might boil the data down and make a really cool, elegant visualization. We hold onto that graphic and carry it with us mentally in order to understand the underlying phenomenon: that graph of mean height by sex and age, for example, gives us an internal idea—a model—of sex differences in human growth.

But every model leaves something out. In this case, we don’t see the spread in heights at each age, and we don’t see the overlap between females and males. So we could go further, and include more data in the graph, but eventually we would get a graph that was so unwieldy that we couldn’t use it to maintain that same ease of understanding. It would require more study every time we needed it. Of course, the appropriate level of detail depends on the context, the stakes, and the audience.

So there’s a tradeoff. As we make our analysis more complex, it becomes more faithful to the original data and to the world, but it also becomes harder to understand.

Which suggests this graphic:

## A Calculus Rant (with stats at the end)

Let’s look at a simple optimization problem. Bear with me, because the point is not the problem itself, but in what we have to know and do in order to solve it. Here we go:

Suppose you have a string 12 cm long. You form it into the shape of a rectangle. What shape gives you the maximum area?

Traditionally, how do we expect students to solve this in a calculus class? Here is one of several approaches, in excruciating detail.
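Before any calculus, note that there is a purely empirical route (my own sketch, not the post’s derivation): with a perimeter of 12 cm, width + height = 6, so we can just search numerically.

```python
# Empirical check of the string problem: the perimeter is 12 cm,
# so width + height = 6.  Try many widths and keep the best area.
best_width, best_area = 0.0, 0.0
w = 0.0
while w <= 6.0:
    area = w * (6.0 - w)     # height is whatever perimeter is left
    if area > best_area:
        best_width, best_area = w, area
    w += 0.01

print(round(best_width, 2), round(best_area, 2))  # 3.0 9.0 (the square)
```

The search lands on the 3 cm × 3 cm square, area 9 cm², which is the answer the calculus will confirm.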

## Chord Star in the Classroom

A million thanks to Zoya Voskoboynikov and her two sections of “regular” calculus at Lick for letting me come litter their otherwise pure math class with actual data. Of course, it was after the AP exam, and these are last-quarter seniors, so my being there didn’t interfere with any learning they needed to get through.

It worked great. It had what I most wanted: the aha experience of arriving at the destination by another route. Fortunately (and unsurprisingly), none of these successful math students remembered the theorem from the geometry class they took as frosh.

### What we did

1. I set up the problem and had them predict, informally, what the function would look like. The main purpose of this is to orient students to what we’ll be measuring and to the idea that if you measure two quantities, you can see their relationship in a graph.

## Beating the Modeling Drum

Hoping desperately it’s not also a dead horse…

We just did a three-post sequence about “Chord Stars,” finishing up with how we could use insights from data to find radii of curvature remotely, that is, without ever finding the center of the circle. There’s a lot to discuss about that process; this post is part of that discussion.

In particular, it’s an interesting example of modeling. Quite a while ago I was worrying about the definition of modeling, not simply to get it “right”—many people model in different ways—but rather to try to identify things that we were pretty sure demonstrated modeling. Part of my anxiety, as the Core Standards lumber into classrooms, is that people will carelessly define modeling as “real-world” (or something equally weak) and we will lose a great opportunity to improve math education.

I often think of modeling in terms of using functions to model data. That’s partly because some of the coolest, most wonderful math experiences I’ve had have revolved around finding a function that was a good approximation to data. The process of measurement, improving those measurements, finding a suitable function, getting insight about the function as I wrangled it, and getting insight into the situation and the data from the function—all that together is an intoxicating cocktail of mathy-worldy wonderfulness.

But it’s not all there is to modeling, so I want to pause to point out another modeling genre (one of the ones I listed in this old post) that just appeared in Chord Star 3, namely, modeling real-world stuff with geometrical objects.

In fact, here is a curb with tools, alongside the relevant part of a Sketchpad sketch. They clearly resemble each other, but I want to make two observations.

## Chord Star 3: Remote Radii

Suppose you find some big curved thing out in the world. Some things are curved more tightly than others. But how much more? How can we put a number on how tightly curved something is?

One way is to figure out the radius of curvature. The smaller the radius, the tighter the curve. (Would you tell students this at the beginning? Of course not. But I can’t describe how this can work without giving things away. So consider this a report on my own investigation.)

Let’s apply what we learned two posts ago. To review, we found out that if you pick a point inside a circle, and run a chord through it, the point divides the chord into two segments. The lengths of those segments are inversely proportional, that is, their product is a constant—it’s the same no matter which chord you pick.

Then, last time, we saw how that product varies with the point’s distance from the center.

Let’s see how we can use this to measure radii of curves out in the world. The cool thing is that we can do this remotely. Unlike most radii in school geometry, we can figure out the radius of curvature without ever finding the center of the circle.

The picture above is a hint. If that’s enough for you, don’t read further! Go do it!
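For readers past the spoiler point: one standard route (which may or may not be the one the picture hints at; this is my assumption) measures a chord and its sagitta. The diameter through the chord’s midpoint is itself a chord, split into lengths $s$ and $2r - s$, so the crossing-chords relation gives $(c/2)^2 = s(2r - s)$. The function name here is my own:

```python
# Remote radius from a chord and its sagitta: lay a straightedge
# across the curve, measure the chord length c and the sagitta s
# (the bulge from straightedge to curve).  Crossing chords gives
# (c/2)**2 = s * (2r - s); solving for r:
def radius_from_chord(c, s):
    return (c * c / 4 + s * s) / (2 * s)

# Sanity check against a known circle of radius 5: a chord 8 units
# long sits 3 from the center, so its sagitta is 5 - 3 = 2.
print(radius_from_chord(8, 2))  # 5.0
```

No center required: both measurements happen right at the curve.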

## Chord Star 2: Choosing different points

Last time we saw how you could make a “chord star” by picking a point inside a circle and drawing chords through that point. Then we measured the two lengths of the partial chords (let’s call them $L_1$ and $L_2$) and plotted them against one another. We got a rectangular hyperbola, suggesting (or confirming, if we remembered the geometry) that $L_1 L_2 = k$ for some constant $k$.

But we asked, “What effect does your choice of point have on the graph and the data?” So of course we’ll take an empirical approach and try it. If you have a classroom full of students, and they all use the same-sized circle and pick their own points, you can immediately compare the points they chose to the functions they generated. Or you can do it on your own. The photo shows what this might look like, and here is a detail.

Now we’ll put the data in a table, but this time:

• In addition to $L_1$ and $L_2$, we’ll record $R$, the distance from the center to the point. It may not be obvious to students at first that all points the same distance from the center (or the edge) will give the same data, but I’ll assume we get that.
• We’ll double the data by recording each $(L_1, L_2)$ pair in reverse order as well. It makes the graph look better.

Here’s the graph, coded by distance (in cm) of the point from the center.
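The pattern in that graph can be checked in simulation (a sketch of my own; the helper name is made up). For a point at distance $R$ inside a circle of radius $r$, every chord through the point gives the same product, and that product shrinks as $R$ grows:

```python
import math

# Simulated chord star: circle of radius r, interior point P = (R, 0).
# A chord through P in direction theta meets the circle where
# |P + t*(cos theta, sin theta)| = r; the two roots of that quadratic
# give the partial lengths L1 and L2.
def partial_chords(r, R, theta):
    b = R * math.cos(theta)                     # P dotted with the direction
    disc = math.sqrt(b * b + r * r - R * R)
    return disc - b, disc + b                   # L1, L2 (both positive)

r = 10.0
for R in (0.0, 3.0, 6.0):
    products = {round(partial_chords(r, R, t)[0] *
                      partial_chords(r, R, t)[1], 6)
                for t in (0.2, 1.1, 2.7)}
    print(R, products)   # one product per point: r**2 - R**2
```

So each choice of point produces its own hyperbola $L_1 L_2 = k$, with $k = r^2 - R^2$, which is exactly why coding the graph by distance separates the curves.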

## Chord Star: Another Geometry-Function-Modeling Thing

Last time I wrote about a super-simple geometry situation and how we could turn it into an activity that connected it to linear functions. What does it take to turn something from geometry into a function? This is an interesting question; in my explorations here I’ve found it helpful to look for relationships. And what I mean by that is: where do you have two quantities (in geometry, often distances, but they could be angles or areas or…) where one varies as you change the other?

So one strategy is: think of some theorem or principle, and see if you can find the relationship. To that end, remember teaching geometry, and that cool theorem that says if two chords of a circle cross, the products of their segments are equal? That’s where this comes from. Oddly, it took a while to figure out what to plot against what to get a revealing function, but here we go.

Make a circle. Pick a point not near the center, but not too close to the circle itself. Draw a chord through that point. Measure the two segments. Call them $L_1$ and $L_2$. Or even x and y. Record the data.

## Isosceles EGADs: Functions, Geometry, and Modeling

In trying to come up with more activities for EGADs (Exploring (or maybe Enriching) Geometry and Algebra through Data), the following dropped into my lap. Because it’s so simple and so interesting, I’d better write it down…

Everybody get a sheet of paper and draw an isosceles triangle. Try to make your triangle big enough to kinda fill the page, but also try to make it different from those around you. Make your triangle pretty carefully, but don’t measure and don’t use a straightedge.

Individuals can do this too, but I’m writing this as if it’s a class activity. The idea is to get a wide variety of shapes. It is not vital that these just be sketched, but (a) I think that makes the data more interesting, (b) it opens the possibility to drawing more carefully later, and (c) it’s much faster.

Measure the base angles and the vertex angle, and write them on the page.

If you need to introduce vocabulary, do it here. By the way, we don’t assume that these students know that the base angles should be the same. Also, we all know that measuring angles is hard, right?

We’re going to plot the measurements from the whole class. So write your angle measurements on the board.

You may need to help organize this. Will we plot both base angles? Up to you. If so, consider having each kid make two entries in the T-table or whatever.

Now make a graph. Put vertex angle on the horizontal axis and base angle on the vertical. Think about the range of values before you make your axes!

You may want to discuss what goes on which axis. Without having done this with kids, I bet most of us think of the vertex angle as the independent variable and base angle as the dependent. I, at least, think of the vertex angle as the defining angle in an isosceles triangle. This also has the happy consequence of requiring a change of axes in order to get the coolest version of the formula.

At any rate, the graph should look linear. Address outliers (probably due to bad measurement).

Draw the line you think best approximates the data. Find its equation.

Be ready to present your data and line, and explain as much as you can about the line. In particular, why does it have that slope and intercept?

In the spirit of SERP “Poster problems” this could be a poster-plus-gallery-walk event.
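As for why the line has the slope and intercept it does: the angle sum gives vertex + 2 × base = 180, so base = 90 - vertex/2, which is a line with slope -1/2 and intercept 90. A tiny sketch (the helper name is mine, not part of the activity):

```python
def base_angle(vertex):
    """Base angle of an isosceles triangle, in degrees, from its
    vertex angle: vertex + 2 * base = 180, so base = 90 - vertex/2."""
    return 90 - vertex / 2

# A few points on the line the class data should approximate.
for v in (20, 60, 100):
    print(v, base_angle(v))
```

The class’s hand-measured points should scatter around exactly this line, with outliers mostly blamed on protractors.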

## …and why somebody might try to convince you they are.

It’s even in the Core Standards. This is taken out of context—but not very far:

A model can be very simple, such as writing total cost as a product of unit price and number bought… (Common Core, p 72)

Seriously?

Okay, I could make a case for it, but I won’t.

I’m becoming more convinced that the real hallmark of modeling is simplification (see this post for more).

Modeling is not simply using math on real-world problems, though that is a Good Thing; you can model to help with pure math as well. And I bet we could find good real-world problems that don’t involve modeling.

But back to simplification. The key element (I believe this afternoon, anyway) is taking something and using math in a way that makes it simpler, less complicated than the thing itself. We model to make things tractable. We can handle the model even when the thing it represents is too complicated. If it’s a good model, it captures the essence of what we’re looking at; and exactly what that means may depend on the specific context.

• We might model a hexnut as a hexagonal prism with a cylindrical hole, and use that geometrical model to find a volume. We avoid the threads and the easing on the corners: they’re too complicated—but we hope our model captures the essence of the hexnut.
• We might model some messy data as a line or a curve. We can’t make a reasonable prediction from the mess, but we can with a function: just plug in a value and calculate.
• We might explore the behavior of a system of linked differential equations by creating a numerical model, a system of difference equations we can evaluate on a computer. It’s conceptually simpler (for the computer at least), so we sacrifice some precision for tractability.
• We might even take all the complexity of Americans and do a Census. When we do, we create a data model: the structure for the information we will collect. We have only approximately captured the people’s information (this is the Census, right, not the NSA). We hope our data has the essence that we need to know—but there is a huge amount of detail that we have ignored. Like the threads, like the deviations in the scatter plot, like the inaccuracies in the numerical model.
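The hexnut bullet above can be made concrete. A minimal sketch with made-up dimensions (the function name and the flat-to-flat parameterization are my own choices):

```python
import math

# Model a hexnut as a regular hexagonal prism minus a cylindrical
# hole, ignoring the threads and the eased corners.
def hexnut_volume(flat_to_flat, hole_diameter, thickness):
    side = flat_to_flat / math.sqrt(3)            # regular-hexagon side
    hex_area = 3 * math.sqrt(3) / 2 * side ** 2   # = (sqrt(3)/2) * f**2
    hole_area = math.pi * (hole_diameter / 2) ** 2
    return (hex_area - hole_area) * thickness

# Made-up nut: 1.0 cm across the flats, 0.5 cm hole, 0.4 cm thick.
print(round(hexnut_volume(1.0, 0.5, 0.4), 3))  # volume in cm**3
```

The threads would change this number slightly, which is precisely the detail the model chooses to ignore.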

What does this have to do with word problems?

Suppose we ask, if Eduardo buys four cans of orange juice for \$2.49 a can, how much does he pay altogether?

There is math here, no question. We can argue whether it’s real life.

But it doesn’t involve simplification. All of the information is present. There is no model, and no need for one.