The Index of Clumpiness, Part Three: One Dimension

In the last two posts, we talked about clumpiness in two-dimensional “star fields.”

  • In the first, we discussed the problem in general and used a measure of clumpiness created by taking the mean of the distances from the stars to their nearest neighbors. The smaller this number, the clumpier the field.
  • In the second, we divided the field up into bins (“cells”) and found the variance of the counts in the bins. The larger this number, the clumpier the field.

Both of these schemes worked, but the second seemed to work a little better, at least the way we had it set up.

We also saw that this was pretty complicated, and we didn’t even touch the details of how to compute these numbers. So this time we’ll look at a version of the same problem that’s easier to wrap our heads around, by reducing its dimension from 2 to 1. This is often a good strategy for making things more understandable.
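Here is a minimal sketch of how the two earlier measures might carry over to one dimension. The function names, the sample positions, and the choice of six bins are all made up for illustration; they are not taken from the posts. In one dimension a point's nearest neighbor has to be one of its two neighbors in sorted order, which is part of why this case is easier to think about.

```python
from statistics import mean, pvariance

def mean_nearest_gap(positions):
    """Mean distance from each point on a line to its nearest neighbor."""
    xs = sorted(positions)
    if len(xs) < 2:
        raise ValueError("need at least two points")
    nearest = []
    for i, x in enumerate(xs):
        # After sorting, the nearest neighbor is the point just before or just after.
        left = x - xs[i - 1] if i > 0 else float("inf")
        right = xs[i + 1] - x if i < len(xs) - 1 else float("inf")
        nearest.append(min(left, right))
    return mean(nearest)

def bin_count_variance(positions, length, n_bins):
    """Variance of the counts when a segment [0, length] is cut into equal bins."""
    counts = [0] * n_bins
    for x in positions:
        # A point sitting exactly at the far end goes in the last bin.
        index = min(int(x / length * n_bins), n_bins - 1)
        counts[index] += 1
    return pvariance(counts)

# Hypothetical positions along a 60-unit line.
even = [5, 15, 25, 35, 45, 55]
bunched = [1, 2, 3, 31, 32, 33]

print(mean_nearest_gap(even), mean_nearest_gap(bunched))
print(bin_count_variance(even, 60, 6), bin_count_variance(bunched, 60, 6))
```

With these made-up points, the bunched set has the smaller mean gap and the larger variance of counts, which matches the direction of both measures described above.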

Where do we see one-dimensional clumpiness? Here’s an example:

One day, a few years ago, I had some time to kill at George Bush Intercontinental, IAH, the big Houston airport. If you’ve been to big airports, you know that the geometry of how to fit airplanes next to buildings often creates vast, sprawling concourses. In one part of IAH (I think in Terminal C) there’s a long, wide corridor connecting the rest of the airport to a hub with a slew of gates. But this corridor, many yards long, had no gates, no restaurants, no shoe-shine stands, no rest rooms. It was just a corridor. But it did have seats along the side, so I sat down to rest and people-watch.


The Index of Clumpiness, Part Two

Last time, we discussed random and not-so-random star fields, and saw how we could use the mean of the minimum distances between stars as a measure of clumpiness. The smaller the mean minimum distance, the more clumpy.
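As a rough sketch of that measure, assuming stars are stored as (x, y) pairs: the function name and the two tiny star fields are invented for illustration, and the brute-force loop compares every pair, so it only suits small fields.

```python
import math

def mean_nearest_neighbor_distance(stars):
    """Mean, over all stars, of the distance from each star to its nearest neighbor."""
    if len(stars) < 2:
        raise ValueError("need at least two stars")
    total = 0.0
    for i, p in enumerate(stars):
        # Distance from star i to the closest of the other stars.
        total += min(math.dist(p, q) for j, q in enumerate(stars) if j != i)
    return total / len(stars)

# Two hypothetical four-star fields: one spread out, one with a tight clump.
spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
clumped = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]

print(mean_nearest_neighbor_distance(spread))
print(mean_nearest_neighbor_distance(clumped))
```

The clumped field gives the smaller number, as the description says it should.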

Star fields of different clumpiness, from K = 0.0 (no stars are in the clump; they’re all random) to K = 0.5 to K = 1.0 (all stars are in the big clump)

What other measures could we use?

It turns out that the Professionals have some. I bet there are a lot of them, but the one I dimly remembered from my undergraduate days was the “index of clumpiness,” made popular (at least among astronomy students) by Neyman (that Neyman), Scott, and Shane in the mid-50s. They were using Shane (& Wirtanen)’s catalog of galaxies to study how the galaxies cluster. We are simply asking, is there clustering? They went much further, and asked, how much clustering is there, and what are its characteristics?

They are the Big Dogs in this park, so we will take lessons from them. They began with a lovely idea: instead of looking at the galaxies (or stars) as individuals, divide up the sky into smaller regions, and count how many fall in each region.
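A minimal sketch of the counting idea, assuming the field is the unit square cut into a square grid of equal cells. The function name, the grid size, and the sample stars are hypothetical, and the variance of the counts is used here only as a simple summary; the index as Neyman, Scott, and Shane defined it may be computed differently.

```python
from collections import Counter
from statistics import pvariance

def cell_counts(stars, cells_per_side):
    """Count the stars in each cell of a square grid laid over the unit square."""
    counts = Counter()
    for x, y in stars:
        # A star exactly on the top or right edge goes in the last row or column.
        col = min(int(x * cells_per_side), cells_per_side - 1)
        row = min(int(y * cells_per_side), cells_per_side - 1)
        counts[(row, col)] += 1
    # Include the empty cells too: they matter for the variance.
    return [counts[(r, c)] for r in range(cells_per_side) for c in range(cells_per_side)]

# Two hypothetical four-star fields on a 2 x 2 grid.
spread = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
clumped = [(0.1, 0.1), (0.2, 0.1), (0.1, 0.2), (0.2, 0.2)]

print(cell_counts(spread, 2), pvariance(cell_counts(spread, 2)))
print(cell_counts(clumped, 2), pvariance(cell_counts(clumped, 2)))
```

One star per cell gives a variance of zero; piling all four stars into one cell drives the variance up.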


What’s Modeling Good For?

What’s the purpose of mathematical modeling? The easy answer is something like, to understand the real world. When I look more deeply, however, I see distinct reasons to model—and to model in the classroom. I hope that trying to define these will help me clarify my thinking and shed light on some of the worries I have about how modeling might be portrayed.

(So this is the third in a series on modeling. We began with some definitions, then proceeded to look at “genres” of modeling.)

Let’s look at a few purposes and try to distinguish them. To save the casual reader time, I’ll talk about prediction, finding parameter values, and finding insight. I think the last is the most subtle and the one most likely to be missed or misused by future developers.

Maybe I’ll post more about each of these in detail later, but for now I’ll move quickly and not give extended examples.


Reflection on Modeling

Capybara. The world’s largest rodent.

I’m writing a paper for a book, and just finished a section whose draft is worth posting. For what it’s worth: the book publisher (Springer) will own the copyright, I’m posting this here as fair use, and besides, it will get edited.

Here we go:

Modeling activities exist along a continuum of abstraction. This is important because we can choose a level of abstraction appropriate to the students we’re targeting; presumably, a sequence of activities can bring students along that continuum towards abstraction if that is our goal.

As an example, consider this problem:

What are the dimensions of the Queen’s two pet pens?
The Queen wants you to use a total of 100 meters of fence to build a Circular pen for her pet Capybara and a Square pen for her pet Sloth. Because she prizes her pets, she wants the pet pens paved in platinum. Because she is a prudent queen, she wants you to minimize the total area.

Let’s look at approaches to this problem at several stops along this continuum:

a. Each pair of students gets 100 centimeters of string. They cut the string in an arbitrary place, form one piece into a circle and the other into a square, measure the dimensions of the figures, calculate the areas, and glue or tape the shapes to pieces of paper. The class makes a display of these shapes and their areas, organizes them (perhaps by the sizes of the squares), and draws a conclusion about the approximate dimensions of the minimum-area enclosures.

b. Same as above, but we plot them on a graph. A sketch of the curve through the points helps us figure out the dimensions and the minimum area.

Using Fathom to analyze area data. Sliders control (and display) parameter values. I have suppressed the residual plot, which is essential for getting a good fit.

c. This time we enter the data into dynamic data software, guess that the points fit a parabola, and enter a quadratic in vertex form, adjusting its parameters to fit the data. We see that two of these parameters are the side of the square and the minimum area.

d. Instead of making the shapes with string, we draw them on paper. Any of the three previous schemes applies here, and an individual or a small group can more easily make several different sets of enclosures. Here, however, the students need to ensure that the total perimeter is constant, because the string no longer enforces the constraint. Note that we are still using specific dimensions.

e. We use dynamic geometry software to enforce the constraint; we drag a point along a segment to indicate where to divide the fence. We instruct the software to draw the enclosures and calculate the area. (In 2014, Dan Meyer did a number on a related problem and made two terrific dynamic geometry widgets, Act One and Act Two.)

f. We make a diagram, but use a variable for the length of a side. Using that, we write expressions for the areas of the figures and plot their sum as a function of the side length. We read the minimum off the graph.

g. As above, but we use algebraic techniques (including completing the square) to convert the expression to vertex form, from which we read the exact solutions. In this version, we might not even have plotted the function.

h. As above, but we avoid some messy algebra by using calculus.
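For reference, here is one way the symbolic versions (f through h) might work out. The symbols are assumptions for this sketch: s is the side of the square in meters, r is the radius of the circle, and A(s) is the total area.

```latex
\begin{align*}
4s + 2\pi r &= 100 \quad\Rightarrow\quad r = \frac{50 - 2s}{\pi}, \qquad 0 \le s \le 25 \\
A(s) &= s^2 + \pi r^2 = s^2 + \frac{(50 - 2s)^2}{\pi} \\
     &= \frac{\pi + 4}{\pi}\, s^2 - \frac{200}{\pi}\, s + \frac{2500}{\pi} \\
     &= \frac{\pi + 4}{\pi} \left( s - \frac{100}{\pi + 4} \right)^2 + \frac{2500}{\pi + 4} \\
A'(s) &= 2s - \frac{4(50 - 2s)}{\pi} = 0 \quad\Rightarrow\quad s = \frac{100}{\pi + 4}
\end{align*}
```

Both routes give a square of side 100/(π + 4), about 14.0 meters, and a minimum total area of 2500/(π + 4), about 350 square meters. The circle's radius comes out to 50/(π + 4), so its diameter equals the side of the square.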

Now let’s comment on these different versions.
