The Index of Clumpiness, Part Three: One Dimension

In the last two posts, we talked about clumpiness in two-dimensional “star fields.”

  • In the first, we discussed the problem in general and used a measure of clumpiness created by taking the mean of the distances from the stars to their nearest neighbors. The smaller this number, the clumpier the field.
  • In the second, we divided the field up into bins (“cells”) and found the variance of the counts in the bins. The larger this number, the clumpier the field.

Both of these schemes worked, but the second seemed to work a little better, at least the way we had it set up.

We also saw that this was pretty complicated, and we didn’t even touch the details of how to compute these numbers. So this time we’ll look at a version of the same problem that’s easier to wrap our heads around, by reducing its dimension from 2 to 1. This is often a good strategy for making things more understandable.

Where do we see one-dimensional clumpiness? Here’s an example:

One day, a few years ago, I had some time to kill at George Bush Intercontinental, IAH, the big Houston airport. If you’ve been to big airports, you know that the geometry of how to fit airplanes next to buildings often creates vast, sprawling concourses. In one part of IAH (I think in Terminal C) there’s a long, wide corridor connecting the rest of the airport to a hub with a slew of gates. But this corridor, many yards long, had no gates, no restaurants, no shoe-shine stands, no rest rooms. It was just a corridor. But it did have seats along the side, so I sat down to rest and people-watch.


The Index of Clumpiness, Part Two

Last time, we discussed random and not-so-random star fields, and saw how we could use the mean of the minimum distances between stars as a measure of clumpiness. The smaller the mean minimum distance, the more clumpy.
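For the curious, here is one way the mean minimum distance could be computed. This is only a minimal sketch, not necessarily how the numbers in these posts were produced: the function name is made up for illustration, and it uses a brute-force comparison of every pair of stars, which is fine for a thousand points but slow for many more.

```python
import math

def mean_nearest_neighbor_distance(points):
    """Average, over all points, of the distance to each point's nearest neighbor.

    `points` is a list of (x, y) tuples. Hypothetical helper, brute force O(n^2).
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, p in enumerate(points):
        # Distance from p to the closest *other* point.
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)

# Two tight pairs far apart: every star's nearest neighbor is 1 unit away.
print(mean_nearest_neighbor_distance([(0, 0), (0, 1), (5, 5), (5, 6)]))  # 1.0
```

A smaller result means the stars sit closer to their neighbors, which is what we have been calling clumpier.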

[Figure: star fields of different clumpiness, from K = 0.0 (no stars are in the clump; they’re all random) to K = 0.5 to K = 1.0 (all stars are in the big clump).]

What other measures could we use?

It turns out that the Professionals have some. I bet there are a lot of them, but the one I dimly remembered from my undergraduate days was the “index of clumpiness,” made popular (at least among astronomy students) by Neyman (that Neyman), Scott, and Shane in the mid-1950s. They were using Shane and Wirtanen’s catalog of galaxies to study the galaxies’ clustering. We are simply asking: is there clustering? They went much further, and asked: how much clustering is there, and what are its characteristics?

They are the Big Dogs in this park, so we will take lessons from them. They began with a lovely idea: instead of looking at the galaxies (or stars) as individuals, divide up the sky into smaller regions, and count how many fall in each region.
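Here is a rough sketch of the divide-and-count idea, followed by the variance of the counts. The function name, the 10-by-10 grid, and the 300-unit square field are assumptions chosen for illustration; this is the plain variance of the cell counts, not a claim about the exact formula Neyman, Scott, and Shane used.

```python
def cell_count_variance(points, size=300, cells_per_side=10):
    """Split a size-by-size square into a grid and return the variance of the
    number of points per cell. Hypothetical helper for illustration."""
    counts = [0] * (cells_per_side * cells_per_side)
    cell_width = size / cells_per_side
    for x, y in points:
        # Clamp so a point exactly on the far edge lands in the last cell.
        col = min(int(x / cell_width), cells_per_side - 1)
        row = min(int(y / cell_width), cells_per_side - 1)
        counts[row * cells_per_side + col] += 1
    mean = len(points) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)

# One star per cell of a 2-by-2 grid: perfectly even, variance 0.
even = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
# All four stars in the same cell: as clumpy as it gets, variance 3.
clumped = [(0.5, 0.5)] * 4
print(cell_count_variance(even, size=2, cells_per_side=2))     # 0.0
print(cell_count_variance(clumped, size=2, cells_per_side=2))  # 3.0
```

The result depends on how big the cells are, so comparisons only make sense between fields measured with the same grid.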


The Index of Clumpiness, Part One

[Figure: 1000 points. All random. The colors indicate how close the nearest neighbor is.]

There really is such a thing. Some background: The illustration shows a random collection of 1000 dots. Each coordinate (x and y) is a (pseudo-)random number in the range [0, 1) — multiplied by 300 to get a reasonable number of pixels.
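If you want to make a field like this yourself, a minimal sketch might look like the following. The function name and the `seed` parameter are my own additions for illustration; the only things taken from the description above are the 1000 points, the [0, 1) range, and the factor of 300.

```python
import random

def random_star_field(n=1000, size=300, seed=None):
    """Return n (x, y) points, each coordinate uniform in [0, size)."""
    rng = random.Random(seed)  # separate generator so a field can be reproduced
    return [(rng.random() * size, rng.random() * size) for _ in range(n)]

stars = random_star_field(seed=1)
print(len(stars))  # 1000
```

Passing the same `seed` gives the same field again, which is handy when comparing different clumpiness measures on identical data.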

The point is that we can all see patterns in it. Me, I see curves and channels and little clumps. If they were stars, I’d think the clumps were star clusters, gravitationally bound to each other.

But they’re not. They’re random. The patterns we see are self-deception. This is related to an activity many stats teachers have used, in which the students are to secretly record a set of 100 coin flips, in order, and also make up a set of 100 random coin flips. The teacher returns to the room and can instantly tell which is the real one and which is the fake. It’s a nice trick, but easy: students usually make the coin flips too uniform. There aren’t enough streaks. Real randomness tends to have things that look non-random.
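One simple way to check for "not enough streaks" is to find the longest run of identical flips. The sketch below is just one possible check, with a made-up function name, and I am not claiming it is what any particular teacher does.

```python
import random
from itertools import groupby

def longest_streak(flips):
    """Length of the longest run of identical consecutive outcomes (0 if empty)."""
    return max((sum(1 for _ in run) for _, run in groupby(flips)), default=0)

rng = random.Random(2024)
real = [rng.choice("HT") for _ in range(100)]  # simulated coin flips
too_even = list("HT" * 50)                     # an extreme "made-up" sequence

print(longest_streak(real))      # varies with the seed
print(longest_streak(too_even))  # 1
```

A made-up sequence is rarely as extreme as strict alternation, but its longest streak tends to be shorter than what real flips produce.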

Here is a snap from a classroom activity:

Problem Archetypes

I bet somebody has written a book about this, but I’m unaware of it, so here goes. Stop me if you know, and put me out of my misery.

Jason Buell just posted about how interesting it is when we (or students) don’t go to the question we expect in a given situation, and how important it is for us to break set. For example, when you have nine Supreme Court justices and they start shaking hands, every math teacher in the room knows to ask, “how many handshakes altogether?”

It’s vital that we learn to ask other questions. But this post is not about that.

Rather, let us observe that the “handshake problem” is an example of what I’m gonna call a problem archetype. It’s part of our mathematical maturity (I claim) that we have a fistful of these that we can bring out and use; and we do, because they’re useful. An archetype may be useful because other problems share its mathematical structure, because it illustrates an important principle, or for some other reason I haven’t thought of.
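For the record, the mathematical structure behind the handshake problem is counting pairs: n people make n(n − 1)/2 handshakes if everyone shakes with everyone else once. A small sketch (the function name is mine) that checks the formula against a brute-force list of pairs:

```python
from itertools import combinations

def handshakes(n):
    """Number of handshakes when each of n people shakes hands with every other once."""
    return n * (n - 1) // 2

justices = range(9)
pairs = list(combinations(justices, 2))  # every unordered pair of justices
print(handshakes(9), len(pairs))  # 36 36
```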

In any case, it’s part of the shared culture. We refer to it in shorthand in order to communicate with one another or to remind ourselves. It often has a name, as in, “the handshake problem,” or, to name another, “the Monty Hall problem.” (I happen to dislike the Monty Hall problem for the classroom, but I still think it’s archetypal.)

So:

  • What are these? Can we start a list?
  • What role do they actually play in problem-solving?
  • Are they, ultimately, a positive influence? Or do they shackle us?

Just to get things started, here are some other archetypes:

  • Boat in a river. Is this actually an archetypal problem, or just a common situation in problems in Algebra texts? Does that matter? We all recognize “boat-in-a-river” problems as a particular genre.
  • Seven bridges of Königsberg.
    [Figure: a view of the city with the seven bridges marked.] When I first saw a map of the city, I was astonished at the shapes of the rivers. Of course, topology is topology, but still!

  • That problem where you cut two squares out of a chessboard, from opposite corners, and then try to cover the board with dominoes.
  • Speaking of chessboards, the one where you get one grain of wheat for the first square, and then double every time.
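The two chessboard archetypes are easy to poke at with a few lines of code. This is only an illustrative sketch with variable names of my own choosing: the first part totals the wheat, and the second counts the colors left after removing opposite corners, which is the heart of the usual argument that the dominoes cannot work (each domino covers one square of each color).

```python
# Wheat: 1 grain on the first square, doubling on each of the 64 squares.
total_grains = sum(2 ** (square - 1) for square in range(1, 65))
print(total_grains == 2 ** 64 - 1)  # True

# Dominoes: remove two opposite corners and count the squares of each color.
squares = {(r, c) for r in range(8) for c in range(8)} - {(0, 0), (7, 7)}
same_as_corners = sum((r + c) % 2 == 0 for r, c in squares)
other_color = len(squares) - same_as_corners
print(same_as_corners, other_color)  # 30 32
```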

You get the idea.