Last time, we discussed random and not-so-random star fields, and saw how we could use the mean of the minimum distances between stars as a measure of clumpiness. The smaller the mean minimum distance, the more clumpy.
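For anyone who wants to try that measure, here is a minimal sketch in Python. It assumes stars are plain (x, y) pairs in a flat field; the function name `mean_min_distance` and the two made-up star fields are mine, for illustration only, not anything from the earlier post.

```python
import math
import random

def mean_min_distance(points):
    """Average, over all stars, of the distance to the nearest other star."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        nearest = min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
        total += nearest
    return total / len(points)

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical "random" field: 200 stars scattered uniformly in a unit square.
uniform = [(rng.random(), rng.random()) for _ in range(200)]

# Hypothetical "clumpy" field: 200 stars huddled around 5 cluster centers.
centers = [(rng.random(), rng.random()) for _ in range(5)]
clumpy = [
    (cx + rng.gauss(0, 0.02), cy + rng.gauss(0, 0.02))
    for cx, cy in centers
    for _ in range(40)
]

print(mean_min_distance(uniform), mean_min_distance(clumpy))
```

The clumpy field should come out with the smaller number, which is the whole point of the measure. The brute-force loop compares every pair of stars, which is fine for a few hundred stars but slow for many thousands.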
What other measures could we use?
It turns out that the Professionals have some. I bet there are a lot of them, but the one I dimly remembered from my undergraduate days was the “index of clumpiness,” made popular, at least among astronomy students, by Neyman (that Neyman), Scott, and Shane in the mid-1950s. They were studying the clustering of galaxies in Shane (& Wirtanen)’s catalog. We are simply asking, is there clustering? They went much further, and asked, how much clustering is there, and what are its characteristics?
They are the Big Dogs in this park, so we will take lessons from them. They began with a lovely idea: instead of looking at the galaxies (or stars) as individuals, divide up the sky into smaller regions, and count how many fall in each region.
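The divide-and-count step might look something like this in Python. It is only a sketch, and it assumes the stars live in a unit square that we chop into a square grid; the name `count_by_cell` and the grid size are my own choices, not anything from Neyman, Scott, and Shane.

```python
def count_by_cell(points, cells_per_side):
    """Count how many (x, y) points fall in each cell of a square grid.

    Assumes every coordinate lies in [0, 1]. Returns a list of rows,
    where counts[row][col] is the number of points in that cell.
    """
    counts = [[0] * cells_per_side for _ in range(cells_per_side)]
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 lands in the last cell.
        col = min(int(x * cells_per_side), cells_per_side - 1)
        row = min(int(y * cells_per_side), cells_per_side - 1)
        counts[row][col] += 1
    return counts

stars = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.1), (1.0, 1.0)]
for row in count_by_cell(stars, 2):
    print(row)
```

With counts in hand, the question shifts from "how far apart are the stars?" to "how uneven are the counts from cell to cell?"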
But Bob Hayden has recently pointed out that the bootstrap is not particularly good, especially with small samples. And Real Stats People are generally more suspicious of the bootstrap than they are of randomization (or permutation) tests.
It’s such a joy when my daughter asks for help with math. It used to happen all the time; it’s rare now. She just started medical school, and had come home for the weekend to get a quiet space for concentrated study.
“Dad, I have a statistics question.” Be still, my heart!
“It’s asking, if you have a random mRNA sequence with 2000 base pairs, how many times do you expect the stop codon AUG to appear? How do you figure that out?”
I got her to explain enough about messenger RNA so that I could picture this random sequence of 2000 characters, each one A, U, G, or C, and remembered from somewhere that a codon was a chunk of three of these.
“I think it’s more of a probability or combinatorics question than stats…” I said. (I was wrong about that; interval estimates come up later. Read on.)
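For the impatient, here is one way to get a number, sketched in Python under assumptions that the question does not actually state: every base is equally likely (probability 1/4 each) and independent of its neighbors. Then any particular three-letter chunk matches AUG with probability (1/4)^3 = 1/64, and by linearity of expectation the expected count is just the number of places you look times 1/64. The function names and the `in_frame` switch are hypothetical; the switch is there because "how many times" could mean every starting position or only the codons in one reading frame.

```python
import random

def expected_count(length, codon="AUG", in_frame=False):
    """Expected number of appearances of codon in a uniform random sequence."""
    k = len(codon)
    # Either every possible starting position, or only whole codons in one frame.
    positions = length // k if in_frame else length - k + 1
    return positions * 0.25 ** k

def simulated_mean(length, codon="AUG", trials=500, seed=0):
    """Average count over many random sequences, checking every position."""
    rng = random.Random(seed)
    k = len(codon)
    total = 0
    for _ in range(trials):
        seq = "".join(rng.choices("AUGC", k=length))
        total += sum(seq.startswith(codon, i) for i in range(length - k + 1))
    return total / trials

print(expected_count(2000))                 # every position: 1998 / 64
print(expected_count(2000, in_frame=True))  # one reading frame: 666 / 64
print(simulated_mean(2000))                 # should land near the first number
```

So about 31 if you count every position, or about 10 if you only count codons in a single reading frame. The simulation will not hit the formula exactly, and how far off it tends to be is where the stats sneak back in.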