Last time I described an idea about how to use matrices to study simple weather models. Really simple weather models; in fact, the model we used was a two-state Markov system. And like all good simple models, it was interesting enough and at the same time inaccurate enough to give us some meat to chew on.
I used it as one session in a teacher institute I just helped present (October 2019), where “matrices” was the topic we were given for the five-day, 40-contact-hour event. Neither my (excellent!) co-presenter Paola Castillo nor I would normally have subjected teachers to that amount of time, and we would never have spent that much time on that topic. But we were at the mercy of people at a higher pay grade, and the teachers, whom we adore, were great and gamely stuck with us.
One purpose I had in doing this session was to show a cool use for matrices that had nothing to do with solving systems of linear equations (which is the main use matrices get in the teachers’ textbook). A few observations from the session:
- Just running the model and recording data was fun and very important. Teachers were unfamiliar with the underlying idea, and although a few immediately “got it,” others needed time just to experience it.
- Making the connection between the randomness in the Markov model and thinking about natural frequencies did not appear to cause any problem. I suspect that this was not an indication of understanding, but rather a symptom of their not having had enough time with it to realize that they had a right to be confused.
- The diagram of the model was confusing.
Let’s take the last bullet first. The model looked like this:
Of course I also explained it in words: “We will all start on a sunny day. To find the next day’s weather, roll one die; if you get a 1–5, tomorrow will be sunny. If you get a 6, it will rain. If it’s raining, you look over here to find the next day’s weather. Notice how it’s different…” and so forth.
But the numbers on the arrows flummoxed some teachers. One improvement might be to put the numbers closer to the circles that represent the states, like this:
Another idea we came up with was to make spinners, one for each state:
When we hit on the spinners, the teachers’ faces got calmer, and they clearly liked the idea for their students. But they never used them; they kept using dice. It may be that just thinking about the spinners clarified what the dice were doing. After all, the dice are faster—once you know what’s going on. So the teachers were very happy simulating a month—30 days of weather—using dice.
The pooled data from these simulations framed the central question: what’s the overall probability of a sunny day in this model? The data themselves gave an empirical answer to the question, but we pushed them to think about a theoretical answer.
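For readers who would rather let a computer roll the dice, here is a minimal sketch of the pooled simulation. The sunny-day rule (1 to 5 stays sunny, 6 brings rain) is the one described above. The rainy-day rule is not spelled out in this post, so the sketch assumes a roll of 1 to 4 gives sun and 5 or 6 gives rain, which is the rule consistent with the long-run value of 4/5 quoted later. The function names, the seed, and the count of 1000 simulated months are all illustrative choices.

```python
import random

def next_day(today, rng):
    """Roll one die and return tomorrow's weather: 'S' (sunny) or 'R' (rainy)."""
    roll = rng.randint(1, 6)
    if today == "S":
        return "S" if roll <= 5 else "R"   # rule from the post: a 6 means rain
    return "S" if roll <= 4 else "R"       # ASSUMED rainy-day rule: 5 or 6 means rain

def simulate_month(rng, days=30, start="S"):
    """Simulate one month; the sunny starting day counts as day 1."""
    weather = [start]
    for _ in range(days - 1):
        weather.append(next_day(weather[-1], rng))
    return weather

rng = random.Random(2019)                  # fixed seed so the run is repeatable
months = [simulate_month(rng) for _ in range(1000)]
sunny_fraction = sum(m.count("S") for m in months) / (1000 * 30)
print(round(sunny_fraction, 3))            # empirical share of sunny days
```

Under these assumptions the printed fraction should land close to 0.8, a little above it because every month starts sunny.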
This proved difficult. Their background in probability was weaker than we expected. I had hoped that we could lead them to try a tree diagram, and quickly find out why that was going to be problematic. But the idea of a tree diagram was brand new to some of these teachers.
Actually talking through the natural-frequency approach seemed to help, but of course we can’t be sure how deep their understanding was.
What do we mean by a “natural-frequency approach”? Suppose we had six different people running this model; after one day, what do you expect? (The six “tomorrows” will be five sunny days and one rainy day; to look two days into the future, start with 36 people, as described in that previous post.) This approach makes sense, although the astute reader will realize that it suggests random events are more systematic than they really are.
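The 36-person version can be checked with exact fractions. This is only a sketch: the 5/6 comes from the dice rule in the post, while the 2/3 chance of sun after a rainy day is an assumption, chosen because it matches the long-run 4/5 reported later. The function name `step` is hypothetical.

```python
from fractions import Fraction

P_SUN_GIVEN_SUN = Fraction(5, 6)    # from the post: a roll of 1-5 stays sunny
P_SUN_GIVEN_RAIN = Fraction(2, 3)   # ASSUMPTION: a roll of 1-4 after a rainy day

def step(sunny, rainy):
    """Expected counts of sunny and rainy simulations one day later."""
    new_sunny = sunny * P_SUN_GIVEN_SUN + rainy * P_SUN_GIVEN_RAIN
    new_rainy = sunny * (1 - P_SUN_GIVEN_SUN) + rainy * (1 - P_SUN_GIVEN_RAIN)
    return new_sunny, new_rainy

day1 = step(Fraction(36), Fraction(0))   # all 36 people start sunny
day2 = step(*day1)
print([int(n) for n in day1])  # [30, 6]
print([int(n) for n in day2])  # [29, 7]
```

Starting with 36 people keeps every count a whole number for two days, which is exactly why the natural-frequency story is easy to tell.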
Getting to Matrices
That approach turned out to be a good way to introduce matrices. Let’s represent the 6 people, currently having sunny weather, as a vector with the number of sunny days on the top and the number of rainy on the bottom:
Now we want a matrix that “evolves” this situation to the next day. The next day will be [5, 1] as we just discussed. So we want a matrix that will be like this:
You can kind of do this one spot at a time by inspection, or use variables, e.g.,
That gives us the first column; by starting with 6 rainy days, we can figure out the second column:
Which gives us the matrix we used in that previous post.
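For readers following along without the figures, the derivation can be written out roughly as follows. The first column comes straight from the dice rule above. The second column assumes that six rainy days become four sunny and two rainy, which is not stated in this post but is the rule consistent with the long-run value of 4/5 found later. The entry names a, b, c, d are placeholders.

```latex
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} 6 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 6a \\ 6c \end{bmatrix}
=
\begin{bmatrix} 5 \\ 1 \end{bmatrix}
\quad\Longrightarrow\quad
a = \tfrac{5}{6},\; c = \tfrac{1}{6}

% Assumed rainy-day rule: 6 rainy days -> 4 sunny, 2 rainy
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} 0 \\ 6 \end{bmatrix}
=
\begin{bmatrix} 6b \\ 6d \end{bmatrix}
=
\begin{bmatrix} 4 \\ 2 \end{bmatrix}
\quad\Longrightarrow\quad
b = \tfrac{2}{3},\; d = \tfrac{1}{3}

A = \begin{bmatrix} \tfrac{5}{6} & \tfrac{2}{3} \\[2pt] \tfrac{1}{6} & \tfrac{1}{3} \end{bmatrix}
```

Each column sums to 1, as it should: whatever today's weather is, tomorrow has to be either sunny or rainy.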
Once you enter that matrix in the Desmos matrix calculator (currently in beta), you can really go to town. We can use the matrix (and powers of the matrix) to evolve any collection of weather into the future any number of days. And we find that no matter where you start, the result quickly converges to the same values.
Like, suppose we have 100 simulations; what is the expected number of sunny days after various numbers of days? If we start with all sunny days, the matrices look like this:
Calculation results are like this:
and we quickly converge to:
which is really cool. Apparently the probability of a sunny day, in the long run, is 4/5.
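The same evolution can be sketched in a few lines of plain Python instead of the Desmos calculator. The left column of the matrix follows from the dice rule in the post; the right column (2/3 and 1/3) is an assumption, chosen because it reproduces the 4/5 above. The helper name `apply_matrix` is hypothetical.

```python
from fractions import Fraction

# Columns are "from sunny" and "from rainy"; rows are "to sunny" and "to rainy".
A = [[Fraction(5, 6), Fraction(2, 3)],   # right-hand column is an ASSUMPTION
     [Fraction(1, 6), Fraction(1, 3)]]

def apply_matrix(matrix, vector):
    """Multiply a matrix by a column vector, both given as plain lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

x = [Fraction(100), Fraction(0)]         # 100 simulations, all sunny on day 0
history = [x]
for day in range(1, 7):
    x = apply_matrix(A, x)
    history.append(x)
    print(day, round(float(x[0]), 3), round(float(x[1]), 3))
```

With these entries the sunny count is within a thousandth of 80 by day 6. The limit can also be checked by hand: a steady state must satisfy S = (5/6)S + (2/3)(100 − S), which gives S = 80.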
We can even record the number of sunny days out of 100 simulators after k days, depending on how many sunny days (out of 100) we start with:
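A table of that kind can be generated with a short loop. This is a sketch under the same assumed transition probabilities (5/6 for sunny after sunny, from the post; 2/3 for sunny after rainy, an assumption consistent with the 4/5 limit). The function name and the particular starting values are illustrative.

```python
def sunny_after(start_sunny, days, total=100):
    """Expected number of sunny simulations after `days` steps."""
    sunny = float(start_sunny)
    for _ in range(days):
        rainy = total - sunny
        # 5/6 is from the post; 2/3 is the ASSUMED chance of sun after rain
        sunny = sunny * 5 / 6 + rainy * 2 / 3
    return sunny

starts = [0, 25, 50, 75, 100]            # sunny simulations (out of 100) on day 0
print("k  " + "  ".join(f"{s:>7}" for s in starts))
for k in range(6):
    print(f"{k:<3}" + "  ".join(f"{sunny_after(s, k):7.2f}" for s in starts))
```

Every column heads toward 80 no matter where it starts, and the gap shrinks by a factor of six each day under these assumptions.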
This is all fun, but what’s the point?
Part of it is a big exercise in combining conditional probabilities—probabilities that arise from the situation, from the model that we have decided to analyze.
After all, how many sunny tomorrows do we expect? Take a deep breath:
It’s the number of simulations having sunny days right now times the probability that it will be sunny tomorrow given that it is sunny today, plus the number of rainy simulations times the probability that it will be sunny tomorrow given that it is rainy today. In equationspeak,
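Using S and R for the numbers of sunny and rainy simulations today, the sentence above and its rainy counterpart can be written like this. The primes for "tomorrow" and the subscripts inside the probabilities are notation chosen for this sketch, not necessarily the notation used in the session.

```latex
\begin{aligned}
S' &= S \cdot P(S_{\text{tomorrow}} \mid S_{\text{today}}) + R \cdot P(S_{\text{tomorrow}} \mid R_{\text{today}}) \\
R' &= S \cdot P(R_{\text{tomorrow}} \mid S_{\text{today}}) + R \cdot P(R_{\text{tomorrow}} \mid R_{\text{today}})
\end{aligned}
```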
In the big picture, we have taken all that verbiage and condensed it into equations. Those particular equations, constructed for this situation, also happen to exemplify one of the ways you have to think about combining probabilities.
But then, after pondering them a long time and making sure we understand what the heck is going on, we look at the equations and notice some things. Those probabilities? We have every combination of rainy or sunny given rainy or sunny. The coefficients are attached to the probabilities systematically. And the formulas for rainy and sunny are really the same except for substituting some letters in the right places. That is, the calculations in the equations are repeated, systematic calculations, combined in a particularly useful—and common—way.
So we invent a way to shorten them further:
Where X is the vector [S, R] of sunny and rainy days, and A is the matrix of transition probabilities in our Markov model.
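Spelled out, one way to write that shorthand is shown below. The prime for "the next day" and the layout of A (columns for today's weather, rows for tomorrow's) are conventions assumed for this sketch.

```latex
X' = A X,
\qquad
X = \begin{bmatrix} S \\ R \end{bmatrix},
\qquad
A = \begin{bmatrix}
P(S \mid S) & P(S \mid R) \\
P(R \mid S) & P(R \mid R)
\end{bmatrix}

% Looking k days ahead is just repeated multiplication:
X_k = A^k X_0
```

Multiplying out the first row of AX gives back the long sentence about sunny tomorrows, and the second row gives its rainy twin.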
And that I think is a vital reason to study matrices (or, more generally, linear algebra) in this era: the operations encapsulate repeated, systematic calculation. It turns out that they have other beautiful properties as well, and can help you think about mathematics more deeply. (We also did a couple of sessions about transformations in the plane, and symmetry groups, which also invoke repeated, systematic calculation made more elegant through matrices.) But the fact that we can write all that logic so concisely is immensely powerful, and leads immediately to powerful technology and data-analytic thinking.
It also explains, I think, why the teachers were, in general, so flummoxed. It’s powerful and deep, so it takes time to wrap your mind around it.
And that brings up the curricular question. What do high-school students need to know, if anything, about matrices? I think it would be a mistake to limit them (as our teachers’ texts do) to solving linear systems. And I think it would be a shame to waste too much time finding inverses by hand. If it were up to me, I’d do stuff like this, plus the transformations and symmetry. Adjacency. Search rankings in a microscopic internet. In those contexts, you get powers, inverses, closed systems, identity, all the good stuff.
Or I might let it wait ’til college. But it is really cool.
A few teachers, the most enchanted, spent some additional time with me creating a three-state weather system: sunny, cloudy, and rainy. We designed a weather model where to get from sun to rain, you have cloudy days in between. They figured out the 3×3 transition matrix, and we used technology to project it into the future until it converged. It was very cool, and really fun just to run the model with our dice.
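For a sense of what that looks like, here is a sketch with a made-up three-state model. The numbers are purely hypothetical and are not the teachers' matrix; the only feature taken from the description above is that sunny cannot jump straight to rainy. The probabilities are in sixths so the model could still be run with one die.

```python
STATES = ["sunny", "cloudy", "rainy"]

# HYPOTHETICAL transition matrix: columns are today's state, rows are tomorrow's.
A3 = [
    [4 / 6, 2 / 6, 0 / 6],   # to sunny
    [2 / 6, 2 / 6, 3 / 6],   # to cloudy
    [0 / 6, 2 / 6, 3 / 6],   # to rainy (zero chance directly from sunny)
]

def evolve(matrix, vector, days):
    """Apply the transition matrix to the state vector `days` times."""
    for _ in range(days):
        vector = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
    return vector

result = evolve(A3, [100.0, 0.0, 0.0], 40)   # 100 simulations, all sunny at first
for name, count in zip(STATES, result):
    print(f"{name:<7}{count:6.2f}")
```

For this invented matrix the counts settle at 37.5 sunny, 37.5 cloudy, and 25 rainy, which can be confirmed by checking that this vector is unchanged by one more multiplication.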