Data Moves and Simplification

or, What I should have emphasized more at NCTM

I’m just back from NCTM 2018 in Washington DC where I gave a brief workshop that introduced ideas in data science education and the use of CODAP to a very nice group in a room that—well, NCTM and the Marriott Marquis were doing their best, but we really need a different way of doing technology at these big conferences.

Anyway: at the end of a fairly wide-ranging presentation in which my main goal was for participants to get their hands dirty—get into the data, get a feel for the tools, have data science on their radar—it was inevitable that I would feel:

  • that I talked too much; and
  • that there were important things I should have said.

Sigh. Let’s address the latter. Here is a take-away I wish I had set up better:

In data science, things are often too complicated. So one step is to simplify things; and some data moves, by their nature, simplify.

Complication is related to being awash in data (see this post…); it can come from the sheer quantity of data, or from the data being multivariate or otherwise containing a lot of stuff we’re not interested in right now.

To cut through that complication, we often filter or summarize, and to do those, we often group. To give some examples, I will look again at the data that appeared in the cards metaphor post, but with a different slant.

Here we go: NHANES data on height, age, and sex. At the end of the process, we will see this graph:

nhanes 800 means
Mean height by sex and age; 800 cases aged 5–19. NHANES, 2003.

And the graph tells a compelling story: boys and girls are roughly the same height—OK, girls are a little taller at ages 10–12—but starting at about age 13, girls’ heights level off, while the boys continue growing for about two more years.

We arrived at this after a bunch of analysis. But how did we start?
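The three data moves above—filter, group, summarize—can be sketched in a few lines of Python. (This uses a tiny made-up stand-in for the NHANES extract; the variable names and sample values are mine, not CODAP’s or NHANES’s.)

```python
from collections import defaultdict
from statistics import mean

# Synthetic stand-in for the NHANES extract: (age, sex, height_cm)
cases = [
    (10, "F", 140.0), (10, "M", 138.5), (13, "F", 157.0),
    (13, "M", 156.0), (16, "F", 162.5), (16, "M", 173.0),
    (16, "M", 175.5), (13, "F", 159.0),
]

# Filter: keep only cases aged 5-19
kept = [c for c in cases if 5 <= c[0] <= 19]

# Group by (sex, age)...
groups = defaultdict(list)
for age, sex, height in kept:
    groups[(sex, age)].append(height)

# ...then summarize each group with a mean, which is what the graph plots
means = {key: mean(heights) for key, heights in groups.items()}
for (sex, age), m in sorted(means.items()):
    print(f"{sex} age {age}: mean height {m:.1f} cm")
```

The point isn’t the code, of course—CODAP does all this with drags and drops—but that each simplifying move is a distinct, nameable step.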

Continue reading Data Moves and Simplification

Model Shop! One volume done!

The Model Shop, Volume 1

Hooray, I have finally finished what used to be called EGADs and is now the first volume of The Model Shop. Calling it the first volume is, of course, a treacherous decision.

So. This is a book of 42 activities that connect geometry to functions through data. There are a lot of different ways to describe it, and in the course of finishing the book, the emotional roller-coaster took me from great pride in what a great idea this was to despair over how incredibly stupid I’ve been.

I’m obviously too close to the project.

For an idea of what drove some of the book, check out the posts on the “Chord Star.”

But you can also see the basic idea in the book cover. See the spiral made of triangles? Imagine measuring the hypotenuses of those triangles, and plotting the lengths as a function of “triangle number.” That’s the graph you see. What’s a good function for modeling that data?

If we’re experienced in these things, we say, oh, it’s exponential, and the base of the exponent is the square root of 2. But if we’re less experienced, there are a lot of connections to be made.

We might think it looks exponential, and use sliders to fit a curve (for example, in Desmos or Fathom; here is a Desmos document with the data you can play with!) and discover that the base is close to 1.4. Why should it be 1.4? Maybe we notice that if we skip a triangle, the size seems to double. And that might lead us to think that 2 is involved, and gradually work out that root 2 will help.
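If you want to see the slider discovery play out numerically, here is a sketch in Python: generate the hypotenuse lengths (assuming a spiral of right isosceles triangles, where each hypotenuse becomes a leg of the next triangle), then fit y = a·bˣ by least squares on the logs and recover the base.

```python
import math

# Hypotenuse lengths for a spiral of right isosceles triangles:
# each hypotenuse is a leg of the next triangle, so lengths grow by sqrt(2)
lengths = [math.sqrt(2) ** n for n in range(10)]

# Fit y = a * b^x by least squares on log(y) versus x;
# the slope of that line is log(b)
xs = list(range(len(lengths)))
logs = [math.log(y) for y in lengths]
n = len(xs)
xbar, lbar = sum(xs) / n, sum(logs) / n
slope = sum((x - xbar) * (l - lbar) for x, l in zip(xs, logs)) / \
        sum((x - xbar) ** 2 for x in xs)
base = math.exp(slope)

print(f"fitted base: {base:.4f}")  # close to 1.4142, i.e., sqrt(2)
```

A student playing with sliders lands on 1.4 empirically; the log-linear fit is just the grown-up version of the same move.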

Or we might start geometrically, and reason about similar triangles. And from there gradually come to realize that the a/b = c/d trope we’ve used for years, in this situation, leads to an exponential function, which doesn’t look at all like setting up a proportion.

In either case, we get to make new connections about parts of math we’ve been learning about, and we get to see that (a) you can find functions that fit data and (b) often, there’s a good, underlying, understandable reason why that function is the one that works.

I will gradually enhance the pages on the eeps site to give more examples. And of course you can buy the book on Amazon! Just click the cover image above.


Talking is so not enough

We’re careening toward the end of the semester in calculus, and I know I’m mostly posting about stats, but this just happened in calc and it applies everywhere.

We’ve been doing related rate problems, and had one of those classic calculus-book problems that involves a cone. Sand is being added to a pile, and we’re given that the radius of the pile is increasing at 3 inches per minute. The current radius is 3 feet; the height is 4/3 the radius; at what rate is sand being added to the pile?

Never mind that no pile of sand is shaped like that—on Earth, anyway. I gave them a sheet of questions about the pile to introduce the angle of repose, etc. I think it’s interesting and useful to be explicitly critical of problems and use that to provoke additional calculation and figuring stuff out. But I digress.
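For the record, the arithmetic works out like this. With h = (4/3)r, the cone’s volume collapses to V = (4/9)πr³, so dV/dt = (4/3)πr²·dr/dt. A quick numerical check (units converted to inches; the 5%-of-nothing sand realism is the book’s problem, not mine):

```python
import math

r = 36.0        # current radius: 3 feet = 36 inches
dr_dt = 3.0     # radius growing at 3 inches per minute

# V = (1/3) * pi * r^2 * h, with h = (4/3) * r, gives V = (4/9) * pi * r^3,
# so by the chain rule dV/dt = (4/3) * pi * r^2 * dr/dt
dV_dt = (4.0 / 3.0) * math.pi * r ** 2 * dr_dt

print(f"{dV_dt:.0f} cubic inches per minute")        # 5184*pi, about 16286
print(f"{dV_dt / 12 ** 3:.2f} cubic feet per minute")
```

That’s 5184π in³/min, or about 9.4 cubic feet of sand per minute—which is exactly the kind of number worth asking students to sanity-check against a real sand pile.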

Continue reading Talking is so not enough

Coming (Back) to Our Census

Reflecting on the continuing, unexpected, and frustrating malaise that is Math 102, Probability and Statistics, one of my ongoing problems has been the deterioration of Fathom. It shouldn’t matter that much that we can’t get Census data any more, but I find that I miss it a great deal; and I think that it was a big part of what made stats so engaging at Lick.

So I’ve tried to make it accessible in kinda the same way I did the NHANES data years ago.

This time we have Census data instead of health. At this page here, you specify what variables you want to download, then you see a 10-case preview of the data to see if it’s what you want, and then you can get up to 1000 cases. I’m drawing them from a 21,000-case extract from the 2013 American Community Survey, all from California. (There are a lot more cases in the file I downloaded; I just took the first 21,000 or so, so we could get an idea of what’s going on.)

Continue reading Coming (Back) to Our Census

Blood in the Aisles

I don’t quite know how Beth does it! We’re using Beth Chance and Allan Rossman’s ISCAM text, and on Thursday we got to Investigation 1.6, which is a cool introduction to power. (You were a .250 hitter last season; but after working hard all winter, you’re now a .333 hitter. A huge improvement. You go to the GM asking for more money, but the GM says, I need proof. They offer you 20 at-bats to convince them you’ve improved beyond .250. You discover, through the applets, that you have only a 20% chance of rejecting their null, namely, that you’re still a .250 hitter.)
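That 20% figure is easy to check directly with an exact binomial calculation rather than the applets’ simulation (a sketch in Python; the 5% significance level is my assumption, since the investigation’s exact setup may differ):

```python
from math import comb

def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

n = 20          # at-bats the GM offers
p_null = 0.25   # the GM's null: you're still a .250 hitter
p_true = 1 / 3  # your actual improved average

# Smallest number of hits that rejects the null at the 5% level
crit = next(k for k in range(n + 1) if binom_tail(n, p_null, k) <= 0.05)

# Power: chance a true .333 hitter reaches that many hits
power = binom_tail(n, p_true, crit)
print(f"need {crit} hits in {n} at-bats; power is about {power:.2f}")
```

You need 9 hits in 20 at-bats, and a genuine .333 hitter gets there only about 19% of the time—hence the disgruntlement.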

I even went to SLO to watch Beth Herself run this very activity. It seemed to go fine.

But for my class, it was not a happy experience for the students. There was a great deal of confusion about what exactly was going on, coupled with some disgruntlement that we were moving so slowly.

A number of things may be going on here:

Continue reading Blood in the Aisles

Purposes, Objectives, Goals, Rubrics, ack!

This is all about discomfort. Mine.

In Jason Buell’s post, he reported on an ASCD virtual conference. I’m so glad he did, because I’m afraid that if I had been there, I would have come off the rails. And I would have felt guilty.

I know I’m not alone among teachers in having a hard time remembering the difference between goals and objectives, but I suspect that I AM alone in that among the teachers I respect. Somehow, even though I can get behind the vocab and the changes to day-to-day teaching practices that speakers are trying to promote, keeping track of the details and distinctions eludes me.

Which is embarrassing because for many years I was the consultant, creating vocabulary and rubrics and systems, and speaking to teachers about how to improve their day-to-day teaching. I was passionate about the spectrum of tasks that run from exercises to problems to investigations (circa 1988) and the four-fold thingy that adorned the 1992 California Framework (whose four central ideas I cannot remember, let me grab that old Framework from the shelf…), ah yes, mathematically powerful students (remember mathematical power?) use mathematical thinking, mathematical tools and techniques, mathematical ideas, and communication.

Anyway, in Jason’s post I was doing fine until—because I’m so passionate about modeling—I clicked on the link for a modeling and purpose rubric and got that churning feeling in my stomach that I get whenever I have to open a Word doc—no! not that!—whenever I see a big chart with four or five columns of earnest, similar-but-meaningfully-different chunks of narrowly-set text that shouts rubric!

rubric graphic

At this point, when I recovered my composure enough to read a little, I discovered that of course we have a vocabulary collision: I’m interested in the kind of modeling that’s a mathematical practice in the once-new Core Standards, but this is from the S part of ASCD, not the CD part, and we’re talking about the teaching practice of modeling for students rather than the modeling that’s using mathematics to represent and understand something in the real world.

When I calmed down, I tried a different link in the same post for “Gradual Release of Responsibility” which has the excellent acronym GRR, and was taken to a post in John Golden’s “Math Hombre” blog, where (without reading anything) I saw this graphic:

gradual-release graphic

And even without reading the surrounding text or even the little text in the graphic itself, I could grok the main point: we do all of this, and move from left to right gradually. This is a good reminder about how to plan a year or plan an individual lesson, and how to make a decision about a “teacher move” I might make, for example, actually decide on the fly whether to answer a student’s question or not.

Whereas I’m really not going to look at that rubric in order to see how to prepare a lesson. It may indeed be good for supervisors, to help them as they observe me. But it is, through its vocabulary and style, removed from instructional decisions. For example: if I got a 3-Proficient on “The essential lesson elements of guided, collaborative, and independent tasks accurately reflect the established purpose,” and I wanted to get a 4-Exemplary, I’d be hard pressed to know what it would mean that “All tasks that students actually complete throughout the lesson reflect the content and language purposes.” In fairness, I might be able to parse that temporarily, while I was sitting down with an observer, but the sentence is so packed with requirements that I would not be able to hold them all in my mind when I plan the next lesson. (Not most tasks, all tasks. Not tasks they have begun, only tasks they complete. Not only tasks at the end, but all tasks throughout the lesson. Not just content, but content and language.)

What do I make of this?

First, I have some pathetic form of ADD where I gravitate towards a simple graphic rather than making the effort to read and understand text. After a career of making nuanced suggestions, I love the quick visual slogan. There is something noble in this, but it’s also deeply troubling.

But (and this may be the main point) I wonder: how and when does a good idea morph from being useful into being a system that is too hard for me to apply?

And let me add a related post-script: when does Dan Meyer’s excellent trope on perplexity become Perplexity™ the brand? I worry that the advent of perplexity scores is a dangerous step along that road.