Simple Sampling Distribution Simulation in Fathom

What we're looking for. Result from 500 runs of the simulation.

Yesterday’s APstat listserve had a question about Fathom:

How do I create a simulation to run over and over to pick 10 employees.  2/3 of the employees are male

Since my reply to the listserve had to be in plain old text, I thought I’d reproduce it here with a couple of illustrations…

There are at least two basic strategies. I’ll address just one; this is the quick one and uses random number-ish functions. The others mostly use sampling. If you use TPS, I think it’s Chapter 8 of the Fathom guide that explains them in excruciating detail 🙂

Okay: we’re going to pick 10 employees over and over from a (large) population in which 2/3 of the employees are male.

(Why large? To avoid “without-replacement” issues. If you were simulating layoffs of 10 employees from an 18-employee company, 12 of whom were male, you would need to use sampling and make sure you were sampling without replacement.)
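For the curious, here is a rough sketch of that small-company case in Python rather than Fathom. The 18-person staff list and the function name `layoff_males` are made up for illustration; the point is only that `random.sample` draws without replacement, so nobody gets laid off twice.

```python
import random

# Hypothetical 18-person company from the example: 12 men, 6 women.
staff = ["male"] * 12 + ["female"] * 6

def layoff_males(rng, n_laid_off=10):
    """Count the men among n_laid_off employees drawn without replacement."""
    chosen = rng.sample(staff, n_laid_off)  # sample() never picks the same person twice
    return chosen.count("male")

rng = random.Random(1)  # fixed seed so the run is repeatable
counts = [layoff_males(rng) for _ in range(500)]
print(min(counts), max(counts))
```

With only 6 women on staff, every group of 10 has to include at least 4 men, which is exactly the kind of constraint the large-population shortcut ignores.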

(1) Make a collection with one attribute, sex, and 10 cases

(2) Give the sex attribute this formula:

randomPick( "male", "male", "female" )


The "if" version of the formula as seen in Fathom's formula editor.

(I like this formula best for kids early in the course; for an alternative, start typing:

if ( random( ) < (2/3) ) or if ( random( ) < 0.67 )
and then when the brackets show up, put “male” in the top (true) slot and “female” in the bottom. See the illustration.)

(3) The sex values fill in with random genders in the right proportions. Test by choosing Rerandomize from the Collection menu.
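If it helps to see the two formulas side by side outside Fathom, here is a minimal Python sketch of the same two ideas. The function names are my own invention, and `rng.choice` and `rng.random` are stand-ins for Fathom's randomPick and random, not the same functions.

```python
import random

rng = random.Random()  # unseeded, so each run "rerandomizes"

def pick_sex_by_list(rng):
    # Analogue of the randomPick formula: each listed value is equally likely,
    # and "male" is listed twice out of three.
    return rng.choice(["male", "male", "female"])

def pick_sex_by_threshold(rng):
    # Analogue of the "if" formula: "male" when a uniform draw falls below 2/3.
    return "male" if rng.random() < 2 / 3 else "female"

# A 10-case collection, like the one built in steps (1) and (2).
collection = [pick_sex_by_list(rng) for _ in range(10)]
print(collection)
```

Either function gives "male" with probability 2/3; the list version is just easier to explain to kids early in the course.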

NOW presumably you want to create a sampling distribution of the number of males in the collection.

(4) Double-click the collection to open its Inspector. Go to the Measures tab. Measures are like attributes of the whole collection: statistics that summarize the set of data, rather than values about individuals.

The "Measures" panel in the inspector. We have just defined Nmales, and there are 6 of them.

(5) Make a <new> measure called something like Nmales. Double-click its “Formula” box and give it a formula like this one:

count( sex = "male" )

Notice how it fills in and how, when you rerandomize, the number changes.

(6) With the collection selected, choose Collect Measures from the Collection menu. A new collection appears, called “Measures from <whatever>”.

(7) Make a table to see that now you have 5 values of Nmales. Cool. You want more!

Here is where you change how many times you simulate.

(8) In the Collect Measures panel of the new collection’s inspector (it’s open by default) you can change the number of measures (the number of re-randomizations) you collect. You can also turn off animation if you don’t want to wait. On the other hand, if you graph the distribution of Nmales, you can watch it grow if you have animation on.
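The whole procedure, from the 10-case collection through collecting measures, can be sketched in Python roughly like this. It is an analogy, not what Fathom does internally: `n_males` plays the role of the Nmales measure, `collect_measures` plays the role of Collect Measures, and both names (and the seed) are arbitrary choices of mine.

```python
import random
from collections import Counter

def n_males(rng, n_cases=10, p_male=2 / 3):
    """One rerandomization: fill n_cases cases and count the males."""
    return sum(rng.random() < p_male for _ in range(n_cases))

def collect_measures(n_runs, seed=None):
    """Rerandomize n_runs times, keeping one Nmales value per run."""
    rng = random.Random(seed)
    return [n_males(rng) for _ in range(n_runs)]

measures = collect_measures(500, seed=2015)  # 500 runs, like the graph at the top
tally = Counter(measures)

# Crude text version of the distribution: Nmales value, then how often it occurred.
for k in range(11):
    print(f"{k:2d} {tally[k]:4d}")
```

Changing the 500 is the counterpart of changing the number of measures in the inspector.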

In any case, when you’re done you have a distribution like the one in the graph at the top of the post, and you can ask the important questions such as, how likely is it that all 10 chosen employees would be male if the whole thing were done by chance alone? Answer: not bloody.
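To put a number on it: if each pick is independently male with probability 2/3 (the large-population assumption from above), the count of males in 10 picks follows a binomial distribution. Here is a short Python check; `binom_pmf` is a helper I wrote for this sketch, not a library function.

```python
from math import comb

p_male = 2 / 3
n = 10

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# All 10 male is the k = n case, which reduces to (2/3) ** 10.
p_all_male = binom_pmf(n, n, p_male)
print(round(p_all_male, 4))  # 0.0173
```

That is a bit under 2%, so in 500 runs you would expect somewhere around 9 all-male samples, give or take.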


Published by

Tim Erickson

Math-science ed freelancer and sometime math teacher. In 2014–15, at Mills College in Oakland, California.
