Created: Oct 20, 2015

Sampling Distribution for a Proportion

In this report we look specifically at the sampling distribution of a proportion. Sex is recorded as either male or female (by X and Y chromosomes). The two are spread about evenly through the population, although we observe slightly more women. So I generated a random sample of 10,000 individuals. Let's look at the distribution now:

Result 1: Histogram Population Proportion

As you can see, with a proportion, you either have a trait or you don't. For instance, "are you male?" This leads to two modes in the population, yes = 1 and no = 0. However, when we take samples and average the yeses and noes, we see a familiar pattern:

Result 2: Histogram Sample Proportion size 10

With a sample size of 10, we see the normal distribution starting to form as the CLT manifests in the sampling distribution. The standard deviation has also gone from .5 in the population to about .15 in the sampling distribution. Why?

Well, as we saw before, the variation decreases as the sample size increases. The formula is a bit different, though. We calculate it with:

s.d.(p̂) = √(p(1-p)/n), which in this case is s.d.(p̂) = √(.50(1-.50)/10) = √(.25/10) = √(.025) = .158

which is very close to the observed value in the above sampling distribution.
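The comparison above can be reproduced outside StatCrunch. The following is a minimal sketch, assuming NumPy is available; the population is simulated here with p = .50 rather than taken from the report's data set, and the seed, the number of repetitions, and the helper name `sample_proportions` are all illustrative choices, not part of the original report.

```python
import numpy as np

# Seed chosen arbitrarily so the sketch is reproducible (an assumption).
rng = np.random.default_rng(seed=0)

# Hypothetical population: 10,000 individuals coded 1 (yes) or 0 (no).
p = 0.5
population = rng.binomial(1, p, size=10_000)

def sample_proportions(pop, n, reps, rng):
    """Draw `reps` samples of size `n` (with replacement) and
    return the proportion of 1s in each sample."""
    samples = rng.choice(pop, size=(reps, n), replace=True)
    return samples.mean(axis=1)

n = 10
p_hats = sample_proportions(population, n, reps=20_000, rng=rng)

# Expected s.d. from the formula, using the proportion in the simulated population.
p_pop = population.mean()
expected_sd = np.sqrt(p_pop * (1 - p_pop) / n)

# Observed s.d. of the simulated sample proportions.
observed_sd = p_hats.std()

print(f"expected s.d. = {expected_sd:.3f}")
print(f"observed s.d. = {observed_sd:.3f}")
```

The two printed values should agree closely, just as the histogram's standard deviation agrees with the formula; changing `n` to 50 or 100 should show the same agreement at a smaller spread.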

If I increase the sample size to 50, we see the following:

Result 3: Histogram Sample Proportion size 50

where the expected s.d. = √(.50(1-.50)/50) = √(.25/50) = √(.005) = .071

And lastly, we increase the sample size to 100, with the following result:

Result 4: Histogram Sample Proportion size 100

where the expected s.d. = √(.50(1-.50)/100) = √(.25/100) = √(.0025) = .05

So the Central Limit Theorem is working, and the standard deviations are shrinking according to s.d.(p̂) = √(p(1-p)/n),

and this makes us happy!
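The three expected values worked out above can be collected in one place. This is a small sketch using only the formula and p = .50 from the report; the function name `sd_p_hat` is a hypothetical helper introduced for illustration.

```python
import math

def sd_p_hat(p, n):
    """Standard deviation of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.50
expected = {n: sd_p_hat(p, n) for n in (10, 50, 100)}

for n, sd in expected.items():
    print(f"n = {n:>3}: s.d.(p-hat) = {sd:.3f}")
# n =  10: s.d.(p-hat) = 0.158
# n =  50: s.d.(p-hat) = 0.071
# n = 100: s.d.(p-hat) = 0.050
```

Note that going from n = 10 to n = 100 multiplies the sample size by 10 but shrinks the standard deviation only by a factor of √10, because n sits under the square root.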

NOTE: p(1-p) = p - p^2 is the variance of a 0/1 variable, so we are taking the square root of a squared term divided by n, just like with means.
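The note can be checked numerically. The sketch below assumes NumPy and a simulated 0/1 population (not the report's own data set, and with an arbitrary seed); it compares the variance computed directly from the data with p - p^2, then divides the resulting standard deviation by √n the way one would for a mean.

```python
import numpy as np

# Hypothetical 0/1 population; seed chosen arbitrarily (an assumption).
rng = np.random.default_rng(seed=1)
x = rng.binomial(1, 0.5, size=10_000)

p = x.mean()             # proportion of 1s
var_direct = x.var()     # population variance computed from the data (ddof=0)
var_formula = p - p**2   # same thing as p(1-p)

print(f"variance from data : {var_direct:.6f}")
print(f"p - p^2            : {var_formula:.6f}")

# Treating p-hat as a mean of 0s and 1s: s.d. = sigma / sqrt(n).
n = 100
sd_as_mean = np.sqrt(var_direct) / np.sqrt(n)
sd_as_proportion = np.sqrt(p * (1 - p) / n)
print(f"sigma / sqrt(n)    : {sd_as_mean:.4f}")
print(f"sqrt(p(1-p)/n)     : {sd_as_proportion:.4f}")
```

The two variances match because squaring a 0 or a 1 leaves it unchanged, so the mean of the squares is again p. The formula for proportions is therefore the formula for means applied to 0/1 data.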

Result 5: Compute Expression sd 100
 New column, sqrt(.501 * (1 - .501) / 10), added to data table!

Data set 1. Sampling Distribution for Proportion