**Confidence Interval Project**


In this report, I demonstrate the role of the "level of confidence" in confidence intervals. A confidence interval indicates the reliability of an estimate, and the level of confidence determines how often intervals constructed this way will contain the parameter.

First, I generated 100,000 Bernoulli random variables. A Bernoulli random variable records a categorical outcome as a 0 or a 1, with 0 representing failure (the individual does not have the characteristic) and 1 representing success (the individual does have the characteristic). Then I constructed 1000 95% confidence intervals for a population proportion. In my simulation, 964 of the 1000 intervals (964/1000 = 96.4%) captured the population proportion, which is close to the nominal 95%. So, when we speak of a 95% confidence interval, the "level of confidence" represents the proportion of all possible intervals that will capture the unknown parameter.
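The simulation described above can be sketched in a few lines of Python. This is only an illustration, not the StatCrunch procedure used for the report: the true proportion `p = 0.25` and the sample size of 100 per interval (100,000 values split into 1000 groups) are assumptions, since the report does not state either, and the function names are hypothetical.

```python
import math
import random

def wald_interval(successes, n, z=1.96):
    """Standard (Wald) interval for a proportion; z = 1.96 gives roughly 95%."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

def coverage(p=0.25, n=100, n_intervals=1000, seed=1):
    """Fraction of simulated intervals that contain the true proportion p.

    p and n are assumed values chosen for illustration only.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_intervals):
        # One sample of n Bernoulli(p) outcomes: 1 = success, 0 = failure.
        successes = sum(rng.random() < p for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= p <= high:
            hits += 1
    return hits / n_intervals

print(coverage())
```

Running it with different seeds should give capture rates that vary from run to run but stay near 0.95, which is the behavior the 964/1000 result illustrates.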

Next, I used a data set of 32 people who were asked how many days a week they exercise, and obtained a 95% confidence interval for the mean. To do that, I had to make sure that the requirements for constructing a confidence interval were met. That is, the sample size must be large, or the population must be normally distributed:

1. For a proportion, if n ≤ 0.05N and np(1-p) ≥ 10, the sampling distribution of the sample proportion is approximately normal, or

2. For a mean, if the population is not normal but the sample size is large (n ≥ 30), the sampling distribution of the sample mean is approximately normal.
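The two requirements listed above can be written as simple checks. This is a minimal sketch; the function and parameter names are hypothetical and are not part of the original analysis.

```python
def proportion_conditions_met(n, p, population_size):
    """Condition 1: n <= 0.05N and np(1-p) >= 10."""
    return n <= 0.05 * population_size and n * p * (1 - p) >= 10

def mean_conditions_met(n, population_is_normal=False):
    """Condition 2: a normal population, or a large sample (n >= 30)."""
    return population_is_normal or n >= 30

# The exercise data set has 32 respondents, so the large-sample rule applies.
print(mean_conditions_met(32))
```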

Because the sample size is 32 (n ≥ 30), this data set meets the requirement. Because the population standard deviation is unknown, I used the t-distribution to construct the interval. The confidence interval was (2.191215, 3.808785), so we can be 95% confident that the true population mean number of days that people exercise per week falls between these values.
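For readers who want to see the arithmetic behind a t-interval, here is a rough sketch in Python. The `days` list is made-up data, not the report's Exercise Habits data set, so the output will not match the interval above. The critical value 2.0395 is the usual table value for 95% confidence with 31 degrees of freedom, and the helper name `t_interval` is hypothetical. Note that the reported interval is symmetric about 3.0, which implies a sample mean of 3.0 days.

```python
import math
import statistics

def t_interval(data, t_critical):
    """Confidence interval for a mean: sample mean +/- t* x (s / sqrt(n))."""
    n = len(data)
    mean = statistics.mean(data)
    std_err = statistics.stdev(data) / math.sqrt(n)  # uses the sample std. dev.
    margin = t_critical * std_err
    return mean - margin, mean + margin

# HYPOTHETICAL responses (days of exercise per week) for 32 people.
# These are NOT the values from the report's data set.
days = [0, 0, 1, 1, 2, 2, 2, 3,
        3, 3, 3, 3, 3, 4, 4, 4,
        4, 5, 5, 5, 6, 7, 0, 1,
        2, 3, 3, 4, 5, 6, 7, 2]

T_CRIT_DF31 = 2.0395  # t critical value for 95% confidence, df = 32 - 1 = 31

low, high = t_interval(days, T_CRIT_DF31)
print(low, high)
```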

-Lauren Bibeau

**Result 1: One sample Proportion with data**

95% confidence interval results:
Outcomes in: Bernoulli1
Success: 1
Group by: Sequence
p: proportion of successes
Method: Standard-Wald
Columns added to the data table: Sequence, Count, Total, Sample Prop., Std. Err., L. Limit, U. Limit

**Result 2: One sample T statistics with data**

95% confidence interval results:
μ: mean of Variable
Columns added to the data table: Variable, Sample Mean, Std. Err., DF, L. Limit, U. Limit

**Data set 1. simulated data**

**Data set 2. Exercise Habits**

**Comments**

By aadom81, May 1, 2010

This is getting interesting. So what proportion was captured in that interval?

By salmanzg, Apr 27, 2010

I should also add that we can flip the interpretation of 0s and 1s, as long as we understand that the proportion of population will be flipped too. (0 vs 1 corresponding to 25% vs 75%)

By salmanzg, Apr 27, 2010

My understanding is that 1 represents success and 0 represents failure. It's the same reason why we give 1 as success in the dialog when we generate the confidence interval. (Salman)

By jbaugh57, Apr 27, 2010

I am so confused as to how 0 and 1 can both be failures. Many people have that in their reports. If a Bernoulli variable only takes the values 0 and 1, does that mean everything is a failure? I'm not sure if it was a typo or not... Can someone help me understand this better? Either way, Lauren, good job on your report; you could have visually shown the normality of the distribution by using a histogram or boxplot. Overall, it was organized well and easy to understand. Excellent!