A dataset is presented that contains cortisol level determinations made on two samples of women at childbirth. The subjects in Group 1 underwent c-sections following induced labor, and the subjects in Group 2 delivered either vaginally or by c-section following spontaneous labor. Using the sample data, we will determine whether it can be inferred that the mean cortisol level differs between mothers undergoing the two types of childbirth.
To determine whether a t-test is appropriate for this dataset, we must check that its assumptions are met by answering the following three questions:
1. Are the samples independent and random, and are the data interval/ratio?
2. Are the interval/ratio data normally distributed?
3. Can the population variances be assumed equal?
The first assumption is met, given the samples and the type of data in this set. The samples are treated as random, and they are independent because there is no connection between Group 1 with induced labor and Group 2 with spontaneous labor. The cortisol level is the interval/ratio variable whose mean we would like to compare between the two groups. It is measured in equal intervals and has a true zero point, so all mathematical operations on the data are meaningful. These characteristics make the variable interval/ratio in nature.
Next, we need to determine whether the interval/ratio data are normally distributed. This can be done by constructing histograms and evaluating their shape and skewness, while taking the sample size into account.
In evaluating the shape of these histograms, we can see that they appear a little different. In Group 1, the distribution appears left skewed, with the majority of results clustered between 375 and 450 towards the right side of the distribution. The typical measurement is 450-475, with a minimum value of 300 and a maximum of 475. In Group 2, the distribution appears somewhat right skewed, with the bulk of measurements grouped to the left. It is difficult to judge the normality of a distribution with such small samples (Group 1, n = 10; Group 2, n = 12). We can also compare the mean and median of each group using the summary statistics.
Summary statistics for Cortisol:
Group by: Group
Looking at the mean and median of both groups, the mean is less than the median in each case. In Group 1, the mean = 406.3 and the median = 418.5. In Group 2, the mean = 684.67 and the median = 688.5. A mean that is less than the median suggests a left-skewed distribution. For Group 2 this is at odds with the slight right skew seen in the histogram, but the difference between mean and median is relatively small in both groups, particularly in Group 2. I will therefore assume that the data are normally distributed; however, the t-test will be applied with caution.
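The mean-versus-median comparison can be sketched in a few lines of Python. This is only an illustration built from the summary values reported in the text; the function name `skew_direction` and the dictionary layout are assumptions made for this example, and the comparison is a rough indicator of skew, not a formal normality test.

```python
def skew_direction(mean, median):
    """Rough skew indicator: compare the mean with the median."""
    if mean < median:
        return "left"
    if mean > median:
        return "right"
    return "symmetric"


# Summary values reported in the text (hypothetical layout for this sketch)
groups = {
    "Group 1": {"mean": 406.3, "median": 418.5},
    "Group 2": {"mean": 684.67, "median": 688.5},
}

for name, stats in groups.items():
    gap = stats["mean"] - stats["median"]
    direction = skew_direction(stats["mean"], stats["median"])
    print(f"{name}: mean - median = {gap:.2f}, suggests {direction} skew")
```

The size of the gap matters as much as its sign: a gap that is small relative to the spread of the data gives only weak evidence of skew.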
Finally, the third assumption that must be met is equal population variances. To check this, we look at the standard deviation of each group: Group 1 SD = 56.93 and Group 2 SD = 71.81. As a rule of thumb, the variances can be treated as equal if twice the smaller standard deviation exceeds the larger standard deviation. Here 56.93 × 2 = 113.86, which is greater than 71.81, so equal variances can be assumed. If that number were smaller than the larger standard deviation, an alternate form of the t-test would be performed instead.
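The rule of thumb for equal variances can be written as a small check. The sketch below assumes only the two standard deviations reported in the text; the helper name `variances_roughly_equal` is hypothetical and chosen for this illustration.

```python
def variances_roughly_equal(sd_a, sd_b):
    """Rule of thumb: twice the smaller SD must exceed the larger SD."""
    smaller, larger = sorted((sd_a, sd_b))
    return 2 * smaller > larger


# Standard deviations reported in the text
sd_group1 = 56.93
sd_group2 = 71.81

if variances_roughly_equal(sd_group1, sd_group2):
    print("Equal variances can be assumed: use the pooled t-test")
else:
    print("Equal variances cannot be assumed: use the alternate form of the t-test")
```

Sorting the two values first means the check does not depend on which group is passed in first.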
In conclusion, because the assumptions listed above are reasonably satisfied, a t-test is appropriate for these data.
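As a sketch of what the equal-variance (pooled) t-test would involve, the test statistic can be computed directly from the summary statistics reported above. The raw measurements are not reproduced here, so this example relies only on the reported means, standard deviations, and sample sizes; the function name `pooled_t` is an assumption for illustration. The resulting statistic would still need to be compared against a t distribution with n1 + n2 - 2 degrees of freedom.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming equal population variances."""
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two sample means
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Summary statistics reported in the text
t_stat, df = pooled_t(406.3, 56.93, 10, 684.67, 71.81, 12)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```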