This data set shows us the relationship between height and shoe size in a sample of 199 people. Height will be our explanatory variable, and shoe size will be our response variable.
Below is a scatter plot of the data shown above. As the scatter plot indicates, the explanatory variable (height) and the response variable (foot size) have a positive correlation: as one goes up, the other tends to go up as well, in a roughly linear way. We can also see that there are some possible outliers in the data. The correlation value is also higher than the critical value, which further supports a positive association.

Correlation between Height and Foot is:
0.6221983 
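
As a rough sketch of how a correlation value like the one above is calculated, the Python below computes Pearson's r from paired measurements. The function name `pearson_r` and the five data pairs are made up for illustration only; they are not the 199 observations in this report, so the printed value will not match 0.6222.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up values for illustration only; NOT the report's data.
toy_height = [60, 64, 68, 72, 76]
toy_foot = [20.0, 20.5, 22.0, 22.5, 24.0]

print(round(pearson_r(toy_height, toy_foot), 4))
```

A value near +1 means the points fall close to an upward-sloping line, a value near 0 means little linear association, and a value near -1 means a downward-sloping line.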
Next we can see the simple linear regression summary, which gives us several pieces of information. The linear correlation coefficient is r = 0.6222. Since r is positive and moderately close to +1, we have evidence of a positive association. The least squares regression line is yhat = 5.6680 + 0.2337(Height). The Rsq value is 0.3871, which means only 38.71% of the variability in foot size is accounted for by the fitted line.
Simple linear regression results:
Dependent Variable: Foot
Independent Variable: Height
Foot = 5.6679645 + 0.23370141 Height
Sample size: 199
R (correlation coefficient) = 0.6222
Rsq = 0.38713068
Estimate of error standard deviation: 1.4213197
Parameter estimates:
Analysis of variance table for regression model:
Predicted values:
Residuals stored in new column, Residuals. 
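
To make the regression output above more concrete, here is a small Python sketch that uses the reported intercept, slope, and correlation. The function name `predict_foot` and the height of 70 are only examples chosen for illustration (the report does not state the units), so this is a sketch under those assumptions rather than part of the original analysis.

```python
# Coefficients and correlation copied from the regression output above
INTERCEPT = 5.6679645
SLOPE = 0.23370141
R = 0.6221983

def predict_foot(height):
    """Predicted foot size from the fitted line: Foot = intercept + slope * Height."""
    return INTERCEPT + SLOPE * height

# In simple linear regression, Rsq is the square of the correlation coefficient
r_squared = R ** 2
print(f"Rsq computed from r: {r_squared:.4f}")

# Hypothetical height of 70 (units not stated in the report)
print(f"Predicted foot size at height 70: {predict_foot(70):.2f}")
```

The slope says that each additional unit of height is associated with a predicted increase of about 0.2337 units in foot size.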

In the residual scatter plot shown below, there is no real pattern. The data is a tad crowded, but that could be because of the large sample size. Since there is no distinct pattern in the plot, a linear model appears to be appropriate for this data.
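
For readers who want to see what each point in a residual plot represents, the sketch below computes one residual (observed minus predicted) using the fitted line from this report. The function names and the example person (height 70, foot size 23.0) are hypothetical and are not taken from the data set.

```python
# Coefficients copied from the fitted line: Foot = 5.6679645 + 0.23370141 Height
INTERCEPT = 5.6679645
SLOPE = 0.23370141

def predict_foot(height):
    """Predicted foot size from the fitted regression line."""
    return INTERCEPT + SLOPE * height

def residual(height, observed_foot):
    """Residual = observed value minus the value predicted by the line."""
    return observed_foot - predict_foot(height)

# Hypothetical person, for illustration only
print(round(residual(70, 23.0), 4))
```

A positive residual means the observed foot size is above the line, and a negative residual means it is below. When the residuals scatter around zero with no pattern, a straight line is a reasonable description of the relationship.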

The histogram of residuals below shows some interesting information. The shape for the most part seems roughly symmetric. There seems to be a slight tail to the right, but it doesn't seem to be of concern. Overall, the histogram is approximately bell-shaped, which suggests the residuals are close to normally distributed. This graph provides stronger evidence that a linear model is appropriate.

Shown below is the boxplot of residuals. The boxplot shows that there are a few outliers in the data. The median is close to the middle of the box, sitting slightly to the left of center. Both tails seem to be about the same length. The few outliers make the distribution slightly skewed right.

Courtney Pearson
Mar 11, 2013
Well done.
Mar 3, 2013
Great report, Courtney! I was surprised to see the correlation was not closer to 1. I would have thought it'd be almost perfect! I, too, wonder if the data would change if the measurements were taken by a doctor. Many people estimate their heights. I do not see the source for your data. Does it say how it was obtained?
Miranda Sorensen
Mar 2, 2013
This data is pretty interesting. I would have assumed that the average person would have feet sized in proportion to their height. I wonder how much the data would change if it were collected through measurement instead of through each person volunteering this information. I know that many people would likely falsify their heights, and if we were talking about weight that would be an even larger lurking variable.
Jeremiah Bly