
Forbes - America's Best Colleges, 2010 Project Part 2
Generated Nov 12, 2017 by vta15

Forbes - America's Best Colleges, 2010 Owner: cdcummings12

<data1>

<result1>

Comment on your scatter plot:

- The 2 quantitative variables that I used were Cost and Total Student Population.

- My scatterplot displays a weak negative correlation. The x-axis is the total student population and the y-axis is the cost. The points are widely scattered, but there is a slight tendency for schools with fewer students to be more expensive.

- There are no outliers in this data.

- An appropriate significance level for this analysis is 0.05. The margin of error is neither very small nor very large for this data, so a significance level of 0.05, which corresponds to a 95% confidence level, is appropriate.

<result2>

Line of Best Fit:

y = cost

x = total student population

y = 46225.923 - 0.029189982x

or Cost = 46225.923 - 0.029189982 Total Student Population
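To make the fitted line concrete, the short sketch below plugs a few enrollments into the equation. The coefficients are the ones reported above; the function name `predicted_cost` and the three enrollment values are illustrative assumptions, not part of the original analysis.

```python
# Coefficients copied from the line of best fit above.
INTERCEPT = 46225.923
SLOPE = -0.029189982  # change in predicted cost per additional student


def predicted_cost(total_students: float) -> float:
    """Return the cost predicted by the line of best fit."""
    return INTERCEPT + SLOPE * total_students


# Hypothetical enrollments, chosen only for illustration.
for students in (2_000, 10_000, 30_000):
    print(f"{students:>6} students -> predicted cost {predicted_cost(students):,.2f}")
```

Because the slope is so close to zero, the predicted cost barely changes even across very different enrollments.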

Comment on your correlation coefficient.

R (correlation coefficient) = -0.031224727. This value indicates a very weak negative relationship; there is practically no linear correlation between cost and total student population.

Are your terms significant?

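Only the intercept is significant (p < 0.0001). The slope for Total Student Population has a p-value of 0.7602, well above 0.05, so Total Student Population is not a significant predictor of Cost.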

<result3>

Interpret your line of best fit

The line of best fit has a slightly negative slope: predicted cost falls by roughly 29 for every 1,000 additional students. Most of the points sit in the top left of the plot and very few extend toward the bottom right, so the line looks almost perfectly horizontal while still sloping slightly downward. This reflects the very weak negative correlation. Most of the universities on the list have a low student population and a high cost.

Interpret r^2

R-sq = 0.00097498359. The R^2 is very small: the line of best fit explains less than 0.1% of the variation in cost, so it is not a strong representation of the data. This is because the majority of the schools on the Forbes Best Colleges in the USA list for 2010 have a low student population and a high cost, which leaves little linear trend for the line to capture.
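In simple linear regression, R-sq is the square of the correlation coefficient, so the reported value can be cross-checked from R. This is a minimal sketch using only the R value from the regression output; the variable names are illustrative.

```python
r = -0.031224727  # correlation coefficient from the regression output

r_squared = r ** 2
print(f"r^2 = {r_squared:.11f}")
print(f"Share of variation in cost explained: {r_squared * 100:.3f}%")
```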

Correlation and Causation

Cost and Total Student Population are only very weakly correlated (R = -0.031), and a correlation by itself does not establish causation. There is still a plausible explanation for the pattern in the scatter plot. The majority of schools that were ranked best in the USA according to Forbes in 2010 have a low student population. This allows for a more hands-on learning experience and helps make these colleges as great as they are. A smaller school can offer a more personalized education, which may push the cost higher. All but 3 of the colleges are private, and private colleges in general tend to have small class sizes and a high cost, so the type of school may account for both variables.

Extra Credit: QQ Plots

<result4>

Do your expected values follow a normal distribution?

Yes, they follow a normal distribution.

<result5>

Yes, they follow a normal distribution.
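A QQ plot pairs each sorted observation with the value a normal distribution would predict at the same rank; points that fall close to a straight line suggest normality. The sketch below shows one way those pairs could be computed. The helper `qq_points`, the plotting position `(i + 0.5) / n`, and the sample values are all assumptions for illustration; the sample is made up and is not the Forbes data.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pair each sorted observation with its expected normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    # (i + 0.5) / n is one common choice of plotting position.
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


# Made-up values for illustration only; they are NOT from the Forbes data set.
sample = [1800, 2100, 2400, 2600, 3100, 4500, 9800, 15200, 27000]
for expected, observed in qq_points(sample):
    print(f"expected {expected:>10.1f}   observed {observed:>7}")
```

If the observed values drift away from the expected ones at either end, the tails are heavier or more skewed than a normal distribution would produce.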

Result 1: Scatter Plot 1

Result 2: Simple Linear Regression 1
Simple linear regression results:
Dependent Variable: Cost
Independent Variable: Total Student Population
Cost = 46225.923 - 0.029189982 Total Student Population
Sample size: 98
R (correlation coefficient) = -0.031224727
R-sq = 0.00097498359
Estimate of error standard deviation: 9809.4697

Parameter estimates:
Parameter   Estimate       Std. Err.     Alternative   DF   T-Stat        P-value
Intercept   46225.923      1222.731      ≠ 0           96   37.805472     <0.0001
Slope       -0.029189982   0.095364722   ≠ 0           96   -0.30608785   0.7602
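The T-Stat and P-value columns can be reproduced from the estimates and standard errors: each t statistic is the estimate divided by its standard error, and the two-sided p-value is the tail area of a t distribution with 96 degrees of freedom. The sketch below is one possible way to do this with only the standard library; it integrates the t density numerically with Simpson's rule, which is an assumption of this example and not necessarily how StatCrunch computes the values. The function names are illustrative.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value via Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


DF = 96
ALPHA = 0.05
# (estimate, standard error) pairs copied from the parameter table.
terms = {
    "Intercept": (46225.923, 1222.731),
    "Slope": (-0.029189982, 0.095364722),
}
results = {}
for name, (estimate, std_err) in terms.items():
    t_stat = estimate / std_err
    p_value = two_sided_p(t_stat, DF)
    results[name] = (t_stat, p_value)
    verdict = "significant" if p_value < ALPHA else "not significant"
    print(f"{name}: t = {t_stat:.4f}, p = {p_value:.4f} ({verdict} at {ALPHA})")
```

A term counts as significant at the 0.05 level only when its p-value falls below 0.05.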

Analysis of variance table for regression model:
Source   DF   SS            MS          F-stat        P-value
Model    1    9015363.3     9015363.3   0.093689771   0.7602
Error    96   9.2376667e9   96225695
Total    97   9.2466821e9
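The entries in the analysis of variance table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, the F statistic is the ratio of the two mean squares, and the error standard deviation is the square root of the error mean square. With a single predictor, F also equals the square of the slope's t statistic. The sketch below checks these relationships against the reported values; the variable names are illustrative.

```python
import math

# Values copied from the analysis of variance table.
ss_model, df_model = 9015363.3, 1
ss_error, df_error = 9.2376667e9, 96

ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_stat = ms_model / ms_error
error_sd = math.sqrt(ms_error)  # "Estimate of error standard deviation"

t_slope = -0.30608785  # slope T-Stat from the parameter estimates

print(f"MS error        = {ms_error:,.0f}")
print(f"F-stat          = {f_stat:.9f}")
print(f"Error std. dev. = {error_sd:.4f}")
print(f"t_slope squared = {t_slope ** 2:.9f}")
```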

Result 3: Line of Best Fit Scatter

Result 5: QQ Plot Student Pop