Descriptive Statistics EbL - Ethan Norris
I chose to perform my analysis on historical life expectancy in the United States.

This data spans from 1929 to 2015 and is sourced from the CDC's 'National Vital Statistics Reports, Vol. 67, No. 7, November 13, 2018'.

After removing 1 outlier, running summary statistics on this data gave the following results:

Summary statistics

Column          Life Expectancy
n               86
Mean            71.418605
Variance        29.113062
Std. dev.       5.3956521
Std. err.       0.58182813
Median          71.3
Range           20.4
Min             58.5
Max             78.9
Q1              68.4
Q3              75.8
IQR             7.4
Mode            70.2
Coef. of var.   7.5549671
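Several of the values in the summary table follow directly from others, so they can be cross-checked by hand. The short Python sketch below does this using only numbers copied from the table. The variable names are my own, and it assumes the variance reported is the sample variance and that the coefficient of variation is expressed as a percentage.

```python
import math

# Values copied from the summary statistics table.
n = 86
mean = 71.418605
variance = 29.113062
minimum, maximum = 58.5, 78.9
q1, q3 = 68.4, 75.8

# Derived statistics, recomputed from the values above.
std_dev = math.sqrt(variance)        # standard deviation = sqrt(variance)
std_err = std_dev / math.sqrt(n)     # standard error of the mean
value_range = maximum - minimum      # range = max - min
iqr = q3 - q1                        # interquartile range = Q3 - Q1
coef_var = std_dev / mean * 100      # coefficient of variation, in percent

print(f"Std. dev.:     {std_dev:.4f}")
print(f"Std. err.:     {std_err:.4f}")
print(f"Range:         {value_range:.1f}")
print(f"IQR:           {iqr:.1f}")
print(f"Coef. of var.: {coef_var:.4f}")
```

Each recomputed value should agree with the corresponding table entry to the precision shown.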

The minimum life expectancy was 58.5 years in 1936 and the maximum was 78.9 years in 2014, giving a range of 20.4 years. That is a gain of roughly 35% over the minimum in 78 years.

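The median is 71.3 years and the mode is 70.2 years (seen in 1961, 1964-66, and 1968). The variance is 29.11 years squared and the standard deviation is 5.40 years, so values typically fall within about 5.4 years of the 71.4-year mean. The coefficient of variation of about 7.6% indicates that this spread is small relative to the mean.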

The five-number summary for this data is:

Min: 58.5, Q1: 68.4, Med: 71.3, Q3: 75.8, Max: 78.9



The box in the boxplot is narrow compared to the full min / max spread of this data: the middle 50% of life expectancies fall between 68.4 and 75.8 years, an IQR of 7.4 years against a range of 20.4 years.
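To make the boxplot reading concrete, the sketch below computes how much of the range the box covers and where the whisker fences would fall. It assumes the common 1.5 x IQR rule for flagging outliers. That rule is an assumption on my part, since the method used to identify the removed outlier is not stated here.

```python
# Five-number summary values from this report.
minimum, q1, median, q3, maximum = 58.5, 68.4, 71.3, 75.8, 78.9

iqr = q3 - q1
value_range = maximum - minimum

# Share of the full range covered by the box (Q1 to Q3).
box_share = iqr / value_range

# Fences under the assumed 1.5 x IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# True if every remaining value lies inside the fences.
no_outliers_left = lower_fence <= minimum and maximum <= upper_fence

print(f"Box covers {box_share:.0%} of the range")
print(f"Fences: {lower_fence:.1f} to {upper_fence:.1f}")
print(f"All remaining values inside fences: {no_outliers_left}")
```

Under that assumption, the minimum of 58.5 and the maximum of 78.9 both sit inside the fences, so none of the remaining 86 values would be flagged.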

The 20.4-year range, or life expectancy increase over the entire period, reflects the significant advancements in health and science humanity has accomplished over these years. This change has an obvious positive outcome but also brings with it various negative consequences, such as people outliving their retirement savings and increased strain on the economy, including Social Security and pension plans.

