**Descriptive Statistics EbL report by Ebeltran**


I chose to perform my analysis on new cases of cancers by state in 2018.

I first ran summary statistics and obtained the following results:

### Summary statistics:

Column | n | Mean | Variance | Std. dev. | Std. err. | Median | Range | Min | Max | Q1 | Q3 | IQR | Coef. of var. | Mode |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
All Sites | 51 | 34026.667 | 1.3230552e9 | 36373.825 | 5093.3543 | 25080 | 175350 | 2780 | 178130 | 8600 | 37250 | 28650 | 106.898 | No mode |

Data were obtained from the American Cancer Society's Cancer Facts & Figures 2018 at cancer.org and cover all 50 states plus the District of Columbia, for a total of 51 observations.

The average or mean number of new cases of cancer by state in 2018 was 34,026.7.

The median number of new cancer cases was 25,080, well below the mean. The mean, unlike the median, is pulled upward by the extreme values of California (178,130), Florida (135,170), New York (110,800), Pennsylvania (80,960), and Texas (121,860). This suggests a right-skewed distribution, as shown on the boxplot below.

I did not find a mode in my summary statistics.

The standard deviation is about 36,373.8 new cancer cases, which means the standard deviation is larger than the mean (a coefficient of variation of about 106.9%). The variance, about 1.32 × 10^9, is the standard deviation squared.
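As a quick check on how these spread measures relate to each other, here is a minimal Python sketch. It is not part of the original output; it only plugs the values from the summary table into the standard formulas (variance = standard deviation squared, standard error = standard deviation / √n, coefficient of variation = standard deviation / mean × 100). The variable names are my own.

```python
import math

# Values copied from the first summary statistics table.
n = 51
mean = 34026.667
std_dev = 36373.825
reported_variance = 1.3230552e9
reported_std_err = 5093.3543
reported_cv = 106.898

variance = std_dev ** 2            # variance = (std. dev.)^2
std_err = std_dev / math.sqrt(n)   # std. err. = std. dev. / sqrt(n)
cv_percent = std_dev / mean * 100  # coef. of var., as a percentage

print(f"variance  = {variance:.4e} (table: {reported_variance:.4e})")
print(f"std. err. = {std_err:.4f} (table: {reported_std_err})")
print(f"coef. var = {cv_percent:.3f}% (table: {reported_cv}%)")
```

Small differences in the last digits are expected, because the table values are already rounded.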

The range of the data is 175,350, which is the difference between the largest number of new cases per state (178,130) and the smallest (2,780).

The five-number summary is (2,780, 8,600, 25,080, 37,250, 178,130). In the boxplot below, the right whisker ends at 68,470, the largest value that is not an outlier. The IQR is 28,650, meaning the middle 50% of states span 28,650 new cases, from 8,600 to 37,250. The other 50% of states fall below 8,600 or above 37,250 new cancer cases.
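To see why the five largest states stand apart, here is a small Python sketch that applies the common 1.5 × IQR rule to the quartiles from the table. That the boxplot uses this exact rule is my assumption; the state values are the ones quoted earlier in this report, and the variable names are hypothetical.

```python
# Quartiles copied from the first summary statistics table.
q1, q3 = 8600, 37250
iqr = q3 - q1

# Assumption: outliers are values beyond the 1.5 * IQR fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The five large values named in this report.
large_states = {
    "California": 178130,
    "Florida": 135170,
    "New York": 110800,
    "Pennsylvania": 80960,
    "Texas": 121860,
}

# Keep only the states that fall outside either fence.
flagged = {
    state: cases
    for state, cases in large_states.items()
    if cases < lower_fence or cases > upper_fence
}

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
for state, cases in flagged.items():
    print(f"{state}: {cases} is beyond the upper fence")
```

Under this rule the lower fence is negative, so no state can be a low outlier; only unusually large counts get flagged.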

The boxplot below shows a right-skewed distribution, with a longer right whisker and the outliers plotted beyond it.

Since there were outliers, I reran my analysis without them.

### Summary statistics:

Column | n | Mean | Variance | Std. dev. | Std. err. | Median | Range | Min | Max | Q1 | Q3 | IQR | Coef. of var. | Mode |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
All Sites | 46 | 24096.522 | 3.2909954e8 | 18141.101 | 2674.7594 | 20135 | 65690 | 2780 | 68470 | 8450 | 35520 | 27070 | 75.285143 | No mode |

I can see that the statistics have shrunk. The mean dropped to 24,096.5, the standard deviation fell to 18,141.1, the range decreased by 109,660 to 65,690, and the IQR decreased by 1,580 to 27,070.
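The drop in the mean can be reproduced from the two tables alone. The sketch below (my own illustration, with hypothetical variable names) rebuilds the total from the first table, subtracts the five large state values quoted earlier, and divides by the remaining 46 observations. It also recomputes the changes in the range and IQR.

```python
# Values copied from the two summary statistics tables.
n_all, mean_all = 51, 34026.667
n_trimmed, reported_mean_trimmed = 46, 24096.522

# California, Florida, New York, Pennsylvania, Texas.
removed = [178130, 135170, 110800, 80960, 121860]

# Total = n * mean, so removing values lowers both the total and n.
total_all = n_all * mean_all
mean_trimmed = (total_all - sum(removed)) / (n_all - len(removed))

range_drop = 175350 - 65690  # range before minus range after
iqr_drop = 28650 - 27070     # IQR before minus IQR after

print(f"mean without outliers = {mean_trimmed:.3f} "
      f"(table: {reported_mean_trimmed})")
print(f"range decreased by {range_drop}, IQR decreased by {iqr_drop}")
```

The range falls far more than the IQR because the range depends directly on the largest value, while the quartiles barely move when a few extreme observations are removed.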

The new boxplot still shows a right-skewed distribution.

It was shocking to find out that in Illinois a total of 66,330 new cases of cancer were diagnosed. I was a part of this total, which is why this data interests me. I was diagnosed in February of 2018 with high-grade B-cell lymphoma. By analyzing this data I learned that new cases of cancer are diagnosed a lot more often than I imagined.

**Result 1: New cases of cancer by state 2018**

**Data set 1. New cases for selected cancers by state, US, 2018**
