Creating histograms

This tutorial covers the steps for creating simple histograms in StatCrunch. To begin, load the Exam Scores data set, which is used throughout this tutorial. The data set consists of a single column, Exam 2, holding 23 exam grades for an introductory Statistics course.

Creating a histogram with frequency on the y-axis

To create a histogram of the data in the Exam 2 column, choose the Graph > Histogram menu option. Select the Exam 2 column and click Compute!. By default, StatCrunch will automatically bin the data and plot the frequency (count) of each bin on the y-axis. The resulting histogram shown below has a starting point of 50 and a bin width of 5, so the bins are 50 to 55, 55 to 60, 60 to 65 and so on. StatCrunch creates non-overlapping bins by including the left edge of each bin and excluding the right edge. In this example, a score of 80 falls into the 80 to 85 bin, not the 75 to 80 bin. The y-axis in this case runs from 0 to 7, and the largest frequency, 7, belongs to the 80 to 85 bin.
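The left-edge-included, right-edge-excluded rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not StatCrunch's own code. The function name bin_counts and the scores are hypothetical, and the scores are not the actual Exam Scores data.

```python
import math
from collections import Counter


def bin_counts(values, start, width):
    """Count values per bin [left, left + width), keyed by each bin's left edge."""
    counts = Counter()
    for value in values:
        # floor() places a value that sits exactly on an edge
        # into the bin that begins at that edge
        index = math.floor((value - start) / width)
        counts[start + index * width] += 1
    return dict(sorted(counts.items()))


# Hypothetical scores chosen to sit on or near bin edges
scores = [52, 55, 79.5, 80, 84, 85]
counts = bin_counts(scores, start=50, width=5)

for left_edge, count in counts.items():
    print(f"{left_edge} to {left_edge + 5}: {count}")
```

In this sketch the score of 80 is counted in the 80 to 85 bin, while 79.5 stays in the 75 to 80 bin, matching the behavior described for the Exam 2 histogram.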

Creating a histogram with relative frequency or density on the y-axis

StatCrunch can also plot the relative frequency (proportion) or density associated with each bin on the y-axis. The density of a bin is its proportion divided by the bin width, which makes the total area under the histogram equal to one. These items are available under the Type option. For example, in the window containing the resulting histogram above, choose Options > Edit to reopen the histogram dialog window. Change the Type option to Relative Frequency and click Compute!. The resulting histogram below shows the relative frequency of each bin. The 80 to 85 bin has a relative frequency of 7/23 ≈ 0.304.
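The arithmetic behind the two Type options can be checked with a short Python sketch. The values 23, 7 and 5 come from this tutorial. The list of bin counts is hypothetical (only its total of 23 and its largest count of 7 match the example), and the function names are chosen here for illustration.

```python
import math


def relative_frequencies(counts):
    """Proportion of all observations that fall in each bin."""
    total = sum(counts)
    return [count / total for count in counts]


def densities(counts, width):
    """Proportion in each bin divided by the bin width."""
    return [p / width for p in relative_frequencies(counts)]


# Values stated in the tutorial: 23 scores, 7 of them in the 80 to 85 bin, width 5
n = 23
peak_count = 7
width = 5
peak_relative_frequency = peak_count / n
peak_density = peak_relative_frequency / width
print(round(peak_relative_frequency, 3))  # 0.304
print(round(peak_density, 4))             # 0.0609

# Hypothetical bin counts that sum to 23
counts = [1, 2, 3, 4, 7, 4, 2]
total_area = sum(d * width for d in densities(counts, width))
print(math.isclose(total_area, 1.0))      # True
```

The last check illustrates why density is useful: whatever the bin width, the bar areas of a density histogram add up to one.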

Changing the starting point and/or width of the bins

Under the Bins option, StatCrunch offers the ability to change the binning structure for the histogram. The Start at option sets the location of the left edge of the first bin, while the Width option specifies how wide to make the bins. For this example, in the window containing the resulting histogram above, choose Options > Edit to reopen the histogram dialog window. Change Width to 10 and click Compute!. Now the bins in the resulting histogram shown below are 50 to 60, 60 to 70, 70 to 80 and so on.
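To see how Start at and Width together determine the bins, the following Python sketch lists the bin edges for both widths used in this tutorial. The function name bin_edges and the maximum score of 98 are assumptions made for illustration. How StatCrunch itself chooses the last bin is not described here.

```python
def bin_edges(start, width, max_value):
    """Return edges from start in steps of width until max_value is covered.

    Bins are treated as left-closed, so a value equal to an edge
    needs one more bin to its right.
    """
    edges = [start]
    while edges[-1] <= max_value:
        edges.append(edges[-1] + width)
    return edges


# Hypothetical largest score in the data
max_score = 98

narrow = bin_edges(start=50, width=5, max_value=max_score)
wide = bin_edges(start=50, width=10, max_value=max_score)

print(narrow)  # [50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(wide)    # [50, 60, 70, 80, 90, 100]
```

Doubling the width halves the number of bins, so each bar of the width-10 histogram combines two neighboring bars of the width-5 histogram.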

Adding a distribution overlay

Under Display options, StatCrunch offers a list of distributions that can be overlaid on the histogram as a way of comparing the displayed data to the chosen distribution. For this example, choose Options > Edit to reopen the histogram dialog window. For Overlay distrib., select Normal to add a normal distribution to the histogram. StatCrunch then offers the option to specify the parameters of the distribution. By default, it estimates the best-fitting parameters from the data. In this case, StatCrunch will use the sample mean and sample standard deviation of the scores in the Exam 2 column for the Normal distribution overlay. Click Compute!, and the resulting histogram below has a red curve overlaid reflecting the best-fitting Normal distribution, with its mean and standard deviation shown in the graph title.
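The default parameter estimates can be reproduced outside StatCrunch with Python's statistics module. This is a sketch under stated assumptions: the scores are hypothetical rather than the Exam 2 data, the sample standard deviation is taken to be the usual n - 1 version, and the scaling applied to the curve on a frequency histogram is a common convention, not a documented StatCrunch detail.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical scores, not the actual Exam 2 column
scores = [62, 68, 71, 75, 78, 80, 81, 83, 84, 88, 93]
width = 5  # bin width used for the histogram

mu = mean(scores)      # sample mean
sigma = stdev(scores)  # sample standard deviation (n - 1 divisor)
fitted = NormalDist(mu, sigma)


def density_height(x):
    """Height of the fitted curve on a density histogram."""
    return fitted.pdf(x)


def frequency_height(x):
    """Height after rescaling to a frequency histogram (assumed convention):
    multiply the density by the number of scores and the bin width."""
    return fitted.pdf(x) * len(scores) * width


print(f"mean = {mu:.2f}, std. dev. = {sigma:.2f}")
print(f"curve height at the mean (density scale): {density_height(mu):.4f}")
```

The curve peaks at the sample mean and its spread is set by the sample standard deviation, which is why both values are worth reading from the graph title when judging the fit.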

Displaying values above the bars

StatCrunch can also display the tallied numerical values above each of the bars. As an example, in the window containing the resulting histogram above, choose Options > Edit to reopen the histogram dialog window. When the dialog window reappears, turn on the Value above bar option by checking the associated box under Display options and then click Compute!. The resulting histogram shown below now displays the relative frequencies above each bar. Note that a value may be suppressed if there is not enough room to display it above the corresponding bar.
