Creating boxplots with a group by column
This tutorial covers the steps for creating boxplots in StatCrunch. To begin, load the Asking prices for 4-bedroom homes in Bryan-College Station TX data set, which will be used throughout this tutorial. The data set was collected in order to compare four-bedroom homes listed for sale in the two adjoining cities of Bryan, Texas, and College Station, Texas. Using a real estate web site, fifteen homes were randomly selected from four-bedroom homes listed for sale in Bryan, Texas, and fifteen homes were randomly selected from four-bedroom homes listed for sale in College Station, Texas. The Sqft column contains the square footage for each home, and the Location column lists the city where the home is located.
Creating boxplots with a grouping column
Boxplots can be used to compare the square footages of the four-bedroom homes listed for sale in Bryan to those listed for sale in College Station. To construct the boxplots, choose the Graph > Boxplot menu option. In this case, both sets of values to be plotted are in Sqft and the values are to be grouped based on the value of the Location column. With this idea in mind, select the Sqft column and then specify Location under Group by. Click Compute! to view the resulting boxplots shown below. By default, StatCrunch constructs boxplots based on the 5-number summary of each set of data. Each boxplot displays the minimum value, first quartile, median, third quartile and maximum value in each group.
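For readers who want to check the numbers behind a grouped boxplot outside StatCrunch, the 5-number summary for each group can be sketched in Python. This is only an illustration: the square footages below are made-up placeholder values, not the tutorial's data set, the function names are hypothetical, and the "inclusive" quartile method is an assumption that may not match the quartile rule StatCrunch uses.

```python
from statistics import quantiles

# Hypothetical (Location, Sqft) rows -- NOT the tutorial's actual data.
homes = [
    ("Bryan", 1800), ("Bryan", 2000), ("Bryan", 2200),
    ("Bryan", 2400), ("Bryan", 2600),
    ("College Station", 2100), ("College Station", 2300),
    ("College Station", 2500), ("College Station", 2700),
    ("College Station", 2900),
]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers.

    Quartiles use the 'inclusive' method from the statistics module;
    this is an assumption and other software may use a different rule.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))


def summarize_by_group(rows):
    """Group the Sqft values by Location, then summarize each group."""
    groups = {}
    for location, sqft in rows:
        groups.setdefault(location, []).append(sqft)
    return {loc: five_number_summary(vals) for loc, vals in groups.items()}


if __name__ == "__main__":
    for location, summary in summarize_by_group(homes).items():
        print(location, summary)
```

Each tuple corresponds to the five values a default boxplot displays for that group, in the same way that selecting Sqft and grouping by Location produces one boxplot per city.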
Identifying outliers
Boxplots are sometimes used as a tool to display outliers in a set of data values. In such cases, the lower extreme of the boxplot is defined as the smallest data value above the lower fence (1.5 × IQR below the first quartile), and the upper extreme is defined as the largest data value below the upper fence (1.5 × IQR above the third quartile). Any value outside the fences is plotted as a separate point and flagged as a potential outlier. StatCrunch allows for the construction of this type of modified boxplot. For example, in the window containing the resulting boxplots above, choose Options > Edit to reopen the dialog window. Under Other options, check the box next to the Use fences to identify outliers option and click Compute!. The modified boxplots shown below indicate a potential outlier among the homes listed for sale in Bryan. The corresponding home in the data set can be identified by clicking and dragging the mouse around this point in the plot. The corresponding row in the data table will then be highlighted as shown below. This highlighting can be cleared using the Clear button in the row selection navigation tool that appears in the lower left-hand corner.
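The fence calculation behind a modified boxplot can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the sample values are invented, the function name is hypothetical, the quartiles use the "inclusive" method (which may differ from StatCrunch's rule), and a value lying exactly on a fence is treated as inside it.

```python
from statistics import quantiles


def modified_boxplot_stats(values):
    """Compute fences, whisker extremes, and potential outliers.

    Fences sit 1.5 x IQR beyond the first and third quartiles.
    The whisker extremes are the most extreme data values that
    still lie within the fences.
    """
    data = sorted(values)
    # Assumption: 'inclusive' quartiles; other tools may differ.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr

    # Assumption: a value exactly on a fence counts as inside.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]

    return {
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "lower_extreme": inside[0],
        "upper_extreme": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Hypothetical square footages with one unusually large home.
    sample = [1800, 2000, 2200, 2400, 4500]
    print(modified_boxplot_stats(sample))
```

In this made-up sample the quartiles are 2000 and 2400, so the fences fall at 1400 and 3000. The value 4500 lies above the upper fence and would be drawn as a separate point, while the upper whisker stops at 2400.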
Changing orientation
StatCrunch also allows for the boxplots to be drawn horizontally as opposed to vertically. For this example, choose Options > Edit to reopen the boxplot dialog window. Under Other options, check the box next to the Draw boxes horizontally option and click Compute!. The new boxplots shown below then have the horizontal orientation with a numeric scale on the x-axis.
