Creating split and stacked bar plots

This tutorial covers the steps for creating split and stacked bar plots in StatCrunch. To begin, load the Two Categorical Variables data set, which will be used throughout this tutorial. This toy data set contains only two columns of data. The var1 column contains 10 values, with the value b in the first four rows and the value a in the last six rows. The var2 column contains six c values and four d values. Split and stacked bar plots can be used to summarize the association between the data in these two columns. In this case, the six a values are paired with four c values and two d values, and the four b values are paired with two c values and two d values. This data set is in raw form, meaning that it is not summarized with values in one column and counts/frequencies in another column. To construct split and stacked bar plots with data in summary form, see the StatCrunch Chart Column(s) procedure.
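
For readers who want to check the pairings outside of StatCrunch, the following is a minimal Python sketch of the same cross-classification. It is not part of StatCrunch. The tutorial states the pairing counts but not the row order of var2, so the order used below is an assumption chosen only to reproduce those counts.

```python
from collections import Counter

# Assumed reconstruction of the toy data set. The var1 order follows the
# tutorial (b in the first four rows, a in the last six). The var2 order
# is hypothetical; only the pairing counts come from the tutorial.
var1 = ["b"] * 4 + ["a"] * 6
var2 = ["c", "c", "d", "d"] + ["c", "c", "c", "c", "d", "d"]

# Count how often each (var1, var2) pairing occurs in the raw data.
pair_counts = Counter(zip(var1, var2))

for (v1, v2), n in sorted(pair_counts.items()):
    print(v1, v2, n)
```

Each count corresponds to the height of one bar in the split bar plot described in the next section.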

Creating a split bar plot

To create a split bar plot (also called a side-by-side bar plot) of the data in the var1 column cross-classified with the data in the var2 column, choose the Graph > Bar Plot > with data menu option. Select the var1 column, choose var2 as the Group by column, and click Compute!. The resulting bar plot below shows each of the unique values of var1 on the x-axis with a pair of blue and red bars above it. The heights of these bars correspond to the frequencies of the pairings of each var1 value with each of the values from the var2 column. As the legend of the plot indicates, the frequencies of the pairings with c values are shown in blue, and the frequencies of the pairings with d values are shown in red.

Creating a stacked bar plot

The bars in the above bar plot can be stacked rather than split by adjusting the Grouping options. In the window containing the resulting bar plot above, choose Options > Edit to reopen the bar plot dialog window. Change the Grouping options to Stack bars and click Compute!. The resulting bar plot below now has the red bar stacked on top of the corresponding blue bar for each unique value of the var1 column. This type of bar plot is useful when the total frequency (or percent) of each unique value of the selected column is to be emphasized in addition to the cross-classification. The remaining option listed under the Grouping options will produce a separate bar plot of each column selected for each unique value of the Group by column.
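
The total height of each stacked bar is the sum of its segments. The short Python sketch below illustrates that arithmetic using the pairing frequencies given at the start of this tutorial. It is an illustration only and does not reflect how StatCrunch computes the plot internally.

```python
from collections import Counter

# Pairing frequencies as stated in the tutorial: (var1, var2) -> count.
pair_counts = {("a", "c"): 4, ("a", "d"): 2, ("b", "c"): 2, ("b", "d"): 2}

# The height of each stacked bar is the sum of its c and d segments.
stack_heights = Counter()
for (v1, _v2), n in pair_counts.items():
    stack_heights[v1] += n

for v1 in sorted(stack_heights):
    print(v1, stack_heights[v1])
```

The totals match the six a values and four b values in the var1 column.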

Creating a plot with relative frequency or percent on the y-axis

StatCrunch can plot statistics other than the frequency of each pairing on the y-axis. A number of options, such as relative frequency and percent, are available under the Type option. In the window containing the resulting bar plot above, choose Options > Edit to reopen the bar plot dialog window. Change the Type option to Percent and click Compute!. The resulting bar plot below now shows the percent of the total of 10 pairs falling into each cross-classification. As an example, 40 percent of the data fell into the a-c pairing. Note that the heights of all the bars in the plot sum to the expected total of 100%.
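
The percentages can be checked by hand by dividing each pairing frequency by the total of 10 pairs. The following Python sketch shows that calculation under the pairing frequencies given at the start of this tutorial. The variable names are illustrative and are not StatCrunch terms.

```python
# Pairing frequencies as stated in the tutorial: (var1, var2) -> count.
pair_counts = {("a", "c"): 4, ("a", "d"): 2, ("b", "c"): 2, ("b", "d"): 2}

# Total number of pairs in the data set.
total = sum(pair_counts.values())

# Percent of the overall total for each pairing.
percent_of_total = {pair: 100 * n / total for pair, n in pair_counts.items()}

for (v1, v2), pct in sorted(percent_of_total.items()):
    print(f"{v1}-{v2}: {pct:.1f}%")
```

The a-c pairing works out to 40 percent, and the four percentages add up to 100.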

Plotting relative to category totals

StatCrunch can also normalize the values plotted relative to the total within a specific category, which is defined as a unique value on the x-axis. In this example, consider a plot that shows the percentage breakdown of c and d values within each of the a and b categories. In the window containing the resulting bar plot above, choose Options > Edit to reopen the bar plot dialog window. Select Split bars under the Grouping options, change the Type option to Percent (within category) and click Compute!. The resulting bar plot below shows that two-thirds (66.7%) of the a values are paired with c values while one-third (33.3%) are paired with d values. The plot also shows a 50-50 split between c and d pairings for b values. Note that stacking the bars in the plot below would result in a less informative plot, because the combined bar heights would reach 100% for both the a and b categories.
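
The within-category percentages divide each pairing frequency by the total for its x-axis category rather than by the overall total. The Python sketch below works through that arithmetic under the pairing frequencies given at the start of this tutorial. It is an illustration only, with hypothetical variable names.

```python
from collections import Counter

# Pairing frequencies as stated in the tutorial: (var1, var2) -> count.
pair_counts = {("a", "c"): 4, ("a", "d"): 2, ("b", "c"): 2, ("b", "d"): 2}

# Total count within each x-axis category (each unique var1 value).
category_totals = Counter()
for (v1, _v2), n in pair_counts.items():
    category_totals[v1] += n

# Percent of each pairing relative to its own category total.
percent_within = {
    (v1, v2): 100 * n / category_totals[v1]
    for (v1, v2), n in pair_counts.items()
}

for (v1, v2), pct in sorted(percent_within.items()):
    print(f"{v1}-{v2}: {pct:.1f}%")
```

Within each category the percentages add up to 100, which is why stacking the bars would give two bars of equal height.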
