Owner: websterwest
Created: Jul 13, 2007
Salary report

The purpose of this example report is to illustrate what is commonly called Simpson's paradox. I have used it in my courses at Texas A&M to illustrate some of the more subtle aspects of data analysis. The data set attached below, salary.txt, contains information from a fictitious salary survey conducted at a university. The survey collected information on Gender (Male or Female), Major (Education or Engineering) and Salary (in dollars) from 2232 recent graduates. There are a number of interesting questions one could ask about this data, but in this report I will focus on the following: Does the survey data show any evidence of gender-based salary discrimination?

As a first approach to answering this question, compare the boxplots of salaries for each gender shown in Result 1. Result 1 was constructed using the Graphics > Boxplot option in StatCrunch with the salary column grouped by gender. A boxplot displays the 5-number summary (minimum, first quartile, median, third quartile, maximum) for a data set. Clearly, the salaries for males tend to be much higher than those of females.
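For readers who want to see what goes into a boxplot, the following is a minimal Python sketch of a 5-number summary. The salaries in `sample` are hypothetical and are not taken from salary.txt, and the function name is illustrative. Quartile conventions differ between packages, so the Q1 and Q3 values here need not match what StatCrunch would report for the same data.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    ordered = sorted(values)
    # method="inclusive" interpolates between the observed values;
    # other quartile conventions can give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical salaries for illustration only (not from salary.txt).
sample = [31000, 33500, 35000, 36000, 36500, 37000, 38000, 52000, 64000]
summary = five_number_summary(sample)
print(summary)
```

The box in a boxplot spans Q1 to Q3, and the line inside the box marks the median.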

Result 1: Boxplot of salary by gender

The summary statistics of the salaries for each gender are shown in Result 2. Result 2 was constructed using the Stat > Summary Stats > Columns option in StatCrunch with the salary column grouped by gender. Comparing the medians in Result 2, the typical male earns $54,471 while the typical female earns only $36,369. These values are represented by the lines within each box in Result 1.

Result 2: Summary statistics for salary by gender
Summary statistics for Salary:
Group by: Gender
Gender  n     Mean       Variance     Std. Dev.  Std. Err.  Median  Range  Min    Max    Q1       Q3
Female  1088  41107.555  9.7633984E7  9880.991   299.56152  36369   31209  33070  64279  35514    37634.5
Male    1144  50588.812  8.6189224E7  9283.815   274.48175  54471   32506  29027  61533  52079.5  56060
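Several columns in Result 2 are determined by the others: the variance is the square of the standard deviation, the standard error is the standard deviation divided by the square root of n, and the range is the maximum minus the minimum. The short Python sketch below recomputes these from the values copied out of Result 2. The dictionary layout and function name are illustrative choices, not part of StatCrunch.

```python
import math

# Selected columns copied from Result 2.
result2 = {
    "Female": {"n": 1088, "variance": 9.7633984e7, "sd": 9880.991,
               "se": 299.56152, "range": 31209, "min": 33070, "max": 64279},
    "Male": {"n": 1144, "variance": 8.6189224e7, "sd": 9283.815,
             "se": 274.48175, "range": 32506, "min": 29027, "max": 61533},
}

def derived(row):
    """Recompute variance, standard error and range from the other columns."""
    return {
        "variance": row["sd"] ** 2,
        "se": row["sd"] / math.sqrt(row["n"]),
        "range": row["max"] - row["min"],
    }

for gender, row in result2.items():
    for name, value in derived(row).items():
        print(gender, name, round(value, 3), "reported:", row[name])
```

The recomputed values agree with the reported ones up to the rounding of the displayed standard deviation.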

One might initially think this is strong evidence that women are suffering from salary discrimination, but let's look a bit further. To understand the salary differences between the genders within the education field, consider the boxplots shown in Result 3. Result 3 was constructed in the same manner as Result 1 but with the Where field set to Major = Education. Interestingly, we see that females tend to earn more than males in the education field.

Result 3: Boxplot of salary by gender for education majors

The summary statistics in Result 4 bear this out even further with the median salary within education for females being $36,009 and the median salary for males within education being $32,001.50. The first quartile of female salaries (the left edge of the female box in Result 3) is even above the third quartile of male salaries (the right edge of the male box in Result 3) in the education field.

Result 4: Summary statistics for salary by gender for education majors
Summary statistics for Salary:
Where: Major = Education
Group by: Gender
Gender  n    Mean       Variance   Std. Dev.  Std. Err.  Median   Range  Min    Max    Q1     Q3
Female  856  36008.645  1004212.4  1002.104   34.25121   36009    6341   33070  39411  35324  36709.5
Male    220  31970.781  1238722.4  1112.979   75.03703   32001.5  6581   29027  35608  31180  32707.5

Likewise, to understand the salary differences between the genders within the engineering field, consider the boxplots shown in Result 5. Result 5 was constructed in the same manner as Result 1 but with the Where field set to Major = Engineering. Once again, we see that females tend to earn more than males in the engineering field.

Result 5: Boxplot of salary by gender for engineering majors

The summary statistics in Result 6 bear this out even further with the median salary within engineering for females being $59,994 and the median salary for males within engineering being $55,019. As was the case with the education field, the first quartile of female salaries (the left edge of the female box in Result 5) is even above the third quartile of male salaries (the right edge of the male box in Result 5) in the engineering field.

Result 6: Summary statistics for salary by gender for engineering majors
Summary statistics for Salary:
Where: Major = Engineering
Group by: Gender
Gender  n    Mean       Variance   Std. Dev.  Std. Err.  Median  Range  Min    Max    Q1       Q3
Female  232  59920.78   3900453.5  1974.9565  129.66225  59994   10681  53598  64279  58679.5  61279
Male    924  55021.676  4146587    2036.317   66.989914  55019   13514  48019  61533  53657    56454

At first glance, it might appear strange that females tend to earn less than males overall but more than males within each major. The reason is actually quite simple. Consider the two-way cross classification shown in Result 7. Result 7 was constructed using the Stat > Tables > Contingency > with data option in StatCrunch. In this result, we see roughly the same number for each gender in the data (1088 females, 1144 males), but the large majority of females (856) were in the lower paying field of education while the large majority of males (924) were in the higher paying field of engineering. Because each gender's overall salary distribution is dominated by its most common major, the unequal distribution of the genders across majors causes this paradox.
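The arithmetic behind the reversal can be sketched with the group sizes and mean salaries reported in Results 4 and 6. Each gender's overall mean is a weighted average of its two within-major means, with weights given by how many graduates of that gender chose each major. The following Python sketch is illustrative only; the variable and function names are assumptions, and it works with means rather than the medians discussed above.

```python
# (n, mean salary) for each gender within each major, copied from Results 4 and 6.
groups = {
    "Female": {"Education": (856, 36008.645), "Engineering": (232, 59920.78)},
    "Male": {"Education": (220, 31970.781), "Engineering": (924, 55021.676)},
}

def overall_mean(by_major):
    """Weighted mean of the per-major means, weighted by group size."""
    total_n = sum(n for n, _ in by_major.values())
    return sum(n * mean for n, mean in by_major.values()) / total_n

def education_share(by_major):
    """Fraction of a gender's graduates who majored in education."""
    total_n = sum(n for n, _ in by_major.values())
    return by_major["Education"][0] / total_n

for gender, by_major in groups.items():
    print(gender, round(overall_mean(by_major), 2), round(education_share(by_major), 3))
```

The weighted means come out at about 41107.56 for females and 50588.81 for males, matching the overall means in Result 2 up to rounding. About 79% of females but only about 19% of males are in education, so the female average is pulled toward the lower education salaries even though females have the higher mean within each major.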

Result 7: Two-way table for gender and major
Contingency table results:
Rows: Gender
Columns: Major
        Education  Engineering  Total
Female  856        232          1088
Male    220        924          1144
Total   1076       1156         2232

Statistic   DF  Value     P-value
Chi-square  1   789.2597  <0.0001
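The chi-square value in Result 7 can be reproduced from the four cell counts. The sketch below assumes the reported statistic is the Pearson chi-square without a continuity correction, which is consistent with the value shown; the function name is illustrative. The p-value line uses a shortcut that holds only for 1 degree of freedom.

```python
import math

# Observed counts from Result 7: rows are gender, columns are major.
observed = [
    [856, 232],  # Female: Education, Engineering
    [220, 924],  # Male: Education, Engineering
]

def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if gender and major were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

chi2, dof = pearson_chi_square(observed)
# With 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 4), dof, p_value < 0.0001)
```

A statistic this large indicates that gender and major are strongly associated in this data set, which is the imbalance that drives the paradox.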

Data set 1. Fictitious Salary Data for Recent Graduates

