Super Heroes
This data set originally came from the following website:
https://www.kaggle.com/claudiodavi/superheroset.
It contains various physical characteristics for over 700 fictional comic book super heroes.  statcrunch_featured  Aug 1, 2018  47KB  4059 
Median sales price vs Median rent for housing in 50 cities
These data were obtained from Zillow and include the median sales price and the median rent for a home in each of 50 cities as of July 2018, taken from https://www.zillow.com/research/localmarketreports/
This will be an excellent data set to use to introduce correlation and regression.
Can we predict the median rent in a city based on the median price of homes sold in the city? It is also a good example to discuss the effect of outliers.  rosenthi  Sep 11, 2018  1KB  500 
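The regression question above can be sketched in a few lines of Python. This is only an illustration: the numbers are invented and are not the Zillow values, and the helper name `fit_line` is hypothetical. It fits rent = a + b * price by least squares, then refits after adding one extreme city to show how a single outlier can change the slope.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up illustrative values, NOT taken from the Zillow data:
# median sale price (thousands of dollars) and median monthly rent (dollars).
prices = [150, 200, 250, 300, 350]
rents = [1100, 1300, 1500, 1700, 1900]

a, b = fit_line(prices, rents)
print(f"without outlier: rent = {a:.0f} + {b:.2f} * price")

# Add one hypothetical city with very high prices but only moderately higher rent.
a_out, b_out = fit_line(prices + [1200], rents + [3000])
print(f"with outlier:    rent = {a_out:.0f} + {b_out:.2f} * price")
```

Comparing the two printed lines shows how much one unusual city pulls the fitted slope, which is the outlier discussion the description suggests.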
Oscar nominations
The data set contains the year, category, nominee and whether they won for all Academy Award nominations from 1927 through 2006. A good exercise is to use StatCrunch to compute the number of nominations for each actress in the ACTRESS (all caps) category along with the first and last year they were nominated.  websterwest  Mar 10, 2008  450KB  2777 
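For readers who want to try the same exercise outside StatCrunch, here is a minimal Python sketch. The row layout (year, category, nominee, won), the placeholder nominee names, and the function name `summarize` are all assumptions made for illustration; they are not taken from the file.

```python
from collections import defaultdict

# Hypothetical rows in (year, category, nominee, won) form.
# Placeholder names only, not real nominations.
rows = [
    (1950, "ACTRESS", "Nominee A", False),
    (1950, "ACTOR", "Nominee C", True),
    (1953, "ACTRESS", "Nominee B", True),
    (1957, "ACTRESS", "Nominee A", True),
    (1962, "ACTRESS", "Nominee A", False),
]

def summarize(rows, category="ACTRESS"):
    """Map each nominee in `category` to (nomination count, first year, last year)."""
    years = defaultdict(list)
    for year, cat, nominee, _won in rows:
        if cat == category:  # exact, case-sensitive match on the all-caps label
            years[nominee].append(year)
    return {name: (len(ys), min(ys), max(ys)) for name, ys in years.items()}

summary = summarize(rows)
for name, (count, first, last) in sorted(summary.items()):
    print(name, count, first, last)
```

The case-sensitive comparison matters because the description singles out the all-caps ACTRESS label.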
Major League Players Elected to Hall of Fame as Players
Includes 2018 BBWAA-elected inductees Chipper Jones, Vladimir Guerrero, Jim Thome, and Trevor Hoffman and Veterans Committee-elected inductees Jack Morris and Alan Trammell. 31 variables for each player. Team=primary team; BBWAA=Baseball Writers Association of America; Bat: R=right, L=left, B=both;
WAR=Wins Above Replacement: number of wins the player added to the team above what an "average" replacement player would add.
CS=caught stealing.
OPS=On-base Plus Slugging; as a rule of thumb, a "good" OPS is a value that, when divided by 3, results in a value that would be considered a "good" batting average.
Other variables are hopefully self-explanatory.  treiland  Feb 4, 2018  36KB  4682 
Advanced NBA Statistics for 2013-2014 Season
N = 342; only players with at least 40 games played are included.
These are advanced metrics which attempt to evaluate, relatively speaking, how good an NBA basketball player was during the 2013-2014 season (in which Kevin Durant won the MVP Award).
Variables:
Position - What position did they play?
Age - How old was the player as of February 1, 2014?
Team - Obvious.
PER - Player Efficiency Rating; a measure of per-minute production standardized such that the league average is 15.
TS - True Shooting Percentage; a measure of shooting efficiency that takes into account 2-point field goals, 3-point field goals, and free throws.
ORB - Offensive Rebound Percentage; an estimate of the percentage of available offensive rebounds a player grabbed while he was on the floor.
DRB - Defensive Rebound Percentage; an estimate of the percentage of available defensive rebounds a player grabbed while he was on the floor.
TRB - Total Rebound Percentage; an estimate of the percentage of available rebounds a player grabbed while he was on the floor.
AST - Assist Percentage; an estimate of the percentage of teammate field goals a player assisted while he was on the floor.
STL - Steal Percentage; an estimate of the percentage of opponent possessions that end with a steal by the player while he was on the floor.
BLK - Block Percentage; an estimate of the percentage of opponent two-point field goal attempts blocked by the player while he was on the floor.
TOV - Turnover Percentage; an estimate of turnovers per 100 plays.
USG - Usage Percentage; an estimate of the percentage of team plays used by a player while he was on the floor.
ORtg - Offensive Rating; an estimate of points produced (players) or scored (teams) per 100 possessions.
DRtg - Defensive Rating; an estimate of points allowed per 100 possessions.
OWS - Offensive Win Shares; an estimate of the number of wins contributed by a player due to his offense.
DWS - Defensive Win Shares; an estimate of the number of wins contributed by a player due to his defense.
WS - Win Shares; an estimate of the number of wins contributed by a player.
 daniel.inghram  May 22, 2014  33KB  3639 
Survey: Is college worth it?
This data set was collected via a StatCrunch survey. Respondents were asked if they think college is a good financial decision, if they currently attend or have attended college, their gender and their age.
Check out the original survey here: http://www.statcrunch.com/5.0/survey.php?surveyid=3007&code=OYAVB&groupid=256
Feel free to copy this survey and use for your own data collection.  statcrunchhelp  Apr 10, 2014  22KB  1804 
Happiness Data from GSS.xls
These data come from the 2008 General Social Survey. A subset of 190 respondents was selected at random from the full data set.
Children = number of children.
Education is highest year of education (e.g., 12 = High School; 16 = Bachelors, etc.).
Happy: 1 = Not too happy, 2 = Pretty Happy, 3 = Very Happy.
Health: 1 = Poor, 2 = Fair, 3 = Good, 4 = Excellent.
Income: 1 = Under $1000; 2 = $1000-2999; 3 = $3000-3999; 4 = $4000-4999; 5 = $5000-5999; 6 = $6000-6999; 7 = $7000-7999; 8 = $8000-9999; 9 = $10000-12499; 10 = $12500-14999; 11 = $15000-17499; 12 = $17500-19999; 13 = $20000-22499; 14 = $22500-24999; 15 = $25000-29999; 16 = $30000-34999; 17 = $35000-39999; 18 = $40000-49999; 19 = $50000-59999; 20 = $60000-74999; 21 = $75000-$89999; 22 = $90000-$109999; 23 = $110000-$129999; 24 = $130000-$149999; 25 = $150000+.
Married: 0 = No, 1 = Yes.
Religious: 1 = Not religious, 2 = Slightly religious, 3 = Moderately religious, 4 = Very religious.  jacobgsimons  Apr 20, 2010  5KB  3732 
Diamond Ring Prices.xls
The source of the data is a full-page advertisement placed in the Straits Times newspaper issue of February 29, 1992, by a Singapore-based retailer of diamond jewelry.
The advertisement contained pictures of diamond rings and listed their prices, diamond content, and gold purity. Only 20K ladies' rings, each mounted with a single diamond stone, were considered for this study. 20K rings are made with gold of 20 carat purity. (Pure gold is rated as 24K.)
There were 48 such rings of varying designs. The diamond stones weighed between 0.12 and 0.35 carats (a one-carat diamond stone weighs 0.2 gram), and the rings were priced between $223 and $1086. The jewelry store adopted a fixed-price policy.
How Is Jewelry Priced?
In Singapore, the pricing of gold jewelry is simple. The price equals the current market value of the gold content (i.e., weight times the going rate per gram of gold) plus a craftsmanship fee.
However, the pricing of other jewelry like diamond rings is more complicated because they are not as standardized as gold jewelry. The price of diamond jewelry depends on the four C's: caratage, cut, colour, and clarity of the diamond stone. A good cut gives a diamond more sparkle. Colourless diamonds are the most prized. A flawless diamond has maximum clarity because the passage of light is unimpeded through the stone. Cut, colour, and clarity are subjective factors and are very hard for the layman to gauge.
 craig_slinkman  Apr 22, 2010  586B  2452 
Alcohol data from adults
My group and I designed a survey of adults who drink to find out why they drink, their age, their education level, and how many drinks they have per day. The data were gathered individually and combined in StatCrunch by one member of the group. The survey shows the number of adults who drink and what motivates them to drink. Our survey questions are below.
1. Do you Drink Alcohol? Circle one: Y N
2. What is your age?____years
3. What is your gender? Circle one: Male Female
4. Are you having an increasing number of
A. Financial problems
B. Family problems
C. Work problems
D. Health problems
E. Financial and family problems
F. Financial, health and family problems
G. Family and work problems
H. Financial, Family, and work problems
I. none of the above
Circle one.
5. How many drinks do you have a week?_____ drinks
6. Education: What is the highest degree or level of school you have completed? If currently enrolled, mark the previous grade or highest degree received.
A. No schooling completed
B. Nursery school to 8th grade
C. 9th, 10th or 11th grade
D. 12th grade, no diploma
E. High school graduate - high school diploma or the equivalent (for example: GED)
F. Some college credit, but less than 1 year
G. 1 or more years of college, no degree
H. Associate degree (for example: AA, AS)
I. Bachelor's degree (for example: BA, AB, BS)
J. Master's degree (for example: MA, MS, MEng, MEd, MSW, MBA)
K. Professional degree (for example: MD, DDS, DVM, LLB, JD)
Circle one.
----- Original Message -----
Sent on: Tuesday, May 22, 2012 11:46 PM
Hi. It looks good.
Change:
2. What is your gender? Circle one: Male Female Other
to 2. What is your gender? Circle one: Male Female
Since I do not think you will get someone answering as Other.
In #3, I forgot another option: 3. Are you having an increasing number of
A. Financial problems
B. family problems
C. Work problems
D. Financial and family problems
E. financial and family problems
F. Family and work problems
G. Financial, Family, and work problems
H. none of the above
Circle one.
 rosesege  Jun 21, 2012  9KB  4637 
Credit Scores
This file has two variables:
Performing
A value of 0 indicates that the loan is not a performing loan and that the borrower is either behind on payments or in default.
A value of 1 indicates that the borrower is in good standing.
Credit_Score
The borrower's credit score at the time the loan was given.
 craig_slinkman  Apr 2, 2010  5KB  1007 
diamonds.csv
This is a very large data set covering various attributes of over 50,000 diamonds, including price, cut, color, clarity, etc.
price: price in US dollars ($326–$18,823)
carat: weight of the diamond (0.2–5.01)
cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)
color: diamond colour, from J (worst) to D (best)
clarity: a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x: length in mm (0–10.74)
y: width in mm (0–58.9)
z: depth in mm (0–31.8)
depth: total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)
table: width of top of diamond relative to widest point (43–95)  hbarker2  Feb 19, 2016  3MB  1884 
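The depth formula listed above can be checked with a short Python sketch. Two things here are assumptions rather than statements from the description: the factor of 100 (inferred from the listed 43–79 range, which only makes sense as a percentage) and the guard for x + y = 0 (added because the listed ranges for x and y start at 0). The function name `depth_percentage` is hypothetical.

```python
def depth_percentage(x, y, z):
    """Total depth percentage: z divided by the mean of x and y, times 100.

    Returns None when x + y is zero, since the listed ranges for x and y start at 0.
    """
    if x + y == 0:
        return None
    return 100 * 2 * z / (x + y)

# Hypothetical dimensions in mm, not a row from the file.
example = depth_percentage(4.0, 4.0, 2.48)
print(round(example, 1))
```

Comparing this computed value with the recorded depth column is one way to spot rows whose dimensions were entered inconsistently.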
Baseball data for correlation and regression
This table shows the total number of runs scored, at bats, hits, etc. for each of the 30 MLB teams for the 2009-2011 seasons.
Correlations and linear regression models can be calculated between the different numeric variables. A good exercise is to see which variables correlate most strongly with runs_scored.
As emphasized in the movie Moneyball, some of the classic metrics such as batting_avg are not as good as the newer metrics like OBP (on base percentage), SLG (slugging percentage), or OPS (on base plus slugging).
A guide to a few of the variables that may not be self-explanatory:
Runs_Scored: The total of all runs (points) the baseball team scored by the end of the season.
Batting_avg: This is equal to the number of hits divided by at_bats
OBP: On Base Percentage. Similar to batting average, except that it takes into account walks and hit-by-pitch. Some players who don't have high batting averages manage to get walked quite frequently.
SLG: Slugging. This weights singles as 1 point, doubles as 2 points, triples as 3, and home runs as 4, and divides the total by the number of at bats.
OPS: On Base Plus Slugging. This is just OBP added to SLG.  mileschen  Apr 17, 2012  6KB  3741 
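The metric definitions above can be written out as a small Python sketch. The team totals are invented for illustration and do not come from the table, and the function and parameter names are hypothetical. The OBP denominator uses the conventional form that also counts sacrifice flies, which the description does not mention; that detail is an assumption, and `sac_flies` defaults to 0 so it can be ignored.

```python
def batting_metrics(at_bats, singles, doubles, triples, home_runs,
                    walks, hit_by_pitch, sac_flies=0):
    """Return batting_avg, OBP, SLG and OPS computed from counting totals."""
    hits = singles + doubles + triples + home_runs
    batting_avg = hits / at_bats
    # OBP also credits walks and hit-by-pitch; sac_flies is an assumed extra term.
    obp = (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)
    # SLG weights each hit by the number of bases reached.
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    slg = total_bases / at_bats
    return {"batting_avg": batting_avg, "OBP": obp, "SLG": slg, "OPS": obp + slg}

# Made-up season totals for one hypothetical team.
metrics = batting_metrics(at_bats=5500, singles=1000, doubles=300, triples=30,
                          home_runs=170, walks=500, hit_by_pitch=50, sac_flies=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With the real table, each of these columns could then be correlated against runs_scored to see which one tracks scoring most closely.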
Direct Loans
62 colleges were randomly selected (Cecil College thrown in for good measure). Data collected from U.S. government, 4th quarter 2010. We have number of subsidized loans and the total loan value for each school.  cecil_college  Jun 4, 2014  3KB  3992 
Responses to Is college worth it?
This data set was collected via a StatCrunch survey. Respondents were asked if they think college is a good financial decision, if they currently attend or have attended college, their gender and their age.
Check out the original survey here: http://www.statcrunch.com/5.0/survey.php?surveyid=3007&code=OYAVB&groupid=256
Feel free to copy this survey and use for your own data collection.
 scsurvey  Apr 24, 2012  22KB  792 

