Data sets shared by StatCrunch members
Showing 1 to 15 of 942 data sets matching NUMBER
Data Set/Description Owner Last edited Size Views
Christmas tree sales: Real vs. Fake 2004-2016
This data set contains the number of real and fake Christmas trees sold in the US between 2004 and 2016.
Owner: statcrunch_featured | Last edited: Nov 13, 2018 | Size: 398B | Views: 2824
Movie Budgets and Box Office Earnings (Updated Spring 2018)
This data all comes from the following website that tracks the financial performance of movies:
http://www.the-numbers.com/movie/budgets/all

The “Budget”, “Domestic Gross”, and “Worldwide Gross” columns each are in millions of dollars.

Owner: statcrunch_featured | Last edited: Oct 4, 2018 | Size: 270KB | Views: 12860
New York City Leading Causes of Death (2007-2014)
This data set breaks down the leading causes of death in New York City from 2007 to 2014. Included is the number of deaths (Deaths) for each combination of Sex and Race Ethnicity. The Death Rate represents the rate within that Sex/Race Ethnicity category. The Age Adjusted Death Rate adjusts the Death Rate by the ages of those who died.
Owner: statcrunch_featured | Last edited: Aug 1, 2018 | Size: 96KB | Views: 3908
Flight Delay Data For July 2014
This data set contains information on the flight delays for each airline at each U.S. airport in July of 2014. The columns include the carrier, airport city/state, airport code, airport name, total number of flights (Flights), the number of delayed flights (Delayed), the number of cancelled flights (Cancelled), the number of diverted flights (Diverted), the number of on-time flights (On-time), and the on-time percentage (On-time Percentage).
Owner: statcrunch_featured | Last edited: Jan 2, 2018 | Size: 88KB | Views: 6004
All MLB Salaries (1985-2015)
This data has all MLB player salaries from 1985 to 2015, including the team played for, the city, and a unique ID for each player. In total, this includes 25,575 salaries for 4,963 different baseball players.
The player ID is the first 5 letters from the last name, followed by the first two letters from the first name, followed by a number in case of duplicate names. For example, bondsba01 stands for Barry Bonds with "01" because he's the first with the "bondsba" name ID.
Owner: statcrunch_featured | Last edited: Jun 27, 2017 | Size: 1MB | Views: 4975
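The player ID scheme described for the MLB salaries data set can be sketched in a few lines of Python. This is only an illustration of the stated rule (first 5 letters of the last name, first two letters of the first name, then a two-digit number); the function name `player_id`, the `sequence` parameter, and the choice to lowercase and drop non-letter characters are assumptions, not something the data set description confirms.

```python
def player_id(first_name, last_name, sequence=1):
    """Build an ID like 'bondsba01' from a player's name.

    Assumption: names are lowercased and non-letter characters
    (spaces, apostrophes, hyphens) are dropped before slicing.
    """
    last = "".join(ch for ch in last_name.lower() if ch.isalpha())[:5]
    first = "".join(ch for ch in first_name.lower() if ch.isalpha())[:2]
    # The trailing number separates players whose name parts collide.
    return f"{last}{first}{sequence:02d}"


print(player_id("Barry", "Bonds"))  # matches the example in the description
```

A second player whose name produced the same "bondsba" prefix would get `sequence=2` under this sketch.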
Cereal Brands
Data on several variables for different brands of cereal.
Number of cases: 77
Variable names:
Name: name of cereal
mfr: manufacturer of cereal, where A = American Home Food Products; G = General Mills; K = Kelloggs; N = Nabisco; P = Post; Q = Quaker Oats; R = Ralston Purina
type: cold or hot
calories: calories per serving
protein: grams of protein
fat: grams of fat
sodium: milligrams of sodium
fiber: grams of dietary fiber
carbo: grams of complex carbohydrates
sugars: grams of sugars
potass: milligrams of potassium
vitamins: vitamins and minerals - 0, 25, or 100, indicating the typical percentage of FDA recommended
shelf: display shelf (1, 2, or 3, counting from the floor)
weight: weight in ounces of one serving
cups: number of cups in one serving
rating: a rating of the cereals
Owner: statcrunch_featured | Last edited: Apr 3, 2017 | Size: 4KB | Views: 7229
Major League Players Elected to Hall of Fame as Players
Includes 2019 BBWAA-elected inductees Mariano Rivera, Edgar Martinez, Roy Halladay, and Mike Mussina. 31 variables for each player. Team=primary team; BBWAA=Baseball Writers Association of America; Bat: R=right, L=left, B=both; WAR=Wins Above Replacement: number of wins the player added to the team above what an "average" replacement player would add. CS=caught stealing. OPS=On-base Plus Slugging; as a rule of thumb, a "good" OPS is a value that, when divided by 3, results in a value that would be considered a "good" batting average. Other variables are hopefully self-explanatory.
Owner: treiland | Last edited: Jan 25, 2019 | Size: 37KB | Views: 5592
J Pribe auto-mpg.xlsx
The data set covers 12 years of vehicles and contains 398 individual entries. The data describes, for popular consumer vehicles, the miles per gallon (MPG), the number of engine cylinders, total engine size (displacement), engine horsepower, the vehicle weight, a measure of acceleration (0-60 MPH time), the model year of the vehicle (1970-1982), a coded identifier for the place of origin, and the make and model of the vehicle. MPG, number of cylinders, engine size, horsepower, weight, acceleration time, and model year are all numerical values. The vehicle origin, full name, make, and model are categorical. This data was chosen to meet the assignment requirements, and because cars are cool. *Origin data code: 1=USA, 2=Europe, 3=Japan. The "car name" variable was broken into additional make and model variables to ease analysis, a change from the original data set.
Owner: jpribe | Last edited: Feb 16, 2019 | Size: 32KB | Views: 256
The Unofficial 2014 NFL Player Census - S. Lohse
This data set is a shared data set owned by websterwest and last updated May 2015. It is being used for examples in my classroom. Original Description: This data set contains a number of variables on every NFL player participating in the 2014 season. Most of the variables should be self-explanatory. Salary represents the average annual salary for the player under their existing contract. Exp represents years of experience. Pro Bowler represents the number of years the player was selected for the Pro Bowl. Champ provides the number of championship teams on which the player has played. Heisman represents whether or not the player won the Heisman Trophy in college.
Owner: slohse9395 | Last edited: Feb 11, 2019 | Size: 321KB | Views: 211
NFL Player Data 2016
This file lists the 2,764 NFL players on all team rosters as of July 22, 2016. Information includes jersey number, name, position, age, height (in inches), weight (in lbs), years in the NFL, college they graduated from, NFL team, position grouping (OL, QB, tailback, TE, WR, Front 7, DB, special teams), side of the football (offense, defense, or special teams), and their experience level by years played.
Owner: ppoconno | Last edited: Aug 27, 2018 | Size: 220KB | Views: 2138
Oscar nominations
The data set contains the year, category, nominee, and whether they won for all Academy Award nominations from 1927 through 2006. A good exercise is to use StatCrunch to compute the number of nominations for each actress in the ACTRESS (all caps) category along with the first and last year they were nominated.
Owner: websterwest | Last edited: Mar 10, 2008 | Size: 450KB | Views: 3142
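The exercise suggested for the Oscar nominations data set is meant to be done in StatCrunch, but the same grouping logic can be sketched outside it. The Python below is a minimal illustration only: the field names `year`, `category`, and `nominee`, the function name `summarize_nominations`, and the placeholder rows are all assumptions for demonstration and are not taken from the actual data set.

```python
from collections import defaultdict


def summarize_nominations(rows, category="ACTRESS"):
    """Count nominations per nominee in one category, with first and last year."""
    years_by_nominee = defaultdict(list)
    for row in rows:
        if row["category"] == category:
            years_by_nominee[row["nominee"]].append(row["year"])
    return {
        nominee: {
            "nominations": len(years),
            "first_year": min(years),
            "last_year": max(years),
        }
        for nominee, years in years_by_nominee.items()
    }


# Placeholder rows, invented purely to exercise the function.
rows = [
    {"year": 1950, "category": "ACTRESS", "nominee": "Nominee A"},
    {"year": 1955, "category": "ACTRESS", "nominee": "Nominee A"},
    {"year": 1952, "category": "ACTRESS", "nominee": "Nominee B"},
    {"year": 1952, "category": "ACTOR", "nominee": "Nominee C"},
]

for nominee, summary in summarize_nominations(rows).items():
    print(nominee, summary)
```

Filtering on the exact string "ACTRESS" mirrors the all-caps category mentioned in the description; rows from other categories are ignored.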
The Unofficial 2014 NFL Player Census
This data set contains a number of variables on every NFL player participating in the 2014 season. Most of the variables should be self-explanatory. Salary represents the average annual salary for the player under their existing contract. Exp represents years of experience. Pro Bowler represents the number of years the player was selected for the Pro Bowl. Champ provides the number of championship teams on which the player has played. Heisman represents whether or not the player won the Heisman Trophy in college.
Owner: websterwest | Last edited: May 5, 2015 | Size: 321KB | Views: 1942
Movie Budgets and Box Office Earnings (Updated Fall 2016)
This data all comes from the following website that tracks the financial performance of movies:
http://www.the-numbers.com/movie/budgets/all

The “Budget”, “Domestic Gross”, and “Worldwide Gross” columns each are in millions of dollars.

Owner: ntorno8 | Last edited: Jun 30, 2017 | Size: 266KB | Views: 5914
Times World University Rankings (2011-2016)
This data comes from the annual Times magazine rankings of universities across the world. The webpage for the Times 2016 rankings is listed above in the source.
The formula for the 2016 rankings is as follows:
30% for Teaching Rating
7.5% for International Outlook Rating
30% for Research Rating
30% for Citations Rating
2.5% for Industry Income Rating.
The “Total Score” from 2016 can be recreated using this formula.

Column: Description
World_Rank: University rank for a given year.
University_Name: The name of the university.
Country: Location of the university.
Teaching_Rating: Rating on a 0-100 scale of the quality of teaching at the university. This rating is based on the institution’s reputation for teaching, its student/staff ratio, its ratio of PhDs to undergraduate degrees awarded, and its ratio of institutional income to academic staff.
Inter_Outlook_Rating: Rating on a 0-100 scale of the international makeup of a university. This rating is based on the international student percentage, the international staff percentage, and the percentage of research papers from the university that include at least one international author.
Research_Rating: Rating on a 0-100 scale of the quality of research at the university. This rating is based on the university’s reputation, its ratio of research income to academic staff, and its production of scholarly papers.
Citations_Rating: Rating on a 0-100 scale based on the normalized average number of citations per paper from the university (how often the research from the university is cited by other papers).
Industry_Income_Rating: Rating on a 0-100 scale grading how much companies are willing to invest in the university’s research. The rating is calculated from the research income from businesses per academic staff member.
Total_Score: The final score used to determine the university ranking, based on Teaching_Rating, Inter_Outlook_Rating, Research_Rating, Citations_Rating, and Industry_Income_Rating.
Num_Students: Total number of students in a given year.
Student/Staff_Ratio: Number of students per academic staff member.
%_Inter_Students: Percentage of the student body who come from a foreign country.
%_Female_Students: Percentage of the student body that is female.
Year: Academic year in which the ranking was released. For example, 2016 denotes the 2015-2016 academic year.
Owner: statcrunchhelp | Last edited: Apr 5, 2016 | Size: 254KB | Views: 3906
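The 2016 weighting formula given for the Times rankings data set can be written out as a short Python sketch. The weights and column names come from the description above; the function name `total_score` and the example ratings are hypothetical, and the description does not say whether the published Total Score is rounded, so small differences from the data set's values are possible.

```python
# Weights for the 2016 rankings, as listed in the data set description.
WEIGHTS = {
    "Teaching_Rating": 0.30,
    "Inter_Outlook_Rating": 0.075,
    "Research_Rating": 0.30,
    "Citations_Rating": 0.30,
    "Industry_Income_Rating": 0.025,
}


def total_score(ratings):
    """Weighted sum of the five 0-100 ratings for one university."""
    return sum(weight * ratings[name] for name, weight in WEIGHTS.items())


# Hypothetical ratings, not taken from the data set.
example = {
    "Teaching_Rating": 80.0,
    "Inter_Outlook_Rating": 60.0,
    "Research_Rating": 70.0,
    "Citations_Rating": 90.0,
    "Industry_Income_Rating": 40.0,
}

print(round(total_score(example), 1))
```

Because the five weights sum to 100%, a university rated 100 on every component would score 100 overall under this sketch.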
Responses to Social Media Survey
Respondents provided their most used social media application (Media App), how many minutes they spent on social media per day (Time spent), the number of times they visited social media per day (Visits per day), the number of posts they made per week (Posts per week), their gender (Gender), and their age (Age).
Owner: scsurvey | Last edited: Oct 24, 2017 | Size: 196KB | Views: 3525
