
Data sets shared by StatCrunch members
Showing 1 to 15 of 144 data sets matching bar
Data Set/Description Owner Last edited Size Views
FIFA World Cup Match Results (1930-2014)
This data set records all World Cup Men's soccer matches played between 1930 and 2014. Included is the date of the match, the location, the World Cup Stage (Stage), both teams, the halftime score, the final score, and the attendance for the game.
statcrunch_featured Aug 1, 2018 102KB 1317
FIFA World Cup Mens Players 2018
This data set records information for all 736 players for the 2018 FIFA World Cup. Included for each player is their national team (Team) along with their club team (Club).
statcrunch_featured Aug 1, 2018 63KB 2359
Super Heroes
This data set originally came from the following website: https://www.kaggle.com/claudiodavi/superhero-set. It contains various physical characteristics for over 700 fictional comic book super heroes.
statcrunch_featured Aug 1, 2018 47KB 4059
US Presidential Election History
This dataset tracks the US presidential election results dating back to 1824. Included is the winning candidate, winning party, popular voting totals, margin of victory, and the electoral college totals. Also included is the name and party of the runner-up along with the percentage of all eligible voters that turned out for the election (Voter Turnout Percentage).
statcrunch_featured Feb 20, 2018 5KB 2293
All MLB Salaries (1985-2015)
This data set has all MLB player salaries between 1985 and 2015, including the team played for, the city, and a unique ID for each player. In total, this includes 25,575 salaries for 4,963 different baseball players.
The player ID is the first 5 letters from the last name, followed by the first two letters from the first name, followed by a number in case of duplicate names. For example, bondsba01 stands for Barry Bonds with "01" because he's the first with the "bondsba" name ID.
statcrunch_featured Jun 27, 2017 1MB 4403
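The player ID rule described for the All MLB Salaries data set can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the description, not the code used to build the data set. The function name `make_player_ids`, the helper `_letters`, and the choice to drop non-letter characters such as apostrophes are assumptions.

```python
from collections import defaultdict


def _letters(text):
    """Lowercase the text and keep letters only (an assumption, e.g. for O'Neil)."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


def make_player_ids(names):
    """Build IDs from (first_name, last_name) pairs.

    Rule from the description: first 5 letters of the last name,
    then first 2 letters of the first name, then a two-digit counter
    that increases each time the same stem has already been used.
    """
    seen = defaultdict(int)  # how many times each stem has appeared so far
    ids = []
    for first, last in names:
        stem = _letters(last)[:5] + _letters(first)[:2]
        seen[stem] += 1
        ids.append(f"{stem}{seen[stem]:02d}")
    return ids


print(make_player_ids([("Barry", "Bonds")]))  # ['bondsba01']
```

A second player whose name produced the same "bondsba" stem would receive "bondsba02" under this sketch.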
All Texas Executions from 1982-2015
This data set records all executions in Texas from 1982-2015 and comes from the following website: Texas Executions. The data includes a variety of information about each execution, including the inmate's last statement.
statcrunchhelp Jan 7, 2016 242KB 2354
Titanic.xlsx
Report on the Loss of the ‘Titanic’ (S.S.) (1990), British Board of Trade Inquiry Report (reprint), Gloucester, UK: Allan Sutton Publishing. Taken from the Journal on Statistical Education Archive, submitted by rdawson@husky1.stmarys.ca. Dr. Craig Slinkman has recoded the data as self-explanatory nominal variables.
craig_slinkman Mar 23, 2010 61KB 2131
California Home Prices, 2009
This dataset is a collection of real estate listings from 2009 for San Luis Obispo County, California, and some locations around it. The prices are the list prices at the time this dataset was created. For more information about this data, go to the website source listed above.
statcrunchhelp Mar 11, 2016 46KB 2120
Average Years of Schooling for Adults by Country
Average years of schooling for adults for 100 countries from NationMaster.com
bartonpoulson Jan 20, 2011 2KB 1362
All MLB Salaries (1985-2015)
This data set has all MLB player salaries between 1985 and 2015, including the team played for, the city, and a unique ID for each player. In total, this includes 25,575 salaries for 4,963 different baseball players.
The player ID is the first 5 letters from the last name, followed by the first two letters from the first name, followed by a number in case of duplicate names. For example, bondsba01 stands for Barry Bonds with "01" because he's the first with the "bondsba" name ID.
statcrunchhelp Mar 15, 2016 1MB 1485
titanic_full.xls
VARIABLE DESCRIPTIONS: survival Survival (0 = No; 1 = Yes), pclass Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd), name Name, sex Sex, age Age, sibsp Number of Siblings/Spouses Aboard, parch Number of Parents/Children Aboard, ticket Ticket Number, fare Passenger Fare, cabin Cabin, embarked Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton), boat Lifeboat, body Body Identification Number, home.dest Home/Destination.
swhardy Oct 25, 2015 110KB 1278
Seating Choice versus GPA (For 3 rows, with Text and Indicator Columns)
This dataset contains hypothetical (I believe) data on GPA for students who sit in the front, middle, and back rows of a classroom, as well as a hypothetical gender variable. The data are shown using both text variables (e.g., "front" and "middle") and 0/1 indicator variables for the row and gender variables. This dataset is useful for demonstrating the different ways that StatCrunch can compare means based on two factors: (a) the text factor columns can be used in a two-way ANOVA; and (b) the 0/1 indicator columns can be used in multiple regression. (Because of StatCrunch's current limitation on equal cells, the 0/1 variables only use the first and middle rows.) Both procedures give the same p-value and the same conclusion (as long as the interaction term is centered), thus highlighting the similarity of statistical procedures and StatCrunch's flexibility.
bartonpoulson Apr 7, 2010 1KB 5225
Seating Choice versus GPA (Stacked & Split Columns for Front & Back Rows)
This dataset contains hypothetical (I believe) data on GPA for students who sit in the front and back rows of a classroom. The data are shown in several ways: (a) two separate columns (one for the front row GPA and another for the back row GPA); (b) stacked with one column to indicate front or back row and another column with the GPAs; and (c) the row column repeated as a 0/1 indicator variable. This dataset is useful for comparing the different ways that StatCrunch can compare the means of two groups: (a) the two columns of scores (front and back) can be used in the 2-sample t-test or a one-way ANOVA; (b) the stacked text column (front/back) with a separate column for GPA can also be used for one-way ANOVA; and (c) the 0/1 indicator column and stacked GPAs can be used with correlation and regression. Every procedure gives the same p-value and the same conclusion, thus highlighting the similarity of statistical procedures and StatCrunch's flexibility.
bartonpoulson Apr 7, 2010 465B 2510
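The equivalence described for the front/back row data set can be checked numerically outside StatCrunch. The sketch below uses made-up GPA values (not the values in the data set) and plain Python. It assumes the 2-sample t-test uses pooled variances, which is the case in which the squared t statistic equals the one-way ANOVA F statistic and the regression slope on a 0/1 indicator has the same t statistic. Equal statistics with matching degrees of freedom imply equal p-values. All function names are hypothetical.

```python
from statistics import mean

# Hypothetical GPAs, invented for illustration only.
front = [3.6, 3.2, 3.8, 3.4]
back = [2.9, 3.1, 2.6, 3.0]


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance."""
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    sp2 = ss / (len(a) + len(b) - 2)  # pooled variance estimate
    return (mean(a) - mean(b)) / (sp2 * (1 / len(a) + 1 / len(b))) ** 0.5


def anova_f(a, b):
    """One-way ANOVA F statistic for two groups (1 numerator df)."""
    grand = mean(a + b)
    ss_between = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    return ss_between / (ss_within / (len(a) + len(b) - 2))


def slope_t(a, b):
    """t statistic for the slope when regressing GPA on a 0/1 indicator (1 = front)."""
    x = [1] * len(a) + [0] * len(b)
    y = a + b
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope / (sse / (len(y) - 2) / sxx) ** 0.5


t = pooled_t(front, back)
print(abs(t ** 2 - anova_f(front, back)) < 1e-9)  # t squared matches F
print(abs(t - slope_t(front, back)) < 1e-9)       # slope t matches pooled t
```

If the t-test is run without pooled variances, the statistics generally differ slightly, so the match shown here depends on that setting.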
Home Runs and Strike Outs for 2004 Boston Red Sox by Handedness
These data show home runs and strike outs for the 12 players from the Boston Red Sox who had more than 200 at-bats in the 2004 season (the first year they won the World Series after the 86-year Curse of the Bambino). It also shows whether the players bat left-handed or as switch hitters, both of which are coded as 0/1 (No/Yes, respectively) indicator variables (also known as dummy variables), as well as a text L/R/LR variable. These data were used for a demonstration for bivariate and multiple regression.
bartonpoulson Nov 3, 2009 375B 1278
Home Runs 2016
Data on all home runs hit during the 2016 baseball season. The recorded distance is the actual distance the ball traveled from home plate, in feet, if the home run flew uninterrupted all the way back to field level; if the ball's flight was interrupted before returning all the way down to field level (as is usually the case), it is the estimated distance the ball would have traveled had its flight continued uninterrupted all the way down to field level. Horiz. Angle - the initial direction of the ball as it left the bat, in degrees, where 45 degrees is straight down the right field line, 90 degrees is straight over second base, and 135 degrees is straight down the left field line. Apex - the highest point reached by the ball in flight above field level, in feet. There are three types of home runs: "Just Enough" or "JE", which means the ball cleared the fence by less than 10 vertical feet OR landed less than one fence height past the fence (these are the ones that barely made it over the fence); "No Doubt" or "ND", which means the ball cleared the fence by at least 20 vertical feet AND landed at least 50 feet past the fence (these are the really deep blasts); and "Plenty" or "PL", which is everything else. Source: http://www.hittrackeronline.com/index.php
mcack1 Feb 7, 2017 566KB 1271
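The three home run types described for the Home Runs 2016 data set amount to a small decision rule, sketched below. The function and parameter names are hypothetical and do not correspond to columns in the data set. Checking "Just Enough" before "No Doubt" is also an assumption, since the description does not say which label wins if a ball could satisfy both.

```python
def classify_home_run(clearance_ft, past_fence_ft, fence_height_ft):
    """Return "JE", "ND", or "PL" following the description's thresholds.

    clearance_ft    -- vertical feet by which the ball cleared the fence
    past_fence_ft   -- horizontal feet past the fence where the ball landed
    fence_height_ft -- height of the fence at that point
    """
    # "Just Enough": under 10 vertical feet OR landing within one fence height.
    if clearance_ft < 10 or past_fence_ft < fence_height_ft:
        return "JE"
    # "No Doubt": at least 20 vertical feet AND at least 50 feet past the fence.
    if clearance_ft >= 20 and past_fence_ft >= 50:
        return "ND"
    # "Plenty": everything else.
    return "PL"


print(classify_home_run(5, 30, 8))   # JE
print(classify_home_run(25, 60, 8))  # ND
print(classify_home_run(15, 60, 8))  # PL
```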
