Data Set/Description  Owner  Last edited  Size  Views
All MLB Salaries (1985-2015)
This data set has all MLB player salaries from 1985 to 2015, including the team played for, the city, and a unique ID for each player. In total, it includes 25,575 salaries for 4,963 different baseball players.
The player ID is the first five letters of the last name, followed by the first two letters of the first name, followed by a number that distinguishes duplicate names. For example, bondsba01 stands for Barry Bonds, with "01" because he is the first player with the "bondsba" name ID.  statcrunch_featured  Jun 27, 2017  1MB  1516
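The ID scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name make_player_id is hypothetical, and the handling of punctuation and of names shorter than five letters is an assumption, since the description does not cover those cases.

```python
import re


def make_player_id(first_name: str, last_name: str, sequence: int = 1) -> str:
    """Build an ID such as 'bondsba01' from a player's name.

    Assumption: non-letter characters (spaces, apostrophes, hyphens) are
    dropped before truncating; the data set description does not say this.
    """
    last = re.sub(r"[^a-z]", "", last_name.lower())[:5]    # first five letters of last name
    first = re.sub(r"[^a-z]", "", first_name.lower())[:2]  # first two letters of first name
    return f"{last}{first}{sequence:02d}"                  # two-digit duplicate counter


print(make_player_id("Barry", "Bonds"))  # bondsba01
```

A second player who maps to the same letters would be given sequence=2, producing an ID ending in "02".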
2014 MLB Top 100 Batters
This data came from ESPN.com and has the top 100 batters by WAR (wins above replacement).
AB: At bats
R: Runs
H: Hits
2B: Doubles
3B: Triples
RBI: Runs batted in
SB: Stolen Bases
BB: Walks
SO: Strikeouts
AVG: Batting average
OBP: On Base Percentage
SLG: Slugging Percentage
OPS: OBP + SLG
WAR: Wins Above Replacement  statcrunch_featured  Apr 3, 2017  9KB  686 
Major League Players Elected to Hall of Fame as Players
Includes 2017 inductees Jeff Bagwell, Tim Raines, and Ivan Rodriguez. 31 variables for each player. Team=primary team; BBWAA=Baseball Writers' Association of America; Bat: R=right, L=left, B=both;
WAR=Wins Above Replacement: number of wins the player added to the team above what a replacement-level player would add.
CS=caught stealing.
OPS=On-base Plus Slugging; as a rule of thumb, a "good" OPS is a value that, when divided by 3, gives a value that would be considered a "good" batting average.
Other variables are hopefully self-explanatory.  treiland  Jun 5, 2017  34KB  3061
2014 MLB Top 100 Batters
This data came from ESPN.com and has the top 100 batters by WAR (wins above replacement).
AB: At bats
R: Runs
H: Hits
2B: Doubles
3B: Triples
RBI: Runs batted in
SB: Stolen Bases
BB: Walks
SO: Strikeouts
AVG: Batting average
OBP: On Base Percentage
SLG: Slugging Percentage
OPS: OBP + SLG
WAR: Wins Above Replacement  ntorno8  Apr 6, 2015  9KB  1286 
Baseball2013.xlsx
Stats for the Major League Baseball teams for 2013. The last column, which I added, denotes AL for American League and NL for National League. One could conduct a two-sample means test, for example, to find out whether the average runs for the two leagues are equal. There are, of course, also many regressions one could run.  eykolo@stat.tamu.edu  Nov 4, 2013  3KB  1790
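A minimal sketch of such a two-sample comparison follows. It assumes Welch's unequal-variance t statistic, since the description does not say which test was intended. The function name welch_t is hypothetical, and any run totals passed to it would come from the data set itself; none are reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's test.

    Each sample is a list of numbers, e.g. season run totals for the AL
    teams and for the NL teams. Each sample needs at least two values.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df
```

The t statistic and degrees of freedom can then be looked up in a t table or passed to a statistics package to obtain a p-value.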
MLB Home Attendance vs. Runs Scored 2015
This data comes from the 2015 baseball season and tracks, for each team, the number of home games, the total attendance at home games, the number of runs scored by that team, the runs scored against that team, the league they play in, and the number of wins the team recorded in the regular season.  frompearsonbooks  Jun 14, 2016  1KB  1313
All MLB Salaries (1985-2015)
This data set has all MLB player salaries from 1985 to 2015, including the team played for, the city, and a unique ID for each player. In total, it includes 25,575 salaries for 4,963 different baseball players.
The player ID is the first five letters of the last name, followed by the first two letters of the first name, followed by a number that distinguishes duplicate names. For example, bondsba01 stands for Barry Bonds, with "01" because he is the first player with the "bondsba" name ID.  statcrunchhelp  Mar 15, 2016  1MB  1163
2015 MLB Team Data
Team stats for MLB 2015 as of early October; includes team opening salary, wins, losses, pitching, batting, and fielding stats, playoff appearance, and World Series wins/losses (does not include the 2015 WS winner).  je175  Jul 25, 2016  8KB  1034
Home Runs and Strike Outs for 2004 Boston Red Sox by Handedness
These data show home runs and strikeouts for the 12 players from the Boston Red Sox who had more than 200 at-bats in the 2004 season (the first year they won the World Series after the 86-year Curse of the Bambino). They also show whether the players bat left-handed or as switch hitters, both of which are coded as 0/1 (No/Yes, respectively) indicator variables (also known as dummy variables), as well as a text L/R/LR variable. These data were used for a demonstration of bivariate and multiple regression.  bartonpoulson  Nov 3, 2009  375B  1109
nlbatting2009.txt
This dataset contains batting statistics for all National League teams in the 2009 baseball season. The goal of batting is to score runs, and the dataset contains the number of runs scored per game. An interesting activity is to find which offensive measures (batting average, OBP, SLG, OPS) are most helpful in predicting runs scored.  bayesball  Jun 8, 2010  958B  944
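One simple way to approach that activity is to rank the candidate measures by the strength of their Pearson correlation with runs scored. The sketch below is an assumption about how one might do it, not the data set owner's method. The function names and the column names passed in are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rank_predictors(columns, target):
    """Sort (name, r) pairs by |r|, strongest first.

    columns: dict mapping a measure name (e.g. "OBP") to its list of values.
    target:  list of runs scored per game, in the same team order.
    """
    scores = {name: pearson_r(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

A high correlation only indicates a strong linear association; fitting a regression for each measure is the natural next step.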
Home Runs 2016
Data on all home runs hit during the 2016 baseball season.
The recorded distance is the actual distance the ball traveled from home plate, in feet, if the home run flew uninterrupted all the way back to field level. If the ball's flight was interrupted before returning to field level (as is usually the case), it is the estimated distance the ball would have traveled had its flight continued uninterrupted down to field level.
Horiz. Angle: the initial direction of the ball as it left the bat, in degrees, where 45 degrees is straight down the right field line, 90 degrees is straight over second base, and 135 degrees is straight down the left field line.
Apex: the highest point reached by the ball in flight above field level, in feet.
There are three types of home runs:
 "Just Enough", or "JE", which means the ball cleared the fence by less than 10 vertical feet OR landed less than one fence height past the fence. These are the ones that barely made it over the fence.
 "No Doubt", or "ND", which means the ball cleared the fence by at least 20 vertical feet AND landed at least 50 feet past the fence. These are the really deep blasts.
 "Plenty", or "PL", which is everything else.
Source: http://www.hittrackeronline.com/index.php  mcack1  Feb 7, 2017  566KB  891 
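The three home run types can be expressed as a small decision rule. The sketch below is illustrative only. The function and parameter names are hypothetical, and checking "Just Enough" before "No Doubt" is an assumption, because the description does not say which label wins if a ball meets both definitions.

```python
def classify_home_run(clearance_ft, distance_past_fence_ft, fence_height_ft):
    """Return "JE", "ND", or "PL" for one home run.

    clearance_ft:           vertical feet by which the ball cleared the fence
    distance_past_fence_ft: how far past the fence the ball landed, in feet
    fence_height_ft:        height of the fence at that point, in feet
    """
    # Just Enough: under 10 vertical feet of clearance OR landed less than
    # one fence height past the fence (checked first by assumption)
    if clearance_ft < 10 or distance_past_fence_ft < fence_height_ft:
        return "JE"
    # No Doubt: at least 20 vertical feet AND at least 50 feet past the fence
    if clearance_ft >= 20 and distance_past_fence_ft >= 50:
        return "ND"
    # Plenty: everything else
    return "PL"


print(classify_home_run(25, 60, 8))  # ND
```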
Baseball data for correlation and regression
This table shows the total number of runs scored, at bats, hits, etc. for each of the 30 MLB teams for the 2009-2011 seasons.
Correlations and linear regression models can be calculated between the different numeric variables. A good exercise is to see which variables correlate most strongly with runs_scored.
As emphasized in the movie Moneyball, some of the classic metrics, such as batting_avg, are not as good as newer metrics like OBP (on-base percentage), SLG (slugging percentage), or OPS (on-base plus slugging).
A guide to a few of the variables that may not be self-explanatory:
Runs_Scored: The total of all runs (points) the team scored by the end of the season.
Batting_avg: The number of hits divided by at_bats.
OBP: On-base percentage. Similar to batting average, except that it also takes into account walks and hit-by-pitch. Some players who don't have high batting averages manage to get walked quite frequently.
SLG: Slugging percentage. This weights singles as 1 point, doubles as 2 points, triples as 3, and home runs as 4, and divides the total by the number of at bats.
OPS: On-base plus slugging. This is just OBP added to SLG.  mileschen  Apr 17, 2012  6KB  2844
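The definitions in this guide translate directly into code. The sketch below uses hypothetical function and parameter names. The OBP denominator (at bats + walks + hit-by-pitch + sacrifice flies) is the standard formula and goes slightly beyond the description above, which mentions only walks and hit-by-pitch.

```python
def batting_avg(hits, at_bats):
    """Hits divided by at bats."""
    return hits / at_bats


def obp(hits, walks, hit_by_pitch, at_bats, sac_flies=0):
    """On-base percentage; the sacrifice-fly term is not in the guide above."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slg(singles, doubles, triples, home_runs, at_bats):
    """Slugging: total bases (1, 2, 3, 4 per hit type) divided by at bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(obp_value, slg_value):
    """On-base plus slugging."""
    return obp_value + slg_value


# Made-up season line, for illustration only (not taken from the data set)
print(round(slg(singles=100, doubles=30, triples=5, home_runs=15, at_bats=500), 3))  # 0.47
```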
Time of Baseball Game (Fall 2015)
This dataset has information on the length of baseball games, along with characteristics of those games, from August 24 to August 31 of the 2015 Major League Baseball season.  jpalmateer@towson.edu  Sep 28, 2015  5KB  4512
2014 MLB Top 100 Batters
This data came from ESPN.com and has the top 100 batters by WAR (wins above replacement).
AB: At bats
R: Runs
H: Hits
2B: Doubles
3B: Triples
RBI: Runs batted in
SB: Stolen Bases
BB: Walks
SO: Strikeouts
AVG: Batting average
OBP: On Base Percentage
SLG: Slugging Percentage
OPS: OBP + SLG
WAR: Wins Above Replacement  statcrunchhelp  Jan 5, 2016  9KB  740
Home Runs 2016
Data on all home runs hit during the 2016 baseball season.
The recorded distance is the actual distance the ball traveled from home plate, in feet, if the home run flew uninterrupted all the way back to field level. If the ball's flight was interrupted before returning to field level (as is usually the case), it is the estimated distance the ball would have traveled had its flight continued uninterrupted down to field level.
Horiz. Angle: the initial direction of the ball as it left the bat, in degrees, where 45 degrees is straight down the right field line, 90 degrees is straight over second base, and 135 degrees is straight down the left field line.
Apex: the highest point reached by the ball in flight above field level, in feet.
There are three types of home runs:
 "Just Enough", or "JE", which means the ball cleared the fence by less than 10 vertical feet OR landed less than one fence height past the fence. These are the ones that barely made it over the fence.
 "No Doubt", or "ND", which means the ball cleared the fence by at least 20 vertical feet AND landed at least 50 feet past the fence. These are the really deep blasts.
 "Plenty", or "PL", which is everything else.
Source: http://www.hittrackeronline.com/index.php  msullivan13803  Nov 18, 2016  566KB  745 
