
Data sets shared by StatCrunch members
Showing 1 to 15 of 369 data sets matching VARIABLE
Data Set/Description Owner Last edited Size Views
J Pribe auto-mpg.xlsx
The data set covers 12 years of vehicles and contains 398 individual entries. The data describes popular consumer vehicles’ miles per gallon (MPG), the number of engine cylinders, total engine size (displacement), engine horsepower, the vehicle weight, a measure of acceleration (0-60 MPH time), the model year of the vehicle (1970-1982), a coded identifier for the place of origin, and the make and model of the vehicle. MPG, number of cylinders, engine size, horsepower, weight, acceleration time, and model year are all numerical values. The vehicle origin, full name, make, and model are categorical. This data was chosen to meet the assignment requirements, and because cars are cool. *Origin data code: 1=USA, 2=Europe, 3=Japan. The "car name" variable was split into separate make and model variables to ease analysis, a change from the original data set.
Owner: jpribe; Last edited: Feb 16, 2019; Size: 32KB; Views: 112
2018 Vehicles Dataset
This data is taken from the website https://www.fueleconomy.gov/feg/download.shtml. It covers vehicles that were made in 2018. The variables include FE Rating, Tailpipe CO2 emissions, annual fuel cost, etc. Both categorical and quantitative variables are present.
Owner: habarker; Last edited: Feb 12, 2019; Size: 133KB; Views: 296
The Unofficial 2014 NFL Player Census - S. Lohse
This data set is a shared data set owned by websterwest and last updated May 2015. It is being used for examples in my classroom. Original Description: This data set contains a number of variables on every NFL player participating in the 2014 season. Most of the variables should be self-explanatory. Salary represents the average annual salary for the player under their existing contract. Exp represents years of experience. Pro Bowler represents the number of years the player was selected for the Pro Bowl. Champ provides the number of championship teams on which the player has played. Heisman represents whether or not the player won the Heisman trophy in college.
Owner: slohse9395; Last edited: Feb 11, 2019; Size: 321KB; Views: 112
Major League Players Elected to Hall of Fame as Players
Includes 2019 BBWAA-elected inductees Mariano Rivera, Edgar Martinez, Roy Halladay, and Mike Mussina. 31 variables for each player. Team=primary team; BBWAA=Baseball Writers Association of America; Bat: R=right, L=left, B=both; WAR=Wins Above Replacement: number of wins the player added to the team above what an "average" replacement player would add. CS=caught stealing. OPS=On-base Plus Slugging; as a rule of thumb, a "good" OPS is a value that when divided by 3 results in a value that would be considered a "good" batting average. Other variables are hopefully self-explanatory.
Owner: treiland; Last edited: Jan 25, 2019; Size: 37KB; Views: 5485
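The OPS rule of thumb in the Hall of Fame description above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample OPS value are hypothetical and are not taken from the data set.

```python
def ops_to_batting_average_scale(ops: float) -> float:
    """Divide OPS by 3 to put it on a batting-average-like scale (rule of thumb only)."""
    return ops / 3


# Hypothetical OPS value, chosen for illustration.
sample_ops = 0.900
print(round(ops_to_batting_average_scale(sample_ops), 3))
```

The result is read the way a batting average would be read; it is a rough comparison aid, not a statistic defined by the data set.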
Criminal Recidivism in Iowa: 2010-2014
Recidivism is defined as the "tendency of a convicted criminal to reoffend". This dataset tracks former criminals from Iowa over a three-year period after their release from prison to see whether or not they were convicted of a new crime during that time. The recidivism reporting year is the fiscal year (year ending June 30) marking the end of the three-year tracking period. Included are the following variables: Fiscal Year Released (the year the individual was released from prison), and the Race, Ethnicity, Sex, and Age of the individual when released. Also included are details about the original crime committed along with whether that individual committed a new crime (Recidivism - Return to Prison) within the three-year window.
Owner: statcrunch_featured; Last edited: Mar 21, 2018; Size: 3MB; Views: 2994
USA Car Accidents in 2011
This data set contains information for drivers involved in car accidents in the United States during 2011. The variables include the age in years of the person (Age), the gender of the person (Gender), the month in which the accident occurred (Month), and the day of the week of the accident (DayOfWeek).
Owner: statcrunch_featured; Last edited: Sep 12, 2017; Size: 919KB; Views: 8357
Cereal Brands
Data on several variables for different brands of cereal.
Number of cases: 77
Variable Names:
Name: Name of cereal
mfr: Manufacturer of cereal where A = American Home Food Products; G = General Mills; K = Kelloggs; N = Nabisco; P = Post; Q = Quaker Oats; R = Ralston Purina
type: cold or hot
calories: calories per serving
protein: grams of protein
fat: grams of fat
sodium: milligrams of sodium
fiber: grams of dietary fiber
carbo: grams of complex carbohydrates
sugars: grams of sugars
potass: milligrams of potassium
vitamins: vitamins and minerals - 0, 25, or 100, indicating the typical percentage of FDA recommended
shelf: display shelf (1, 2, or 3, counting from the floor)
weight: weight in ounces of one serving
cups: number of cups in one serving
rating: a rating of the cereals
Owner: statcrunch_featured; Last edited: Apr 3, 2017; Size: 4KB; Views: 6481
Attendance Vs. Grade
Compares percent of classes attended with final grade in the class. If you use % missed as the independent variable, you end up with a regression model that allows for interpretation of the intercept and has a negative slope.
Owner: lbgreen; Last edited: Jan 28, 2019; Size: 744B; Views: 415
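The point about using % missed as the independent variable in the Attendance Vs. Grade entry can be illustrated with a small least-squares fit. This is a minimal sketch in plain Python; the attendance and grade values are made up for illustration and are not the values in the data set, and `fit_line` is a hypothetical helper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: percent of classes attended and final grade.
attended = [100, 90, 80, 70, 60]
grade = [95, 88, 80, 71, 65]

# Re-express the predictor as percent of classes missed.
missed = [100 - a for a in attended]

b0_att, b1_att = fit_line(attended, grade)
b0_miss, b1_miss = fit_line(missed, grade)

print("attended:", b0_att, b1_att)
print("missed:  ", b0_miss, b1_miss)
```

With % attended as the predictor, the intercept is the predicted grade at 0% attendance, which lies far outside the observed range. With % missed, the intercept is the predicted grade for a student who missed no classes, and the slope has the same magnitude with the opposite sign.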
Class Seating vs Grade
From Body Image Data Set: "A student survey was conducted at a major university. Data were collected from a random sample of 239 undergraduate students". Variables: Gender - Male or Female, GPA - Student's cumulative college GPA. GPA is then converted to Grades (where, 4.33 = A+, 4.00 = A, 3.67 = A-, 3.33 = B+, 3.00 = B, 2.67 = B-, 2.33 = C+, 2.00 = C, 1.67 = C-). Seat - Typical classroom seat location (Front & Back)
Owner: mallirhea86; Last edited: Oct 26, 2018; Size: 2KB; Views: 2648
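The GPA-to-grade conversion listed in the Class Seating vs Grade description can be written as a lookup. The grade points come from the description; how GPAs that fall between listed values are handled is not stated there, so this sketch assumes each GPA takes the highest listed grade whose point value it meets or exceeds. The function name is hypothetical.

```python
# Grade points as listed in the description, highest first.
GRADE_POINTS = [
    (4.33, "A+"), (4.00, "A"), (3.67, "A-"),
    (3.33, "B+"), (3.00, "B"), (2.67, "B-"),
    (2.33, "C+"), (2.00, "C"), (1.67, "C-"),
]


def gpa_to_grade(gpa):
    """Return the letter grade for a GPA, or None if it is below the lowest listed value.

    Assumption (not from the source): a GPA between two listed values
    maps to the lower of the two grades.
    """
    for cutoff, letter in GRADE_POINTS:
        if gpa >= cutoff:
            return letter
    return None


print(gpa_to_grade(4.00))
print(gpa_to_grade(3.50))
```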
Galton Data
The table below gives data based on the famous 1885 study of Francis Galton exploring the relationship between the heights of adult children and the heights of their parents. Each case is an adult child, and the variables are:
Family: The family that the child belongs to, labeled from 1 to 204 and 136A
Father: The father's height, in inches
Mother: The mother's height, in inches
Gender: The gender of the child, male (M) or female (F)
Height: The height of the child, in inches
Kids: The number of kids in the family of the child
Owner: msullivan13803; Last edited: Dec 13, 2018; Size: 24KB; Views: 381
US Counties and Presidential Voting Dataset
Sampling unit: county. 3141 observations and 19 variables, maximum # NAs: 2956.
Variables:
county -- County
state -- State
msa -- Metropolitan Statistical Area
pmsa -- Primary Metropolitan Statistical Area
pop.density -- 1992 pop per 1990 miles^2
pop -- 1990 population
pop.change -- Percent population change 1980-1992
age6574 -- Percent age 65-74, 1990
age75 -- Percent age >= 75, 1990
crime -- serious crimes per 100,000 1991
college -- Percent with bachelor's degree or higher of those age>=25
income -- median family income, 1989 dollars
farm -- farm population, % of total, 1990
democrat -- Percent votes cast for democratic president
republican -- Percent votes cast for republican president
Perot -- Percent votes cast for Ross Perot
white -- Percent white, 1990
black -- Percent black, 1990
turnout -- 1992 votes for president / 1990 pop x 100
Owner: craig_slinkman; Last edited: Apr 12, 2011; Size: 755KB; Views: 2136
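The turnout variable in the US Counties entry is defined as 1992 votes for president divided by 1990 population, times 100. A minimal sketch of that arithmetic follows; the function name and the sample county figures are hypothetical, not values from the data set.

```python
def turnout_percent(votes_1992, pop_1990):
    """Turnout as defined in the description: 1992 votes / 1990 population x 100."""
    return votes_1992 / pop_1990 * 100


# Hypothetical county: 45,000 votes cast in 1992, population 100,000 in 1990.
print(turnout_percent(45_000, 100_000))
```

Because the denominator is total 1990 population rather than eligible voters, this measure will generally read lower than turnout figures based on the voting-age population.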
PVG - MAT240 Weather Data Set in StatCrunch for DB1
This Weather dataset is for DB1. The variables are year, month, TPCP.
Owner: 194f78df-c24f-4be3-a7da-58195b3924fd-25367_d2l_snhumlp; Last edited: Dec 27, 2018; Size: 7KB; Views: 57
Nfl draft combine results 1999-2013
The NFL Combine occurs once per year and is used to measure the physical characteristics of potential NFL draft picks. The data covers 1999-2013. Variables include college, position, height, weight, 40 yard dash time, etc.
Owner: daniel.inghram; Last edited: Feb 14, 2014; Size: 324KB; Views: 2135
The Unofficial 2014 NFL Player Census
This data set contains a number of variables on every NFL player participating in the 2014 season. Most of the variables should be self-explanatory. Salary represents the average annual salary for the player under their existing contract. Exp represents years of experience. Pro Bowler represents the number of years the player was selected for the Pro Bowl. Champ provides the number of championship teams on which the player has played. Heisman represents whether or not the player won the Heisman trophy in college.
Owner: websterwest; Last edited: May 5, 2015; Size: 321KB; Views: 1890
Skyscrapers in the U.S.
Data for buildings in the United States that are 100 meters tall or higher. The variables include the rank in terms of height (Rank), the building name (Building), the height in meters (Height), the number of floors (Floors), the year of completion (Year), materials used in construction (Materials), and the use of the building (Use). The last two variables contain multiple outcomes delimited by /. When considering these columns, consider an outcomes table (Stat > Tables > Outcomes) with / as a delimiter.
Owner: websterwest; Last edited: Jan 14, 2015; Size: 148KB; Views: 2369
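Outside StatCrunch, the idea behind tallying a column whose cells hold several outcomes separated by / can be sketched in Python. This is not StatCrunch's implementation of Stat > Tables > Outcomes, only an approximation of the same kind of count; the function name and the sample Materials values are hypothetical.

```python
from collections import Counter


def outcome_counts(values, delimiter="/"):
    """Count each individual outcome in cells that may hold several delimited outcomes."""
    counts = Counter()
    for value in values:
        for outcome in value.split(delimiter):
            outcome = outcome.strip()
            if outcome:  # skip empty pieces, e.g. from a trailing delimiter
                counts[outcome] += 1
    return counts


# Hypothetical Materials column with /-delimited outcomes.
materials = ["steel", "concrete/steel", "concrete", "steel/glass"]
counts = outcome_counts(materials)
for outcome, n in counts.most_common():
    print(outcome, n)
```

Splitting before counting means a building listed as concrete/steel contributes to both the concrete and the steel tallies, so the counts can sum to more than the number of rows.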
