
Data sets shared by StatCrunch members
Showing 1 to 15 of 6125 data sets matching XLS
Data Set/Description Owner Last edited Size Views
J Pribe auto-mpg.xlsx
The data set covers 12 years of vehicles and contains 398 individual entries. The data describes popular consumer vehicles’ miles per gallon (MPG), the number of engine cylinders, total engine size (displacement), engine horsepower, the vehicle weight, a measure of acceleration (0-60 MPH time), the model year of the vehicle (1970-1982), a coded identifier for the place of origin, and the make and model of the vehicle. MPG, number of cylinders, engine size, horsepower, weight, acceleration time, and model year are all numerical values. The vehicle origin, full name, make, and model are categorical. This data was chosen to meet the assignment requirements, and because cars are cool. *Origin data code: 1=USA, 2=Europe, 3=Japan. The "car name" variable was broken into additional make and model variables to ease analysis, a change from the original data set.
jpribe Feb 16, 2019 32KB 107
PENNYAGES-n800.XLS
Year of 800 pennies from a local bank, sampled in 2011 (which is why frequency for 2011 is low).
anderson_instructor Oct 29, 2018 4KB 1408
Titanic.xlsx
Report on the Loss of the ‘Titanic’ (S.S.) (1990), British Board of Trade Inquiry Report (reprint), Gloucester, UK: Allan Sutton Publishing. Taken from the Journal of Statistics Education archive, submitted by rdawson@husky1.stmarys.ca. Dr. Craig Slinkman has recoded the data as self-explanatory nominal variables.
craig_slinkman Mar 23, 2010 61KB 2189
Mother and Daughter Heights.xls
This data set is Galton's Mother and Daughter data set as used in Sanford Weisberg's Applied Linear Regression, 3rd Edition.
craig_slinkman Apr 10, 2010 13KB 6299
Happiness Data from GSS.xls
These data come from the 2008 General Social Survey. A subset of 190 respondents was selected at random from the full data set. Children = number of children. Education is highest year of education (e.g., 12 = High School; 16 = Bachelors, etc.). Happy: 1 = Not too happy, 2 = Pretty Happy, 3 = Very Happy. Health: 1 = Poor, 2 = Fair, 3 = Good, 4 = Excellent. Income: 1 = Under $1000; 2 = $1000-2999; 3 = $3000-3999; 4 = $4000-4999; 5 = $5000-5999; 6 = $6000-6999; 7 = $7000-7999; 8 = $8000-9999; 9 = $10000-12499; 10 = $12500-14999; 11 = $15000-17499; 12 = $17500-19999; 13 = $20000-22499; 14 = $22500-24999; 15 = $25000-29999; 16 = $30000-34999; 17 = $35000-39999; 18 = $40000-49999; 19 = $50000-59999; 20 = $60000-74999; 21 = $75000-89999; 22 = $90000-109999; 23 = $110000-129999; 24 = $130000-149999; 25 = $150000+. Married: 0 = No, 1 = Yes. Religious: 1 = Not religious, 2 = Slightly religious, 3 = Moderately religious, 4 = Very religious.
jacobgsimons Apr 20, 2010 5KB 3816
BODYMEAS.XLS
Random Sample of 100 observations from NHANES (which contains more observations). GENDER (1=Male, 2=Female), AGE (years), WEIGHTENG (pounds), HEIGHTENG (inches), SIXFOOT (0=No, 1=Yes to being 72 inches or taller), LEGENG (Leg length inches), WAISTENG (Waist circumference inches), THIGHENG (Thigh circumference inches), WAIST28 (0=No, 1=Yes to having waist 28 inches or smaller), HEIGHT65 (0=No, 1=Yes to being 65 inches tall or shorter), BMI30 (0=No, 1=Yes to having Body Mass Index 30 or higher), OVER200 (0=No, 1=Yes to weighing 200 pounds or more).
jph422 Sep 16, 2008 4KB 3529
Baseball2013.xlsx
Stats from the major league baseball teams for 2013. The last column I added denotes AL for American League and NL for National League. One could possibly conduct a two-sample means test, for example, to find out whether the average runs for the two leagues are equal. Or there are of course lots of regressions one could run.
eykolo Nov 4, 2013 3KB 2005
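The two-sample means comparison suggested in the Baseball2013.xlsx description could be sketched roughly as follows. This is only an illustration using Welch's unequal-variance t statistic; the function name `welch_t` and the run totals are hypothetical placeholders, not values from the data set.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical placeholder run totals, NOT taken from Baseball2013.xlsx.
al_runs = [700, 650, 720, 680, 710]
nl_runs = [640, 660, 690, 620, 650]

t_stat, dof = welch_t(al_runs, nl_runs)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

With the real file, the two lists would be the runs column split by the AL/NL league column; a p-value would then come from a t distribution with the computed degrees of freedom.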
oldfaith.xls
The Old Faithful file contains data on eruptions of the Old Faithful Geyser during October 1980. The variables are Duration, the length in seconds of the current eruption, and Interval, the time to the next eruption. Old Faithful is an important tourist attraction, with up to a thousand people watching it erupt on pleasant summer days. The National Park Service uses the data to predict the time to the next eruption.
craig_slinkman May 4, 2010 2KB 2442
State Population and Percent Changes 2015.xlsx
This data set is from the Census Bureau. It shows the population in 2010 as compared to estimates in 2015 for every state as well as Puerto Rico. The table also lists the percent change of population of each state as well as the rank of each state as it pertains to the most populous states.
hbarker2 Feb 14, 2016 2KB 1370
US Emissions of Greenhouse Gases Based on Global Warming Potential 1990-2007 Energy Information Administration.xls
U.S. Emissions of Greenhouse Gases, Based on Global Warming Potential, 1990-2007. Units are Million Metric Tons of Carbon Dioxide Equivalent. Report #: DOE/EIA-0573(2007). Released Date: December 3, 2008. Next Release Date: November 2009. P = Preliminary. Note: Data in this table are revised from the data contained in the previous EIA report, Emissions of Greenhouse Gases in the United States 2006, DOE/EIA-0573(2006) (Washington, DC, November 2007). Sources: Emissions of carbon dioxide, methane and nitrous oxide, EIA. Emissions of HFCs, PFCs, and SF6, U.S. Environmental Protection Agency, preliminary data. Global Warming Potentials: United Nations, Intergovernmental Panel on Climate Change, Climate Change 2007 - The Physical Science Basis (Cambridge, UK: Cambridge University Press, 2007).
deathbysteveo May 26, 2009 813B 2435
Diamond Ring Prices.xls
The source of the data is a full page advertisement placed in the Straits Times newspaper issue of February 29, 1992, by a Singapore-based retailer of diamond jewelry. The advertisement contained pictures of diamond rings and listed their prices, diamond content, and gold purity. Only 20K ladies' rings, each mounted with a single diamond stone, were considered for this study. 20K rings are made with gold of 20 carat purity. (Pure gold is rated as 24K.) There were 48 such rings of varying designs. The weights of the diamond stones ranged from 0.12 to 0.35 carats (a one carat diamond stone weighs 0.2 gram) and were priced between $223 and $1086. The jewelry store adopted a fixed-price policy. How Is Jewelry Priced? In Singapore, the pricing of gold jewelry is simple. The price equals the current market value of the gold content (i.e., weight times the going rate per gram of gold) plus a craftsmanship fee. However, the pricing of other jewelry like diamond rings is more complicated because they are not as standardized as gold jewelry. The price of diamond jewelry depends on the four C's: caratage, cut, colour, and clarity of the diamond stone. A good cut gives a diamond more sparkle. Colourless diamonds are the most prized. A flawless diamond has maximum clarity because the passage of light is unimpeded through the stone. Cut, colour, and clarity are subjective factors and are very hard for the layman to gauge.
craig_slinkman Apr 22, 2010 586B 2505
RegisteredNursesSurvey.xlsx
For the survey that produced this data, see http://www.statcrunch.com/5.0/survey.php?surveyid=8178&code=YINVQ and the inputs of all teammates. Towards the end, some validation was done: data were deleted where working hours were less than a work day or were outliers relative to legally admissible work days. Finally, arbitrarily long chains that were less likely to be encountered in draws of simulated data (M/F, Degrees, etc.) were discarded. A total of 12 observations were thus thrown out. All credit goes to Team 3, the Instructor, our unnamed friends in the nursing profession who enthusiastically did a last-minute push through their extended social media groups for data, and the respondents who kindly took time for the survey. Another thought concerns the distribution of hours worked. Even if random, it "should be" "centered on" certain hours a day * number of days, with deviations from the centre penalised while picking a sample. The observation 38 appears many times, for example, without an explainable reason (we are talking about the work distribution in a sample of nursing staff). So do the "primes" 47, 37, and 29. This is not to argue that they "shouldn't occur", but there has to be some reason for their being so prominent. At this stage we may conclude that most of the respondents may not have been in full-time nursing employment in the strict sense of the term. 42, 48, 72, 60, 50, and 40 appearing more often would give us less variation but more regularity in the data. Since we haven't tried stratification, we do not know "how often they should occur". We thus do not re-draw observations.
ugoagwu Jun 14, 2014 2KB 998
Weekly Gasoline Prices.xlsx
This is time series data on the weekly price of regular gasoline in the state of Texas. The data consists of the date decomposed into year, month, and day. The prices are given in cents.
craig_slinkman Mar 25, 2010 12KB 996
titanic_full.xls
VARIABLE DESCRIPTIONS: survival Survival (0 = No; 1 = Yes), pclass Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd), name Name, sex Sex, age Age, sibsp Number of Siblings/Spouses Aboard, parch Number of Parents/Children Aboard, ticket Ticket Number, fare Passenger Fare, cabin Cabin, embarked Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton), boat Lifeboat, body Body Identification Number, home.dest Home/Destination.
swhardy Oct 25, 2015 110KB 1304
WHO Health Data v4.xlsx
Country Country, Region WHO_region, AlcConsumption Total (recorded + unrecorded) adult (15+ years) per capita consumption projected estimates for 2008_2008, BAC_limit Blood Alcohol Concentration (BAC) limit for drivers - general - 2011, bednet Women that slept under a bednet last night (%), bednet_yr bednet Year, drinkWater_R Population using improved drinking-water sources (%)_Rural_2011, drinkWater_U Population using improved drinking-water sources (%)_Urban_2011, healthcenters Total density per 100 000 population: Health centres, healthposts Total density per 100 000 population: Health posts, Hiv_AidsDeaths Deaths due to HIV/AIDS (per 100 000 population)_2011, HivAdults Prevalence of HIV among adults aged 15 to 49 (%)_2011, hospital_yr hospital Year of data collection, hospitals Total density per 100 000 population: Hospitals, LifeExp_60_F Life expectancy at age 60 (years)_Female_2011, LifeExp_60_M Life expectancy at age 60 (years)_Male_2011, LifeExp_Birth_F Life expectancy at birth (years)_Female_2011, LifeExp_Birth_M Life expectancy at birth (years)_Male_2011, NumRegVehicles Number of Registered Vehicles, Nursing_Midwives Nursing_and_midwifery_personnel_density__per_1000_population_, Physicians Physicians_density__per_1000_population_, pollution Outdoor air pollution (Annual PM10 [ug/m3]), polYear Year, RegVehYear Year, rural_hosp Total density per 100 000 population: District/rural hospitals, sanFacility_R Population using improved sanitation facilities (%)_Rural_2011, sanFacility_U Population using improved sanitation facilities (%)_Urban_2011, seat_belt_drivers Seat-belt wearing rate (%) Driver only_2011, sex_work_syph Sex workers with active syphilis (%), sex_work_syph_yr sex_work_syph Year of data collection, spec_hosp Total density per 100 000 population: Specialized hospitals, Tobacco_S_F Current smoking of any tobacco product (age-standardized rate)_Female_2009, Tobacco_S_M Current smoking of any tobacco product (age-standardized rate)_Male_2009, Tobacco_Y_F Current users of any tobacco product (youth rate)_Female_2010, Tobacco_Y_M Current users of any tobacco product (youth rate)_Male_2010, TrafDeathRate Estimated road traffic death rate (per 100 000 population)_2010, TrafDeaths Estimated number of road traffic deaths_2010, UVradiation UV radiation_2004.
swhardy Dec 6, 2015 31KB 7877
