
Data sets shared by StatCrunch members
Showing 1 to 15 of 151 data sets matching ANALYSIS
Data Set/Description Owner Last edited Size Views
J Pribe auto-mpg.xlsx
The data set covers 12 years of vehicles and contains 398 individual entries. The data describes popular consumer vehicles’ miles per gallon (MPG), the number of engine cylinders, total engine size (displacement), engine horsepower, the vehicle weight, a measure of acceleration (0-60 MPH time), the model year of the vehicle (1970-1982), a coded identifier for the place of origin, and the make and model of the vehicle. MPG, number of cylinders, engine size, horsepower, weight, acceleration time, and model year are all numerical values. The vehicle origin, full name, make, and model are categorical. This data was chosen to meet the assignment requirements, and because cars are cool. *Origin data code: 1=USA, 2=Europe, 3=Japan. The "car name" variable was broken into additional make and model variables to ease analysis, a change from the original data set.
jpribe Feb 16, 2019 32KB 256
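The description above mentions two preparation steps: decoding the origin codes and breaking "car name" into make and model. A minimal sketch of both follows. How the owner actually performed the split is not stated, so the rule used here (first word is the make, the rest is the model) is an assumption, and the sample rows are hypothetical rather than taken from the data set.

```python
# Origin codes as given in the data set description.
ORIGIN_LABELS = {1: "USA", 2: "Europe", 3: "Japan"}


def split_car_name(car_name):
    """Split a car name into (make, model).

    Assumption: the first word is the make and everything after it is the model.
    """
    make, _, model = car_name.strip().partition(" ")
    return make, model


def decode_origin(code):
    """Translate a numeric origin code into its label."""
    return ORIGIN_LABELS[code]


# Hypothetical rows (car name, origin code), for illustration only.
rows = [("acme roadster 200", 1), ("examplecar gt", 3)]
decoded = [(*split_car_name(name), decode_origin(code)) for name, code in rows]
print(decoded)
```

Names with a multi-word make would need a lookup table instead of the first-word rule.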
Responses to Sleep survey
Respondents were asked to provide their typical bed time and the average number of hours they sleep for both weekdays and weekends. Respondents also provided their age. Try using the following new ordering (Edit > Orderings) for the bed time variables so they will be listed in the proper order in your analysis results:
scsurvey Jun 2, 2011 9KB 4320
Alcohol data from adults
My group and I designed a survey to find out, among adults who drink, why they drink, their age, their education level, and how many drinks they have per day. The data was gathered individually and put together in StatCrunch by one member of the group. This survey shows the number of drinking adults and what motivates them to drink. Our survey questions are below.
1. Do you Drink Alcohol? Circle one: Y N
2. What is your age? ____ years
3. What is your gender? Circle one: Male Female
4. Are you having an increasing number of A. Financial problems B. Family problems C. Work problems D. Health problems E. Financial and family problems F. Financial, health and family problems G. Family and work problems H. Financial, family, and work problems I. None of the above. Circle one.
5. How many drinks do you have a week? _____ drinks
6. Education: What is the highest degree or level of school you have completed? If currently enrolled, mark the previous grade or highest degree received. A. No schooling completed B. Nursery school to 8th grade C. 9th, 10th or 11th grade D. 12th grade, no diploma E. High school graduate - high school diploma or the equivalent (for example: GED) F. Some college credit, but less than 1 year G. 1 or more years of college, no degree H. Associate degree (for example: AA, AS) I. Bachelor's degree (for example: BA, AB, BS) J. Master's degree (for example: MA, MS, MEng, MEd, MSW, MBA) K. Professional degree (for example: MD, DDS, DVM, LLB, JD). Circle one.
----- Original Message ----
Sent on: Tuesday, May 22, 2012 11:46 PM
Hi. It looks good. Change: 2. What is your gender? Circle one: Male Female Other to 2. What is your gender? Circle one: Male Female Other Since I do not think you will get someone answering as Other. In #3, I forgot another option: 3. Are you having an increasing number of A. Financial problems B. family problems C. Work problems D. Financial and family problems E. financial and family problems F. Family and work problems G. Financial, Family, and work problems H. none of the above Circle one.
rosesege Jun 21, 2012 9KB 5037
Roller Coasters
The maximum drop of fifty-five roller coasters in the United States. This dataset is described in Chapter 1 of Navigating Through Data Analysis in Grades 6-8.
bayesball Mar 30, 2008 283B 1090
This is a small data set used to illustrate the failure of an inappropriate use of an independent means test compared with a paired test on the same data. The story is that we have before and after weights for 6 customers of a weight loss clinic. Visual observation makes it clear that the clinic is effective (except in one questionable case). Students can discuss what sources there are for the variation found in the data set and relate them to the assumptions of the independent versus paired analysis models. Application of classical techniques will produce an extremely large p-value for the independent analysis and a significant p-value for the paired analysis. To illustrate the difference with simulation techniques, first do a randomization for two means between the before and after data groups. This will spectacularly fail to show a difference, when in fact there is a clear difference. Then use a bootstrap to examine the 6 differences and it is clear that a zero difference is highly unlikely.
david.zeitler May 19, 2011 87B 1822
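The two simulation steps described above (a randomization for two means, then a bootstrap of the paired differences) can be sketched with the Python standard library. The weights below are hypothetical stand-ins, not the values in the data set, and the function names, resample counts, and seeds are illustrative choices.

```python
import random
from statistics import mean


def randomization_p_value(group_a, group_b, n_shuffles=2000, seed=0):
    """Two-sided randomization test for a difference in means.

    Treats the groups as independent: pools all values, reshuffles them,
    and counts how often the shuffled difference is at least as large
    as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            extreme += 1
    return extreme / n_shuffles


def bootstrap_mean_interval(values, n_resamples=2000, seed=0, level=0.95):
    """Percentile bootstrap interval for the mean of the given values."""
    rng = random.Random(seed)
    means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical before/after weights for 6 customers (illustration only).
before = [150, 210, 180, 265, 130, 240]
after = [144, 203, 181, 257, 125, 232]
differences = [b - a for b, a in zip(before, after)]

p_independent = randomization_p_value(before, after)
low, high = bootstrap_mean_interval(differences)
print("randomization p-value:", p_independent)
print("bootstrap interval for mean difference:", (low, high))
```

With numbers like these, the person-to-person variation swamps the before/after shift in the randomization step, while the bootstrap interval for the mean difference sits above zero because pairing removes that variation.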
North Carolina Pick 4 Results
Daily daytime/evening results for the North Carolina Pick 4 lottery from January 2012 through September 2014. Each time the game is played four numbers between 0 and 9 are selected with replacement. Each sequence of four numbers is stored in the Numbers column with a hyphen separator. Try using the Data > Arrange > Slice menu option with the Numbers column and a hyphen delimiter to break the individual numbers out into four separate columns. Then stack the four columns using the Data > Arrange > Stack menu option to get all of the results into a single column for analysis.
statcrunchhelp Oct 22, 2014 84KB 856
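Outside StatCrunch, the same split-then-stack preparation can be sketched in a few lines of Python. The draws below are hypothetical, and the function names are illustrative rather than anything StatCrunch provides.

```python
from collections import Counter


def split_numbers(entry, delimiter="-"):
    """Break one hyphen-separated draw such as '3-1-4-1' into four integers."""
    return [int(part) for part in entry.split(delimiter)]


def stack_columns(rows):
    """Stack the columns of equal-length rows into one list, column by column."""
    return [value for column in zip(*rows) for value in column]


# Hypothetical entries from the Numbers column (illustration only).
draws = ["3-1-4-1", "0-9-2-6"]
rows = [split_numbers(draw) for draw in draws]
stacked = stack_columns(rows)
digit_counts = Counter(stacked)

print(stacked)
print(digit_counts)
```

Once the digits sit in a single list, a frequency count like `digit_counts` is the natural starting point for checking whether each digit from 0 to 9 appears about equally often.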
50 States data
Data from government sites: U.S. Census Bureau for population, and U.S. Bureau of Economic Analysis for personal income
phil_larson Aug 28, 2013 2KB 1977
Breast Cancer
Datafile Name: Breast Cancer
Datafile Subjects: Health, Medical
Story Names: Breast cancer
Reference: A.J. Lea. (1965). New Observations on Distribution of Neoplasms of Female Breast in Certain Countries. British Medical Journal, 1, 488-490.
Text Citation: Velleman, P. F. and Hoaglin, D. C. (1981). Applications, Basics, and Computing of Exploratory Data Analysis. Belmont, CA: Wadsworth, Inc., pp. 127-134.
Authorization: free use
Description: Data contains the mean annual temperature (in degrees F) and Mortality Index for neoplasms of the female breast. Data were taken from certain regions of Great Britain, Norway, and Sweden.
Number of cases: 16
Variable Names:
Mortality: Mortality index for neoplasms of the female breast
Temperature: Mean annual temperature (in degrees F)
In the early 1960s, data were collected from official statistics registers of Great Britain, Norway and Sweden on breast cancer mortality. Death rates for neoplasms of the breast were calculated for various age groups and for certain areas at the same latitude. Age-specific death rates were then calculated for each area and converted to a mortality index using 100 as the age-specific rate for all of England and Wales. The mean annual temperatures at various latitudes under study were obtained from the British Meteorological Office.
phil_larson Dec 2, 2015 187B 2219
LAX-JFK_AA & UA flights 6-2012
Data set for population t-Test project
BACKGROUND
The data contained in the compressed file has been extracted from the On-Time Performance data table of the "On-Time" database from the TranStats data library. The time period is indicated in the name of the compressed file; for example, XXX_XXXXX_2001_1 contains data of the first month of the year 2001.
RECORD LAYOUT
Below are fields in the order that they appear on the records:
Year: Year
Quarter: Quarter (1-4)
Month: Month
DayofMonth: Day of Month
DayOfWeek: Day of Week
FlightDate: Flight Date (yyyymmdd)
UniqueCarrier: Unique Carrier Code. When the same code has been used by multiple carriers, a numeric suffix is used for earlier users, for example, PA, PA(1), PA(2). Use this field for analysis across a range of years.
AirlineID: An identification number assigned by US DOT to identify a unique airline (carrier). A unique airline (carrier) is defined as one holding and reporting under the same DOT certificate regardless of its Code, Name, or holding company/corporation.
Carrier: Code assigned by IATA and commonly used to identify a carrier. As the same code may have been assigned to different carriers over time, the code is not always unique. For analysis, use the Unique Carrier Code.
TailNum: Tail Number
FlightNum: Flight Number
OriginCityName: Origin Airport, City Name
DestCityName: Destination Airport, City Name
CRSDepTime: CRS Departure Time (local time: hhmm)
DepTime: Actual Departure Time (local time: hhmm)
DepDelay: Difference in minutes between scheduled and actual departure time. Early departures show negative numbers.
DepDelayMinutes: Difference in minutes between scheduled and actual departure time. Early departures set to 0.
DepDel15: Departure Delay Indicator, 15 Minutes or More (1=Yes)
DepartureDelayGroups: Departure Delay intervals, every (15 minutes from <-15 to >180)
DepTimeBlk: CRS Departure Time Block, Hourly Intervals
TaxiOut: Taxi Out Time, in Minutes
WheelsOff: Wheels Off Time (local time: hhmm)
WheelsOn: Wheels On Time (local time: hhmm)
TaxiIn: Taxi In Time, in Minutes
CRSArrTime: CRS Arrival Time (local time: hhmm)
ArrTime: Actual Arrival Time (local time: hhmm)
ArrDelay: Difference in minutes between scheduled and actual arrival time. Early arrivals show negative numbers.
ArrDelayMinutes: Difference in minutes between scheduled and actual arrival time. Early arrivals set to 0.
ArrDel15: Arrival Delay Indicator, 15 Minutes or More (1=Yes)
ArrivalDelayGroups: Arrival Delay intervals, every (15-minutes from <-15 to >180)
ArrTimeBlk: CRS Arrival Time Block, Hourly Intervals
CRSElapsedTime: CRS Elapsed Time of Flight, in Minutes
ActualElapsedTime: Elapsed Time of Flight, in Minutes
AirTime: Flight Time, in Minutes
Flights: Number of Flights
Distance: Distance between airports (miles)
CarrierDelay: Carrier Delay, in Minutes
WeatherDelay: Weather Delay, in Minutes
NASDelay: National Air System Delay, in Minutes
SecurityDelay: Security Delay, in Minutes
LateAircraftDelay: Late Aircraft Delay, in Minutes
skyviewflier Oct 6, 2012 83KB 744
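The record layout above defines DepDelayMinutes and DepDel15 in terms of DepDelay (and likewise for the arrival fields). A minimal sketch of those two derivations follows. The function names and sample delays are hypothetical, and the delay-group fields are left out because the layout does not spell out their coding.

```python
def delay_minutes(delay):
    """Delay with early departures or arrivals set to 0, as in DepDelayMinutes."""
    return max(delay, 0)


def delay_15_indicator(delay):
    """1 if the delay is 15 minutes or more, else 0, as in DepDel15."""
    return 1 if delay >= 15 else 0


# Hypothetical DepDelay values in minutes; negative means an early departure.
dep_delays = [-7, 0, 14, 15, 42]
derived = [(d, delay_minutes(d), delay_15_indicator(d)) for d in dep_delays]

for raw, clipped, flag in derived:
    print(raw, clipped, flag)
```

For a t-test on delays, the choice between the signed field and the clipped field matters: clipping early flights to 0 raises the mean and changes the shape of the distribution.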
Flu Vaccine Survey: part 2 assignment: Confidence Intervals
Project for Statistics MAT 215 with Professor Racquet. Analysis of Flu Vaccine Survey second project step with confidence intervals. Christine
catlv Nov 17, 2014 5KB 819
Carbon Dioxide Emissions from Fossil Fuel Burning in Top Ten Countries, 1950-2013
Notes: Data exclude emissions from cement production and gas flaring. Emissions figures are in million tons of carbon. For tons of CO2, multiply by 44/12. This data was collected through various analysis and statistical centers and then compiled for side-by-side scrutiny.
mortensenn Nov 20, 2018 989B 210
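The 44/12 factor in the notes above is the ratio of the molecular mass of CO2 (about 44) to the atomic mass of carbon (about 12). A minimal sketch of the conversion, with a hypothetical function name and an input value chosen only for illustration:

```python
# Ratio of the mass of CO2 to the mass of the carbon it contains.
CARBON_TO_CO2 = 44 / 12


def carbon_to_co2(million_tons_carbon):
    """Convert emissions in million tons of carbon to million tons of CO2."""
    return million_tons_carbon * CARBON_TO_CO2


# Illustrative input only, not a figure from the data set.
example_carbon = 12.0
print(carbon_to_co2(example_carbon))
```

Applying the conversion to every column before comparing countries keeps the units consistent with sources that report CO2 rather than carbon.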
ECO252: Unstacked Data for In-Class Analysis of Beers Brands.xlsx
Characteristics of beer brands, national versus regional brands
keith_cox Feb 4, 2013 8KB 586
Maria's Data Analysis
Maria's Classroom data to be used for the Sampling Variability Project
ninibb1 Jun 14, 2016 2KB 352
These data comprise the 200 most recent tweets from MSNBC and FoxNews (as of August 18, 2016 at 4pm EST). These data will be used in sections of STAT 211 at WVU during a graded group-based lab. This lab will introduce the concept of text analysis and will emphasize the area principle for graphs (in the context of word walls of these tweet data) as well as the interactive graphical capabilities of StatCrunch.
kjryan Aug 18, 2016 56KB 320
Analysis of Mean and Variation - How far from School is your Home
To demonstrate the role of outliers in changing statistics
jricoiii Jan 26, 2017 750B 203
