Data sets shared by StatCrunch members
Showing 1 to 15 of 168 data sets matching ANALYSIS
Data Set/Description Owner Last edited Size Views
Module 3 - Data Set for Descriptive Analysis - FALL 2019.xlsx
PC 702 Module 3 assignment due 10/28/2019
Owner: lara.gabriel01 | Last edited: Oct 21, 2019 | Size: 1KB | Views: 69
Responses to Sleep survey
Respondents were asked to provide their typical bed time and the average number of hours they sleep for both weekdays and weekends. Respondents also provided their age. Try using the following new ordering (Edit > Orderings) for the bed time variables so they will be listed in the proper order in your analysis results:
Owner: scsurvey | Last edited: Jun 2, 2011 | Size: 9KB | Views: 5029
Alcohol data from adults
My group and I designed a survey to find out, among adults who drink, why they drink, their age, their education level, and how many drinks they have per day. The data was gathered individually and put together in StatCrunch by one member of the group. This survey shows the number of drinking adults and what motivates them to drink. Our survey questions are below.
1. Do you drink alcohol? Circle one: Y N
2. What is your age? ____ years
3. What is your gender? Circle one: Male Female
4. Are you having an increasing number of A. Financial problems B. Family problems C. Work problems D. Health problems E. Financial and family problems F. Financial, health and family problems G. Family and work problems H. Financial, family, and work problems I. None of the above. Circle one.
5. How many drinks do you have a week? _____ drinks
6. Education: What is the highest degree or level of school you have completed? If currently enrolled, mark the previous grade or highest degree received. A. No schooling completed B. Nursery school to 8th grade C. 9th, 10th or 11th grade D. 12th grade, no diploma E. High school graduate - high school diploma or the equivalent (for example: GED) F. Some college credit, but less than 1 year G. 1 or more years of college, no degree H. Associate degree (for example: AA, AS) I. Bachelor's degree (for example: BA, AB, BS) J. Master's degree (for example: MA, MS, MEng, MEd, MSW, MBA) K. Professional degree (for example: MD, DDS, DVM, LLB, JD). Circle one.
----- Original Message -----
Sent on: Tuesday, May 22, 2012 11:46 PM
Hi. It looks good. Change: 2. What is your gender? Circle one: Male Female Other to 2. What is your gender? Circle one: Male Female Other Since I do not think you will get someone answering as Other. In #3, I forgot another option: 3. Are you having an increasing number of A. Financial problems B. Family problems C. Work problems D. Financial and family problems E. Financial and family problems F. Family and work problems G. Financial, family, and work problems H. None of the above. Circle one.
Owner: rosesege | Last edited: Jun 21, 2012 | Size: 9KB | Views: 5437
Roller Coasters
The maximum drop of fifty-five roller coasters in the United States. This dataset is described in Chapter 1 of Navigating Through Data Analysis in Grades 6-8.
Owner: bayesball | Last edited: Mar 30, 2008 | Size: 283B | Views: 1222
This is a small data set used to illustrate the failure of an inappropriate use of an independent means test compared with a paired test on the same data. The story is that we have before and after weights for 6 customers of a weight loss clinic. Visual observation makes it clear that the clinic is effective (except in one questionable case). Students can discuss what sources there are for the variation found in the data set and relate them to the assumptions of the independent versus paired analysis models. Application of classical techniques will produce an extremely large p-value for the independent analysis and a significant p-value for the paired analysis. To illustrate the difference with simulation techniques, first do a randomization for two means between the before and after data groups. This will spectacularly fail to show a difference, when in fact there is a clear difference. Then use a bootstrap to examine the 6 differences, and it is clear that a zero difference is highly unlikely.
Owner: david.zeitler | Last edited: May 19, 2011 | Size: 87B | Views: 1954
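The randomization-versus-bootstrap comparison described in the weight loss clinic entry above can be sketched in a few lines of Python. This is a minimal illustration only: the before/after weights are hypothetical values made up for the example (they are not the StatCrunch data), and the function names `randomization_p_value` and `bootstrap_interval` are assumptions, not part of any StatCrunch feature.

```python
import random
from itertools import combinations
from statistics import mean

# Hypothetical before/after weights (lb) for 6 customers; NOT the StatCrunch data.
before = [150, 185, 210, 240, 275, 310]
after = [145, 178, 204, 231, 269, 309]


def randomization_p_value(x, y):
    """Exact two-sided randomization test for a difference in means,
    (wrongly) treating x and y as two independent groups."""
    pooled = x + y
    total = sum(pooled)
    observed = abs(mean(x) - mean(y))
    extreme = 0
    splits = 0
    # Enumerate every way of relabeling the pooled values into two groups.
    for idx in combinations(range(len(pooled)), len(x)):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / len(x) - (total - group_sum) / len(y))
        splits += 1
        if diff >= observed - 1e-9:
            extreme += 1
    return extreme / splits


def bootstrap_interval(diffs, reps=10_000, seed=0):
    """95% percentile bootstrap interval for the mean paired difference."""
    rng = random.Random(seed)
    means = sorted(mean(rng.choices(diffs, k=len(diffs))) for _ in range(reps))
    return means[int(0.025 * reps)], means[int(0.975 * reps) - 1]


diffs = [a - b for a, b in zip(after, before)]  # paired differences (after - before)
p_independent = randomization_p_value(before, after)
low, high = bootstrap_interval(diffs)

print(f"independent randomization p-value: {p_independent:.3f}")
print(f"bootstrap 95% interval for mean difference: ({low:.2f}, {high:.2f})")
```

With these made-up numbers, the variation between customers swamps the small average loss, so the independent-groups randomization cannot detect it, while the bootstrap interval for the paired differences stays below zero.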
North Carolina Pick 4 Results
Daily daytime/evening results for the North Carolina Pick 4 lottery from January 2012 through September 2014. Each time the game is played four numbers between 0 and 9 are selected with replacement. Each sequence of four numbers is stored in the Numbers column with a hyphen separator. Try using the Data > Arrange > Slice menu option with the Numbers column and a hyphen delimiter to break the individual numbers out into four separate columns. Then stack the four columns using the Data > Arrange > Stack menu option to get all of the results into a single column for analysis.
Owner: statcrunchhelp | Last edited: Oct 22, 2014 | Size: 84KB | Views: 873
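For readers working outside StatCrunch, the slice-then-stack steps described in the Pick 4 entry above might look roughly like this in plain Python. The three draws are hypothetical placeholders, not values from the data set, and the variable names are assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical contents of the Numbers column; NOT actual lottery results.
numbers_column = ["3-0-9-9", "1-4-4-7", "0-2-5-8"]

# "Slice": break each hyphen-separated sequence into four separate values.
sliced = [[int(digit) for digit in row.split("-")] for row in numbers_column]

# "Stack": put all four columns into a single list for analysis.
stacked = [digit for row in sliced for digit in row]

# Example analysis: how often each digit from 0 to 9 appears.
counts = Counter(stacked)
for digit in range(10):
    print(digit, counts[digit])
```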
50 States data
Data from government sites: U.S. Census Bureau for population, and U.S. Bureau of Economic Analysis for personal income
Owner: phil_larson | Last edited: Aug 28, 2013 | Size: 2KB | Views: 2054
Breast Cancer
Datafile Name: Breast Cancer
Datafile Subjects: Health, Medical
Story Names: Breast cancer
Reference: A.J. Lea. (1965). New Observations on Distribution of Neoplasms of Female Breast in Certain Countries. British Medical Journal, 1, 488-490.
Text Citation: Velleman, P. F. and Hoaglin, D. C. (1981). Applications, Basics, and Computing of Exploratory Data Analysis. Belmont, CA: Wadsworth, Inc., pp. 127-134.
Authorization: free use
Description: Data contains the mean annual temperature (in degrees F) and Mortality Index for neoplasms of the female breast. Data were taken from certain regions of Great Britain, Norway, and Sweden.
Number of cases: 16
Variable Names:
Mortality: Mortality index for neoplasms of the female breast
Temperature: Mean annual temperature (in degrees F)
In the early 1960s, data were collected from official statistics registers of Great Britain, Norway and Sweden on breast cancer mortality. Death rates for neoplasms of the breast were calculated for various age groups and for certain areas at the same latitude. Age-specific death rates were then calculated for each area and converted to a mortality index using 100 as the age-specific rate for all of England and Wales. The mean annual temperatures at various latitudes under study were obtained from the British Meteorological Office.
Owner: phil_larson | Last edited: Dec 2, 2015 | Size: 187B | Views: 2322
LAX-JFK_AA & UA flights 6-2012
Data set for population t-Test project
BACKGROUND
The data contained in the compressed file has been extracted from the On-Time Performance data table of the "On-Time" database from the TranStats data library. The time period is indicated in the name of the compressed file; for example, XXX_XXXXX_2001_1 contains data of the first month of the year 2001.
RECORD LAYOUT
Below are fields in the order that they appear on the records:
Year: Year
Quarter: Quarter (1-4)
Month: Month
DayofMonth: Day of Month
DayOfWeek: Day of Week
FlightDate: Flight Date (yyyymmdd)
UniqueCarrier: Unique Carrier Code. When the same code has been used by multiple carriers, a numeric suffix is used for earlier users, for example, PA, PA(1), PA(2). Use this field for analysis across a range of years.
AirlineID: An identification number assigned by US DOT to identify a unique airline (carrier). A unique airline (carrier) is defined as one holding and reporting under the same DOT certificate regardless of its Code, Name, or holding company/corporation.
Carrier: Code assigned by IATA and commonly used to identify a carrier. As the same code may have been assigned to different carriers over time, the code is not always unique. For analysis, use the Unique Carrier Code.
TailNum: Tail Number
FlightNum: Flight Number
OriginCityName: Origin Airport, City Name
DestCityName: Destination Airport, City Name
CRSDepTime: CRS Departure Time (local time: hhmm)
DepTime: Actual Departure Time (local time: hhmm)
DepDelay: Difference in minutes between scheduled and actual departure time. Early departures show negative numbers.
DepDelayMinutes: Difference in minutes between scheduled and actual departure time. Early departures set to 0.
DepDel15: Departure Delay Indicator, 15 Minutes or More (1=Yes)
DepartureDelayGroups: Departure Delay intervals, every (15 minutes from <-15 to >180)
DepTimeBlk: CRS Departure Time Block, Hourly Intervals
TaxiOut: Taxi Out Time, in Minutes
WheelsOff: Wheels Off Time (local time: hhmm)
WheelsOn: Wheels On Time (local time: hhmm)
TaxiIn: Taxi In Time, in Minutes
CRSArrTime: CRS Arrival Time (local time: hhmm)
ArrTime: Actual Arrival Time (local time: hhmm)
ArrDelay: Difference in minutes between scheduled and actual arrival time. Early arrivals show negative numbers.
ArrDelayMinutes: Difference in minutes between scheduled and actual arrival time. Early arrivals set to 0.
ArrDel15: Arrival Delay Indicator, 15 Minutes or More (1=Yes)
ArrivalDelayGroups: Arrival Delay intervals, every (15-minutes from <-15 to >180)
ArrTimeBlk: CRS Arrival Time Block, Hourly Intervals
CRSElapsedTime: CRS Elapsed Time of Flight, in Minutes
ActualElapsedTime: Elapsed Time of Flight, in Minutes
AirTime: Flight Time, in Minutes
Flights: Number of Flights
Distance: Distance between airports (miles)
CarrierDelay: Carrier Delay, in Minutes
WeatherDelay: Weather Delay, in Minutes
NASDelay: National Air System Delay, in Minutes
SecurityDelay: Security Delay, in Minutes
LateAircraftDelay: Late Aircraft Delay, in Minutes
Owner: skyviewflier | Last edited: Oct 6, 2012 | Size: 83KB | Views: 769
Flu Vaccine Survey:part 2 assignment:Confidence Intervals
Project for Statistics MAT 215 with Professor Racquet. Analysis of Flu Vaccine Survey second project step with confidence intervals. Christine
Owner: catlv | Last edited: Nov 17, 2014 | Size: 5KB | Views: 847
Impaired Driving Death Rate by Age and Gender 2012 to 2014 All States
Rate of deaths by age/gender (per 100,000 population) for people killed in crashes involving a driver with BAC >= 0.08%, 2012. 2012 Source: Fatality Analysis Reporting System (FARS). Note: Blank cells indicate data are suppressed. 2014 Source: National Highway Traffic Safety Administration's (NHTSA) Fatality Analysis Reporting System (FARS), 2014 Annual Report File. Fatality rates based on fewer than 20 deaths are suppressed.
Owner: lmcmath34 | Last edited: Aug 19, 2019 | Size: 6KB | Views: 292
J Pribe auto-mpg.xlsx
The data set covers 12 years of vehicles and contains 398 individual entries. The data describes popular consumer vehicles' miles per gallon (MPG), the number of engine cylinders, total engine size (displacement), engine horsepower, the vehicle weight, a measure of acceleration (0-60 MPH time), the model year of the vehicle (1970-1982), a coded identifier for the place of origin, and the make and model of the vehicle. MPG, number of cylinders, engine size, horsepower, weight, acceleration time, and model year are all numerical values. The vehicle origin, full name, make, and model are categorical. This data was chosen to meet the assignment requirements, and because cars are cool. *Origin data code: 1=USA, 2=Europe, 3=Japan. The "car name" variable was broken into additional make and model variables to ease analysis, a change from the original data set.
Owner: jpribe | Last edited: Feb 16, 2019 | Size: 32KB | Views: 505
Q1. Based on a recent study, roughly 80% of college students in the U.S. own a cell phone. Do the data provide evidence that the proportion of students who own cell phones in this university is lower than the national figure?
Answer. Most likely not. Ownership of cellphones and ratios do not depend on anything.
Relevant Variables - Cell is the relevant variable, and it is categorical.
Analyze Data - The formal analysis of Q1 focuses on the population proportion. The correct statistical test is the one-sample z-test for the proportion.
Null Hypothesis - Ho: p = .8
Alternative Hypothesis - Ha: p < .8
Outcomes: Cell. Success: yes.
Test statistic z = -.71; the p-value is .239 > .05, so Ho cannot be rejected. Roughly 78% of the students sampled own a cellphone. Even though 78% is less than 80%, there is not enough evidence to conclude that the proportion for the whole university is lower than the national proportion.
Owner: faithnwanne | Last edited: May 3, 2019 | Size: 8KB | Views: 549
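The lower-tail p-value quoted in the cell phone entry above can be checked from the reported z statistic alone. The sketch below uses only the Python standard library; the helper names are assumptions for illustration, and because the entry does not state the sample size, the counts passed to `one_sample_proportion_z` are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def one_sample_proportion_z(successes, n, p0):
    """Lower-tailed one-sample z-test for a proportion (Ha: p < p0).
    Returns the z statistic and its p-value."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    return z, normal_cdf(z)


# p-value for the z statistic reported in the entry (z = -.71).
p_value = normal_cdf(-0.71)
print(f"p-value for z = -0.71: {p_value:.3f}")

# Hypothetical counts (78 owners out of 100 students), only to show the full test.
z_example, p_example = one_sample_proportion_z(78, 100, 0.8)
print(f"hypothetical sample: z = {z_example:.2f}, p-value = {p_example:.3f}")
```

The first result agrees with the .239 reported in the entry, which is above .05, so Ho is not rejected.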
ECO252: Unstacked Data for In-Class Analysis of Beers Brands.xlsx
Characteristics of beer brands, national versus regional brands
Owner: keith_cox | Last edited: Feb 4, 2013 | Size: 8KB | Views: 606
Dataset: airline_costs.dat
Source: J.W. Proctor and J.S. Duncan (1954). "A Regression Analysis of Airline Costs," Journal of Air Law and Commerce, Vol.21, #3, pp.282-292.
Description: Regression relating Operating Costs per revenue ton-mile to 7 factors: length of flight, speed of plane, daily flight time per aircraft, population served, ton-mile load factor, available tons per aircraft mile, and firm's net assets. Regression based on natural logarithms of all factors, except load factor. Load factor and available tons (capacity) for Northeast Airlines were imputed from summary calculations.
Variables/columns:
Airline: 1-20
Length of flight (miles): 22-28
L_Group (inserted): Long (>175), Med (>60), Short (<69)
Speed of Plane (miles per hour): 30-36
Daily Flight Time per plane (hours): 38-44
Population served (1000s): 46-52
Total Operating Cost (cents per revenue ton-mile): 54-60
Revenue Tons per Aircraft mile: 62-68
Ton-Mile load factor (proportion): 70-76
Available Capacity (Tons per mile): 78-84
Total Assets ($100,000s): 86-92
Investments and Special Funds ($100,000s): 94-100
Adjusted Assets ($100,000s): 102-108
Owner: housew1 | Last edited: Jul 3, 2019 | Size: 2KB | Views: 82
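The column positions listed in the airline_costs.dat entry above describe a fixed-width layout, which could be read along the following lines. This is a sketch under stated assumptions: the dictionary keys are made-up names, the sample record is hypothetical (not a row from the data set), and L_Group is left out because the entry gives no column positions for it.

```python
# 1-based, inclusive column positions as listed in the entry.
# The key names are assumptions chosen for this example.
COLUMNS = {
    "airline": (1, 20),
    "flight_length_miles": (22, 28),
    "speed_mph": (30, 36),
    "daily_flight_hours": (38, 44),
    "population_thousands": (46, 52),
    "operating_cost_cents": (54, 60),
    "revenue_tons_per_mile": (62, 68),
    "load_factor": (70, 76),
    "capacity_tons_per_mile": (78, 84),
    "total_assets_100k_usd": (86, 92),
    "investments_100k_usd": (94, 100),
    "adjusted_assets_100k_usd": (102, 108),
}


def parse_record(line):
    """Cut one fixed-width line into named fields; all but the airline are numeric."""
    record = {}
    for name, (start, end) in COLUMNS.items():
        raw = line[start - 1:end].strip()
        record[name] = raw if name == "airline" else float(raw)
    return record


# Hypothetical record built to match the layout; NOT a row from the data set.
values = [57, 133, 6.1, 20200, 116.3, 0.96, 0.4, 2.4, 21.13, 3.21, 17.92]
sample = "Example Air".ljust(20) + "".join(" " + f"{v:>7}" for v in values)

record = parse_record(sample)
print(record["airline"], record["flight_length_miles"], record["load_factor"])
```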