Data sets shared by StatCrunch members
Showing 1 to 15 of 6416 data sets matching XLS
Data Set/Description Owner Last edited Size Views
PENNYAGES-n800.XLS
Year of 800 pennies from a local bank, sampled in 2011 (which is why frequency for 2011 is low).
anderson_instructor Oct 29, 2018 4KB 1479
Titanic.xlsx
Report on the Loss of the ‘Titanic’ (S.S.) (1990), British Board of Trade Inquiry Report (reprint), Gloucester, UK: Alan Sutton Publishing. Taken from the Journal of Statistics Education Archive, submitted by rdawson@husky1.stmarys.ca. Dr. Craig Slinkman has recoded the data as self-explanatory nominal variables.
craig_slinkman Mar 23, 2010 61KB 2440
Mother and Daughter Heights.xls
This data set is Galton's Mother and Daughter data set as used in Sanford Weisberg's Applied Linear Regression, 3rd Edition.
craig_slinkman Apr 10, 2010 13KB 7144
Weekly Gasoline Prices.xlsx
This is time series data on the weekly price of regular gasoline in the state of Texas. The data consist of the date decomposed into year, month, and day. The prices are given in cents.
craig_slinkman Mar 25, 2010 12KB 1260
Happiness Data from GSS.xls
These data come from the 2008 General Social Survey. A subset of 190 respondents was selected at random from the full data set. Children = number of children. Education is highest year of education (e.g., 12 = High School; 16 = Bachelors). Happy: 1 = Not too happy, 2 = Pretty happy, 3 = Very happy. Health: 1 = Poor, 2 = Fair, 3 = Good, 4 = Excellent. Income: 1 = Under $1000; 2 = $1000-2999; 3 = $3000-3999; 4 = $4000-4999; 5 = $5000-5999; 6 = $6000-6999; 7 = $7000-7999; 8 = $8000-9999; 9 = $10000-12499; 10 = $12500-14999; 11 = $15000-17499; 12 = $17500-19999; 13 = $20000-22499; 14 = $22500-24999; 15 = $25000-29999; 16 = $30000-34999; 17 = $35000-39999; 18 = $40000-49999; 19 = $50000-59999; 20 = $60000-74999; 21 = $75000-89999; 22 = $90000-109999; 23 = $110000-129999; 24 = $130000-149999; 25 = $150000+. Married: 0 = No, 1 = Yes. Religious: 1 = Not religious, 2 = Slightly religious, 3 = Moderately religious, 4 = Very religious.
jacobgsimons Apr 20, 2010 5KB 4138
BODYMEAS.XLS
Random sample of 100 observations from NHANES (which contains more observations). GENDER (1=Male, 2=Female), AGE (years), WEIGHTENG (pounds), HEIGHTENG (inches), SIXFOOT (0=No, 1=Yes to being 72 inches or taller), LEGENG (leg length, inches), WAISTENG (waist circumference, inches), THIGHENG (thigh circumference, inches), WAIST28 (0=No, 1=Yes to having waist 28 inches or smaller), HEIGHT65 (0=No, 1=Yes to being 65 inches tall or shorter), BMI30 (0=No, 1=Yes to having Body Mass Index 30 or higher), OVER200 (0=No, 1=Yes to weighing 200 pounds or more).
jph422 Sep 16, 2008 4KB 4012
Baseball2013.xlsx
Stats from the major league baseball teams for 2013. The last column I added denotes AL for American League and NL for National League. One could possibly conduct a two-sample means test, for example, to find out whether the average runs for the two leagues are equal. Or there are of course lots of regressions one could run.
eykolo Nov 4, 2013 3KB 2082
oldfaith.xls
The Old Faithful file contains data on eruptions of the Old Faithful Geyser during October 1980. Variables are Duration, in seconds, of the current eruption, and Interval, the time to the next eruption. Old Faithful is an important tourist attraction, with up to a thousand people watching it erupt on pleasant summer days. The National Park Service uses the data to predict the time to the next eruption.
craig_slinkman May 4, 2010 2KB 2547
State Population and Percent Changes 2015.xlsx
This data set is from the Census Bureau. It shows the population in 2010 compared with estimates for 2015 for every state as well as Puerto Rico. The table also lists each state's percent change in population and its rank by population.
hbarker2 Feb 14, 2016 2KB 1426
RegisteredNursesSurvey.xlsx
For the survey that produced it, see http://www.statcrunch.com/5.0/survey.php?surveyid=8178&code=YINVQ and the inputs of all teammates. Towards the end, some validation was done, deleting data where working hours were less than a work day, or were outliers relative to legally admissible work days. Finally, arbitrarily long chains that were less likely to be encountered in draws of simulated data (M/F, Degrees, etc.) were discarded. A total of 12 observations were thus thrown out. All credit goes to Team 3, the Instructor, our unnamed friends in the nursing profession who enthusiastically did a last-minute push through their extended social media groups for data, and the respondents who kindly took time out for the survey. Another thought is about the distribution of hours worked. Even if random, it "should be" "centered on" certain hours a day * number of days, with deviations from the centre penalised, while picking a sample. The observation 38 appears many times, for example, without an explainable reason (we are talking about work distribution among a sample of nursing staff). So do the "primes" 47, 37, and 29. This is not to argue that they "shouldn't occur", but there has to be some reason for their being so significant/vibrant. At this stage we may conclude that most of the respondents may not have been in full-time nursing employment in the strict sense of the term. 42, 48, 72, 60, 50, and 40 appearing more often would give us less variation but more regularity in the data. Since we haven't tried stratification, we do not know "how often they should occur". We thus do not re-draw observations.
ugoagwu Jun 14, 2014 2KB 1114
US Emissions of Greenhouse Gases Based on Global Warming Potential 1990-2007 Energy Information Administration.xls
U.S. Emissions of Greenhouse Gases, Based on Global Warming Potential, 1990-2007. Units are Million Metric Tons of Carbon Dioxide Equivalent. Report #: DOE/EIA-0573(2007). Released Date: December 3, 2008. Next Release Date: November 2009. P = Preliminary. Note: Data in this table are revised from the data contained in the previous EIA report, Emissions of Greenhouse Gases in the United States 2006, DOE/EIA-0573(2006) (Washington, DC, November 2007). Sources: Emissions of carbon dioxide, methane, and nitrous oxide: EIA. Emissions of HFCs, PFCs, and SF6: U.S. Environmental Protection Agency, preliminary data. Global Warming Potentials: United Nations, Intergovernmental Panel on Climate Change, Climate Change 2007 - The Physical Science Basis (Cambridge, UK: Cambridge University Press, 2007).
deathbysteveo May 26, 2009 813B 2561
Diamond Ring Prices.xls
The source of the data is a full page advertisement placed in the Straits Times newspaper issue of February 29, 1992, by a Singapore-based retailer of diamond jewelry. The advertisement contained pictures of diamond rings and listed their prices, diamond content, and gold purity. Only 20K ladies' rings, each mounted with a single diamond stone, were considered for this study. 20K rings are made with gold of 20 carat purity. (Pure gold is rated as 24K.) There were 48 such rings of varying designs. The weights of the diamond stones ranged from 0.12 to 0.35 carats (a one carat diamond stone weighs 0.2 gram) and were priced between $223 and $1086. The jewelry store adopted a fixed-price policy. How Is Jewelry Priced? In Singapore, the pricing of gold jewelry is simple. The price equals the current market value of the gold content (i.e., weight times the going rate per gram of gold) plus a craftsmanship fee. However, the pricing of other jewelry like diamond rings is more complicated because they are not as standardized as gold jewelry. The price of diamond jewelry depends on the four C's: caratage, cut, colour, and clarity of the diamond stone. A good cut gives a diamond more sparkle. Colourless diamonds are the most prized. A flawless diamond has maximum clarity because the passage of light is unimpeded through the stone. Cut, colour, and clarity are subjective factors and are very hard for the layman to gauge.
craig_slinkman Apr 22, 2010 586B 2830
nc2005birth300.xls
A random sample of 300 births from the state of North Carolina. Plurality refers to the number of children associated with the birth. Gender 1=Male, 2=Female. fage is age of father (years), mage is age of mother (years), visits is number of pre-natal medical visits, marital is 1=married, 2=unmarried, racemom is Race of Mother (0=Other Non-white, 1=White, 2=Black, 3=American Indian, 4=Chinese, 5=Japanese, 6=Hawaiian, 7=Filipino, 8=Other Asian or Pacific Islander), hispmom is whether mother is of Hispanic origin (C=Cuban, M=Mexican, N=Non-Hispanic, O=Other and Unknown Hispanic, P=Puerto Rican, S=Central/South American, U=Not Classifiable), gained is weight gain during pregnancy (pounds), lowbw is if birth weight is 2500 grams or lower, tpounds is birth weight in pounds, smoke is 0=no, 1=yes for mother admitted to smoking, mature is 0=no, 1=yes for mother is 35 or older, premie is 0=no, 1=yes to being born at 36 weeks or sooner.
jph422 Nov 5, 2007 11KB 959
gss2008-short.xls
The General Social Survey (GSS) conducts basic scientific research on the structure and development of American society with a data-collection program designed to both monitor social change within the United States and to compare the United States to other nations. The GSS data sets contain a standard ‘core’ of demographic and attitudinal questions, plus topics of special interest, representing the population of American adults, 18 years of age or older. More information about the GSS and its original data sets can be found at http://www.norc.org/GSS+Website/. This is part 1 of the original data.
bwachsmuth1 Oct 9, 2012 1MB 918
titanic_full.xls
VARIABLE DESCRIPTIONS: survival Survival (0 = No; 1 = Yes), pclass Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd), name Name, sex Sex, age Age, sibsp Number of Siblings/Spouses Aboard, parch Number of Parents/Children Aboard, ticket Ticket Number, fare Passenger Fare, cabin Cabin, embarked Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton), boat Lifeboat, body Body Identification Number, home.dest Home/Destination.
swhardy Oct 25, 2015 110KB 1406