Data sets shared by StatCrunch members
Showing 1 to 15 of 439 data sets matching Random
Data Set/Description Owner Last edited Size Views
Rent in 100 cities
Rent for a two-bedroom apartment in 100 randomly selected cities
Owner: m.smith96; Last edited: Sep 16, 2019; Size: 455B; Views: 96
North Carolina birth data
A random sample of 1000 births from the state of North Carolina. Plurality refers to the number of children associated with the birth. Gender 1=Male, 2=Female. fage is age of father (years), mage is age of mother (years), visits is number of prenatal medical visits, marital is 1=married, 2=unmarried, racemom is Race of Mother (0=Other Non-white, 1=White, 2=Black, 3=American Indian, 4=Chinese, 5=Japanese, 6=Hawaiian, 7=Filipino, 8=Other Asian or Pacific Islander), hispmom is whether mother is of Hispanic origin (C=Cuban, M=Mexican, N=Non-Hispanic, O=Other and Unknown Hispanic, P=Puerto Rican, S=Central/South American, U=Not Classifiable), gained is weight gain during pregnancy (pounds), lowbw indicates whether birth weight is 2500 grams or lower, tpounds is birth weight in pounds, smoke is 0=no, 1=yes for mother admitted to smoking, mature is 0=no, 1=yes for mother is 35 or older, premie is 0=no, 1=yes to being born at 36 weeks or sooner.
Owner: jph422; Last edited: Sep 8, 2008; Size: 37KB; Views: 5432
1970 Draft Lottery Data
In 1970, Congress instituted a random selection process for the military draft. All 366 possible birth dates were placed in plastic capsules in a rotating drum and were selected one by one. The first date drawn from the drum received draft number one and eligible men born on that date were drafted first. In a truly random lottery there should be no relationship between the date and the draft number. However, this dataset suggests that men born later in the year were more likely to be drafted.
Owner: cdcummings12; Last edited: Jun 1, 2010; Size: 8KB; Views: 3358
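One way to check the claim that a fair lottery shows no relationship between birth date and draft number is a rank correlation with a permutation test. The sketch below is only an illustration: it uses the Python standard library, the function names are made up for this example, and the demo data are synthetic stand-ins, not the actual 1970 draft numbers.

```python
import random
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic stand-in: day of year 1..366 and a random assignment of numbers.
days = list(range(1, 367))
fake_numbers = random.Random(1).sample(range(1, 367), 366)
print(spearman(days, fake_numbers), permutation_p_value(days, fake_numbers, n_perm=200))
```

With the real data, `days` would be the day of the year and the second argument the draft number for that date; a small p-value would point to a relationship that a truly random drawing should not show.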
Happiness Data from GSS.xls
These data come from the 2008 General Social Survey. A subset of 190 respondents was selected at random from the full data set. Children = number of children. Education = highest year of education (e.g., 12 = High School, 16 = Bachelors). Happy: 1 = Not too happy, 2 = Pretty Happy, 3 = Very Happy. Health: 1 = Poor, 2 = Fair, 3 = Good, 4 = Excellent. Income: 1 = Under $1000; 2 = $1000-2999; 3 = $3000-3999; 4 = $4000-4999; 5 = $5000-5999; 6 = $6000-6999; 7 = $7000-7999; 8 = $8000-9999; 9 = $10000-12499; 10 = $12500-14999; 11 = $15000-17499; 12 = $17500-19999; 13 = $20000-22499; 14 = $22500-24999; 15 = $25000-29999; 16 = $30000-34999; 17 = $35000-39999; 18 = $40000-49999; 19 = $50000-59999; 20 = $60000-74999; 21 = $75000-89999; 22 = $90000-109999; 23 = $110000-129999; 24 = $130000-149999; 25 = $150000+. Married: 0 = No, 1 = Yes. Religious: 1 = Not religious, 2 = Slightly religious, 3 = Moderately religious, 4 = Very religious.
Owner: jacobgsimons; Last edited: Apr 20, 2010; Size: 5KB; Views: 4072
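Because several of the variables described above are stored as numeric codes, it can help to decode them before reading summaries. A minimal sketch follows, assuming each row is available as a Python dict keyed by column name. The `decode_row` helper and the example row are hypothetical; only the code labels come from the description, and the Income codes are left out for brevity.

```python
# Code-to-label maps taken from the data set description.
CODEBOOK = {
    "Happy": {1: "Not too happy", 2: "Pretty Happy", 3: "Very Happy"},
    "Health": {1: "Poor", 2: "Fair", 3: "Good", 4: "Excellent"},
    "Married": {0: "No", 1: "Yes"},
    "Religious": {
        1: "Not religious",
        2: "Slightly religious",
        3: "Moderately religious",
        4: "Very religious",
    },
}


def decode_row(row):
    """Replace coded values with labels; leave other columns unchanged."""
    return {
        name: CODEBOOK.get(name, {}).get(value, value)
        for name, value in row.items()
    }


# Made-up row for illustration only (not taken from the data set).
example = {"Children": 2, "Education": 16, "Happy": 3, "Health": 2, "Married": 1}
print(decode_row(example))
```

Unknown codes and uncoded columns such as Children pass through unchanged, so the helper is safe to apply to every row.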
North Carolina premature births
A random sample of 1000 births from the state of North Carolina. Plurality refers to the number of children associated with the birth. Gender 1=Male, 2=Female. fage is age of father (years), mage is age of mother (years), visits is number of prenatal medical visits, marital is 1=married, 2=unmarried, racemom is Race of Mother (0=Other Non-white, 1=White, 2=Black, 3=American Indian, 4=Chinese, 5=Japanese, 6=Hawaiian, 7=Filipino, 8=Other Asian or Pacific Islander), hispmom is whether mother is of Hispanic origin (C=Cuban, M=Mexican, N=Non-Hispanic, O=Other and Unknown Hispanic, P=Puerto Rican, S=Central/South American, U=Not Classifiable), gained is weight gain during pregnancy (pounds), lowbw indicates whether birth weight is 2500 grams or lower, tpounds is birth weight in pounds, smoke is 0=no, 1=yes for mother admitted to smoking, mature is 0=no, 1=yes for mother is 35 or older, premie is 0=no, 1=yes to being born at 36 weeks or sooner.
Owner: statcrunchhelp; Last edited: Apr 10, 2014; Size: 4KB; Views: 2239
BODYMEAS.XLS
Random sample of 100 observations from NHANES (the full survey contains more observations). GENDER (1=Male, 2=Female), AGE (years), WEIGHTENG (pounds), HEIGHTENG (inches), SIXFOOT (0=No, 1=Yes to being 72 inches or taller), LEGENG (Leg length inches), WAISTENG (Waist circumference inches), THIGHENG (Thigh circumference inches), WAIST28 (0=No, 1=Yes to having waist 28 inches or smaller), HEIGHT65 (0=No, 1=Yes to being 65 inches tall or shorter), BMI30 (0=No, 1=Yes to having Body Mass Index 30 or higher), OVER200 (0=No, 1=Yes to weighing 200 pounds or more).
Owner: jph422; Last edited: Sep 16, 2008; Size: 4KB; Views: 3884
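The 0/1 indicator columns in this entry are defined by thresholds on the raw measurements, so they can be recomputed as a consistency check. The sketch below makes some assumptions: weight is taken to be in pounds (as the OVER200 definition suggests), and BMI is computed with the standard imperial formula 703 × weight (lb) / height (in)², which may not be exactly how the data set's BMI30 column was derived. The function name is hypothetical.

```python
def indicator_flags(weight_lb, height_in, waist_in):
    """Recompute the 0/1 indicators from weight (lb), height (in), waist (in)."""
    # Assumption: BMI from the standard imperial formula.
    bmi = 703.0 * weight_lb / height_in ** 2
    return {
        "SIXFOOT": int(height_in >= 72),   # 72 inches or taller
        "WAIST28": int(waist_in <= 28),    # waist 28 inches or smaller
        "HEIGHT65": int(height_in <= 65),  # 65 inches tall or shorter
        "BMI30": int(bmi >= 30),           # Body Mass Index 30 or higher
        "OVER200": int(weight_lb >= 200),  # 200 pounds or more
    }


# Made-up measurements for illustration only.
print(indicator_flags(weight_lb=210, height_in=72, waist_in=36))
```

Comparing the recomputed flags with the stored columns would reveal rows where the indicator and the measurement disagree.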
RegisteredNursesSurvey.xlsx
For the survey that produced these data, see http://www.statcrunch.com/5.0/survey.php?surveyid=8178&code=YINVQ and the inputs of all teammates. Toward the end, some validation was done: records were deleted where the working hours were less than one work day or were outliers relative to legally admissible work days. Finally, arbitrarily long chains that were less likely to be encountered in draws of simulated data (M/F, Degrees, etc.) were discarded. A total of 12 observations were thrown out in this way. All credit goes to Team 3, the Instructor, our unnamed friends in the nursing profession who enthusiastically made a last-minute push for data through their extended social media groups, and the respondents who kindly took the time to complete the survey. Another thought concerns the distribution of hours worked. Even if random, it "should be" "centered on" certain hours per day times the number of days, with deviations from the center penalized when picking a sample. The value 38, for example, appears many times without an explainable reason (we are talking about the distribution of work among a sample of nursing staff). So do the "primes" 47, 37 and 29. This is not to argue that they "shouldn't occur", but there has to be some reason for their being so prominent. At this stage we may conclude that most of the respondents may not have been in full-time nursing employment in the strict sense of the term. If 42, 48, 72, 60, 50 and 40 appeared more often, the data would show less variation and more regularity. Since we haven't tried stratification, we do not know "how often they should occur", so we do not re-draw observations.
Owner: ugoagwu; Last edited: Jun 14, 2014; Size: 2KB; Views: 1094
Class Seating vs Grade
From the Body Image data set: "A student survey was conducted at a major university. Data were collected from a random sample of 239 undergraduate students". Variables: Gender - Male or Female; GPA - Student's cumulative college GPA, which is then converted to Grades (where 4.33 = A+, 4.00 = A, 3.67 = A-, 3.33 = B+, 3.00 = B, 2.67 = B-, 2.33 = C+, 2.00 = C, 1.67 = C-); Seat - Typical classroom seat location (Front or Back).
Owner: mallirhea86; Last edited: Oct 26, 2018; Size: 2KB; Views: 3942
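The GPA-to-grade conversion in this entry can be sketched as a lookup against the listed grade points. The description does not say how a GPA that falls between two listed values is handled, so the nearest-value rule below is an assumption, and `gpa_to_grade` is a hypothetical helper rather than the method used to build the data set.

```python
# Grade points and letters as listed in the data set description.
GRADE_POINTS = [
    (4.33, "A+"), (4.00, "A"), (3.67, "A-"),
    (3.33, "B+"), (3.00, "B"), (2.67, "B-"),
    (2.33, "C+"), (2.00, "C"), (1.67, "C-"),
]


def gpa_to_grade(gpa):
    """Map a GPA to the letter whose grade point is closest (assumed rule)."""
    return min(GRADE_POINTS, key=lambda pair: abs(pair[0] - gpa))[1]


print(gpa_to_grade(3.6))
```

Under this rule, any GPA below 1.67 maps to C-, the lowest grade listed; a different rule (for example, rounding down to the next listed value) would give different letters near the boundaries.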
nc2005birth300.xls
A random sample of 300 births from the state of North Carolina. Plurality refers to the number of children associated with the birth. Gender 1=Male, 2=Female. fage is age of father (years), mage is age of mother (years), visits is number of prenatal medical visits, marital is 1=married, 2=unmarried, racemom is Race of Mother (0=Other Non-white, 1=White, 2=Black, 3=American Indian, 4=Chinese, 5=Japanese, 6=Hawaiian, 7=Filipino, 8=Other Asian or Pacific Islander), hispmom is whether mother is of Hispanic origin (C=Cuban, M=Mexican, N=Non-Hispanic, O=Other and Unknown Hispanic, P=Puerto Rican, S=Central/South American, U=Not Classifiable), gained is weight gain during pregnancy (pounds), lowbw indicates whether birth weight is 2500 grams or lower, tpounds is birth weight in pounds, smoke is 0=no, 1=yes for mother admitted to smoking, mature is 0=no, 1=yes for mother is 35 or older, premie is 0=no, 1=yes to being born at 36 weeks or sooner.
Owner: jph422; Last edited: Nov 5, 2007; Size: 11KB; Views: 920
AMSTAT Census at School
This is a random sample (n=250) from the AMSTAT Census at School classroom project.
Owner: squesen; Last edited: Jul 14, 2015; Size: 20KB; Views: 1489
Cell Phone OLI
Variables: Math (Math SAT score); Verbal (Verbal SAT score); Credits (number of credits the student is registered for); Year (year in college: 1=Freshman, 2=Sophomore, 3=Junior, 4=Senior); Exer (time, in minutes, spent exercising in a typical day); Sleep (time, in hours, spent sleeping in a typical day); Veg (Are you a vegetarian: yes, no, some); Cell (Do you own a cell phone: yes, no). Cell Phones: College students at a large state university completed a survey about their academic and personal life. Questions ranged from "How many credits are you registered for this semester?" to "Would you define yourself as a vegetarian?" Four sections of an introductory statistics course were chosen at random from all the sections of introductory statistics courses offered at the university in the semester when the survey was conducted, and the 312 students who completed the survey were students registered in one of the four chosen sections. In this exercise, we will use a subset of variables from the survey and use the collected data to answer three questions. Note that (1) these are real data, and (2) the symbol * in the worksheet means that this observation is not available (this is known as a "missing value").
Owner: corp_richard; Last edited: May 2, 2016; Size: 8KB; Views: 1307
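Since this worksheet marks unavailable observations with `*`, any analysis outside StatCrunch needs to treat that symbol as missing rather than as a value. A minimal sketch follows, assuming the data have been exported as comma-separated text; the export format, the `read_survey` helper, and the sample rows are all hypothetical.

```python
import csv
import io

MISSING = "*"  # symbol the worksheet uses for an unavailable observation


def read_survey(text):
    """Parse CSV text into dicts, turning the '*' marker into None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {col: (None if val.strip() == MISSING else val.strip())
         for col, val in row.items()}
        for row in reader
    ]


# Made-up rows for illustration only.
sample = "Math,Verbal,Cell\n600,*,yes\n*,550,no\n"
for row in read_survey(sample):
    print(row)
```

Values stay as strings here; converting Math or Verbal to numbers would be a separate step that skips the `None` entries.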
Effect of Smoke on infants
Data were collected through a random survey of mothers in KY, conducted through a dance studio during November 2010 by Sabrina Lafferty and Karen Holland (ST 291 Fall 2010 candidates at HCTC) as a requirement for a semester project. They asked 57 mothers about the gestation period for their pregnancies, the birth weight and length of their newborns, and whether they smoked while they were pregnant.
Owner: statcrunchhelp; Last edited: Mar 6, 2014; Size: 1KB; Views: 7252
Asking prices for 4-bedroom homes in Bryan-College Station TX
Random sample of 30 four-bedroom homes listed for sale in the Bryan-College Station, Texas, area. For each home, the data set contains the list price in thousands of dollars (Price), square footage (Sqft), number of bathrooms (Baths) and location (Bryan, TX or College Station, TX).
Owner: statcrunchhelp; Last edited: Apr 4, 2014; Size: 951B; Views: 4447
Comparing two drugs
Source: David S. Moore, William Notz, Michael A. Fligner and R. Scott Linder, The Basic Practice of Statistics: Instructor's Edition, W.H. Freeman and Co., 2013 (p. 462). 18.50 Comparing two drugs. Makers of generic drugs must show that they do not differ significantly from the "reference" drugs that they imitate. One aspect in which drugs might differ is their extent of absorption in the blood. Table 18.6 gives data taken from 20 healthy nonsmoking male subjects for one pair of drugs. This is a matched pairs design. Numbers 1 to 20 were assigned at random to the subjects. Subjects 1 to 10 received the generic drug first, followed by the reference drug. Subjects 11 to 20 received the reference drug first, followed by the generic drug. In all cases, a washout period separated the two drugs so that the first had disappeared from the blood before the subject took the second. Randomizing the order prevents the order in which the drugs were administered from being confounded with the difference in absorption in the blood. Do the drugs differ significantly in the amount absorbed in the blood? Table 18.6 Absorption extent for two versions of a drug
Owner: phil_larson; Last edited: Apr 9, 2013; Size: 290B; Views: 1639
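For a matched pairs design like this one, the usual analysis works on the within-subject differences. The sketch below computes a paired t statistic with the Python standard library. The function name is hypothetical and the numbers are made up for illustration; they are not the values in Table 18.6.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Standard error of the mean difference; stdev() uses the n - 1 divisor.
    # If every difference is identical, stdev is 0 and this raises ZeroDivisionError.
    se = stdev(diffs) / sqrt(n)
    return mean(diffs) / se, n - 1


# Made-up absorption values for illustration only.
generic = [10.0, 12.5, 9.0, 11.0]
reference = [9.5, 12.0, 9.5, 10.0]
t_stat, df = paired_t(generic, reference)
print(t_stat, df)
```

The t statistic would then be compared with a t distribution with `df` degrees of freedom (19 for the 20 subjects described here) to judge whether the mean difference in absorption is significant.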
Granola comparison
Ten subjects in this fictional study were each asked to sample three kinds of granola cereal, labelled simply "A", "B", and "C", and to rate the granola's taste on a scale of 1 to 10. Each subject was given the three granola samples in random order.
Owner: statcrunchhelp; Last edited: Apr 19, 2016; Size: 223B; Views: 1016
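Giving each subject the three samples in random order, as this fictional study describes, can be sketched with an independent shuffle per subject. The function name and the seed are hypothetical choices for this example.

```python
import random


def tasting_orders(n_subjects, samples=("A", "B", "C"), seed=None):
    """Return one independently shuffled tasting order per subject."""
    rng = random.Random(seed)
    # rng.sample with k == len(samples) returns a random permutation.
    return [rng.sample(samples, len(samples)) for _ in range(n_subjects)]


for subject, order in enumerate(tasting_orders(10, seed=42), start=1):
    print(subject, order)
```

Randomizing the order per subject keeps any first-taste or last-taste effect from consistently favoring one granola.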
