Public profile for ds-231%sc
Properties of 60 Standard Metropolitan Statistical Areas (a standard Census Bureau designation of the region around a city) in the United States, collected from a variety of sources.
The data include information on the social and economic conditions in these areas, on their climate, and some indices of air pollution potentials.
Number of cases: 60 Reference: U.S. Department of Labor Statistics
[ Outlier , Transformation , Regression]
|JanTemp||Mean January temperature (degrees Fahrenheit)|
|JulyTemp||Mean July temperature (degrees Fahrenheit) |
|RelHum||Relative Humidity |
|Rain|| Annual rainfall (inches) |
|Mortality||Age adjusted mortality |
|Education|| Median education |
|PopDensity|| Population density |
|%NonWhite|| Percentage of non whites |
|%WC|| Percentage of white collar workers |
|pop|| Population |
|pop/house|| Population per household |
|income||Median income |
|HCPot|| HC pollution potential |
|NOxPot||Nitrous Oxide pollution potential |
|SO2Pot|| Sulfur Dioxide pollution potential |
|NOx|| Nitrous Oxide ||yes||ds-231%sc||Aug 11, 2008||5KB||154|
|Smoking and Cancer|
The data are per capita numbers of cigarettes smoked (sold) by 43 states and the
District of Columbia in 1960 together with death rates per thousand population from
various forms of cancer.
Number of cases: 44 Reference: J.F. Fraumeni, "Cigarette Smoking and Cancers of the Urinary Tract: Geographic Variations in the United States," Journal of the National Cancer Institute, 41, 1205-1211.
[Outlier , Regression , Residuals , Transformation , Nonlinear regression , Dummy variable]
|CIG || Number of cigarettes smoked (hundreds per capita)|
|BLAD ||Deaths per 100K population from bladder cancer |
|LUNG || Deaths per 100K population from lung cancer|
|KID ||Deaths per 100K population from kidney cancer |
|LEUK || Deaths per 100K population from leukemia ||yes||ds-231%sc||Aug 11, 2008||1KB||253|
|Glove Use Among Nurses|
Data from an experiment to see how an educational program on the importance of using gloves affected the rate of glove use by a group of nurses in an inner-city pediatric hospital emergency department. Without their knowledge, the nurses were observed during vascular access procedures before and one, two, and five months after an educational program to see how often they wore gloves. Each procedure by a nurse was counted as a separate observation.
Missing values are indicated by large dots.
Number of cases: 23
Reference: Friedland, L., Joffe, M., Moore, D. , et al. (1992), "Effect of Educational Program on Compliance With Glove Use in a Pediatric Emergency Department," American Journal of Diseases of Childhood, 146, 1355-1358.
|Period||Observation period (1 = before intervention, 2 = one month after intervention, 3 = two months after, 4 = 5 months after intervention)|
|Observed||Number of times the nurse was observed |
|Gloves|| Number of times the nurse used gloves |
|Experience|| Years of experience of nurse ||yes||ds-231%sc||Aug 11, 2008||976B||109|
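Because each nurse contributes a count of observed procedures and a count of procedures with gloves, a natural summary is the pooled glove-use rate per observation period. The sketch below is only an illustration: the rows are invented, and using `None` to stand in for the "large dot" missing-value marker is an assumption about how the file would be read.

```python
from collections import defaultdict

def glove_rate_by_period(rows):
    """Pool Gloves and Observed across nurses within each Period."""
    observed = defaultdict(int)
    gloves = defaultdict(int)
    for row in rows:
        # None stands in for a missing value (assumption); skip such rows.
        if row["Observed"] is None or row["Gloves"] is None:
            continue
        observed[row["Period"]] += row["Observed"]
        gloves[row["Period"]] += row["Gloves"]
    # Rate = procedures with gloves / procedures observed, per period.
    return {p: gloves[p] / observed[p] for p in sorted(observed) if observed[p] > 0}

# Hypothetical rows, not taken from the data set.
rows = [
    {"Period": 1, "Observed": 2, "Gloves": 1, "Experience": 15},
    {"Period": 1, "Observed": 8, "Gloves": 1, "Experience": 3},
    {"Period": 2, "Observed": 5, "Gloves": 4, "Experience": 15},
    {"Period": 2, "Observed": None, "Gloves": None, "Experience": 3},
]
rates = glove_rate_by_period(rows)
```

Pooling counts within a period weights each procedure equally; averaging per-nurse rates instead would weight each nurse equally, which is a different question.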
|Improving Reading Ability|
Results of an experiment to test whether directed reading activities in the classroom help elementary school students improve aspects of their reading ability. A treatment class of 21 third-grade students participated in these activities for eight weeks, and a control class of 23 third-graders followed the same curriculum without the activities. After the eight-week period, students in both classes took a Degree of Reading Power (DRP) test which measures the aspects of reading ability that the treatment is designed to improve.
Number of cases: 44
Reference: Moore, David S., and George P. McCabe (1989). Introduction to the Practice of Statistics
[Two sample t-test , Summary statistics]
|Treatment|| Whether student participated in activities (treated) or not (control) |
|Response|| Score on Degree of Reading Power test ||yes||ds-231%sc||Aug 11, 2008||571B||141|
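The listing tags this data set with a two-sample t-test but does not say which variant is intended. As a rough sketch, the function below computes the Welch form of the statistic, which does not assume equal variances; that choice, the function name, and the scores are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Two-sample t statistic using separate sample variances (Welch's form)."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error

# Hypothetical DRP scores, invented for illustration only.
treated = [52, 54, 56]
control = [51, 52, 53]
t_stat = welch_t(treated, control)
```

A positive value means the treated group's mean Response is higher than the control group's; judging significance would also need the degrees of freedom, which this sketch leaves out.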
| Cancer Survival|
Patients with advanced cancers of the stomach, bronchus, colon, ovary or breast were treated with ascorbate. The purpose of the study was to determine if the survival times differ with respect to the organ affected by the cancer.
Number of cases: 64
Reference: Cameron, E. and Pauling, L. (1978) Supplemental ascorbate in the supportive treatment of cancer: re-evaluation of prolongation of survival times in terminal human cancer. Proceedings of the National Academy of Science USA, 75, 4538-4542. Also found in: Manly, B.F.J. (1986) Multivariate Statistical Methods: A Primer, New York: Chapman & Hall, 11. Also found in: Hand, D.J., et al. (1994) A Handbook of Small Data Sets, London: Chapman & Hall, 255.
[ANOVA , Boxplot , Transformation]
|Survival|| Survival time (in days?)|
|Organ|| Organ affected by the cancer ||yes||ds-231%sc||Aug 11, 2008||817B||166|
Births per 10,000 23-year-old women in the United States from 1917-1975.
Number of cases: 59
Reference: P.K. Whelpton and A. A. Campbell, "Fertility Tables for Birth Charts of American Women," Vital Statistics Special Reports 51, no. 1. (Washington D.C.: Government Printing Office, 1960, years 1917-1975). National Center for Health Statistics, Vital Statistics of the United States Vol. 1, Natality (Washington D.C.: Government Printing Office, yearly, 1958-1975).
[Scatterplot , Time series]
| Birthrate:|| Births per 10,000 23-year-old women in the US from 1917-1975 |
| Year:|| The year ||yes||ds-231%sc||Aug 11, 2008||721B||72|
|Predicting Retail Sales|
These data are published monthly in the statistical section of the Survey of Current Business.
Number of cases: 44 Reference: U.S. Department of Commerce, Survey of Current Business
[Regression , Residuals , Time series]
|TIME|| Quarter, from 1st quarter 1979 to 4th quarter 1989 |
|WASA|| National income wage and salary disbursements ($ billions) |
|EMPL||Employees on payrolls of non-agricultural establishments (thousands) |
|BLDG|| Building material dealer sales ($ millions)|
|AUTO||Automotive dealer sales ($ millions) |
|FURN||Furniture and home furnishings dealer sales ($ millions) |
|GMER||General merchandise dealer sales ($ millions) ||yes||ds-231%sc||Aug 11, 2008||2KB||101|
Description: Average salary paid to teachers and expenditures per pupil on education in the 50 states and the District of Columbia.
Number of cases: 51
Reference: Moore, David S., and George P. McCabe (1989). Introduction to the Practice of Statistics
[ANCOVA , ANOVA , Scatterplot]
|State|| State |
|Region|| Region |
|Pay||Amount of pay in thousands |
|Spend||Average amount spent per student in thousands ||yes||ds-231%sc||Aug 11, 2008||1KB||84|
Percentage of Entering Class Graduating on Time. A large university reports the percentage of the entering freshman class graduating on time in each of 8 years from each of 6 separate colleges making up the university. The years cover a period of war protest and other upheavals that may have disrupted some students' education plans.
Number of cases: 48
Reference: This data is distributed with the software package, Data Desk. Data Description, Inc. (1993). Data Desk. Ithaca, NY: Data Description, Inc.
[Methods: Scatterplot , Time series]
|School|| Code for college |
|%_grad_on_time||The percentage of entering class that graduated on time |
|Year||Year of entering class ||yes||ds-231%sc||Aug 11, 2008||612B||43|
|Reading Test Scores|
Data from a study of the effect of three different methods of instruction on reading comprehension in children. Participants were given a reading comprehension test before and after receiving the instruction.
Number of cases: 66 Reference: Moore, David S., and George P. McCabe (1989). Introduction to the Practice of Statistics
[ANOVA]
|Group||Type of instruction that student received (Basal, DRTA, or Strat) |
|PRE1|| Pretest score on first reading comprehension measure |
|PRE2||Pretest score on second reading comprehension measure |
|POST1|| Posttest score on first reading comprehension measure |
|POST2|| Posttest score on second reading comprehension measure|
|POST3|| Posttest score on third reading comprehension measure ||yes||ds-231%sc||Aug 11, 2008||1KB||75|
Heights of singers in the NY Choral Society in 1979. Self-report, to the nearest inch. Voice parts in order from highest pitch to lowest pitch are Soprano, Alto, Tenor, Bass. The first two are female voices and the last two are male voices. The original dataset included two divisions for each voice part. This dataset reports only soprano 1, alto 1, tenor 1, and bass 1 from the original dataset.
Reference: Chambers, Cleveland, Kleiner, and Tukey. (1983). Graphical Methods for Data Analysis
[Pooled t-test , ANOVA , Boxplot]
|Soprano||Heights of sopranos (in inches) |
|Alto||Heights of altos (in inches) |
|Tenor|| Heights of tenors (in inches) |
|Bass||Heights of basses (in inches) ||yes||ds-231%sc||Aug 11, 2008||485B||180|
Percent of a Standard 50-word list heard correctly in the presence of background noise. 24 subjects with normal hearing listened to standard audiology tapes of English words at low volume with a noisy background. They repeated the words and were scored correct or incorrect in their perception of the words. The order of list presentation was randomized.
The word lists are standard audiology tools for assessing hearing. They are calibrated to be equally difficult to perceive. However, the original calibration was performed with normal-hearing subjects and no noise background. The experimenter wished to determine whether the lists were still equally difficult to understand in the presence of a noisy background.
Number of cases: 96
Reference: Loven, Faith. (1981). A Study of the Interlist Equivalency of the CID W-22 Word List Presented in Quiet and in Noise. Unpublished MS Thesis, University of Iowa.
|SubjectID|| Code for each subject - 24 of them |
|ListID||Code for each list played |
|Hearing|| Score received on hearing test ||yes||ds-231%sc||Aug 11, 2008||1KB||30|
|Nursing Home Data|
The data were collected by the Department of Health and Social Services of the
State of New Mexico and cover 52 of the 60 licensed nursing facilities in New Mexico
Number of cases: 52
Reference: These data are part of the data analyzed in Howard L. Smith, Niell F. Piland, and Nancy Fisher, "A Comparison of Financial Performance, Organizational Characteristics, and Management Strategy Among Rural and Urban Nursing Facilities," Journal of Rural Health, Winter 1992, pp 27-40.
[ T-test , Outlier , Boxplot , Mann Whitney U test , Summary statistics]
|BED ||number of beds in home |
|MCDAYS || annual medical in-patient days (hundreds)|
|TDAYS || annual total patient days (hundreds) |
|PCREV || annual total patient care revenue ($hundreds)|
|NSAL || annual nursing salaries ($hundreds) |
|FEXP ||annual facilities expenditures ($hundreds)|
|RURAL || rural (1) and non-rural (0) homes ||yes||ds-231%sc||Aug 11, 2008||4KB||155|
|Voting for the President|
Percent of the popular vote that was won by the Democratic presidential candidates in the 1980 and 1984 elections. Both candidates, Jimmy Carter in 1980 and Walter Mondale in 1984, were defeated by the Republican Ronald Reagan.
Number of cases: 50 Reference: Moore, David S., and George P. McCabe (1989). Introduction to the Practice of Statistics
[Dummy variable , Regression , Scatterplot]
|Dem1980|| Percent of the presidential votes won by the Democratic candidate in 1980 |
|Dem1984|| Percent of the presidential votes won by the Democratic candidate in 1984 ||yes||ds-231%sc||Aug 11, 2008||805B||22|
These data are crime-related and demographic statistics for 47 US states in 1960. The data were collected from the FBI's Uniform Crime Report and other government agencies to determine how the variable crime rate depends on the other variables measured in the study.
Number of cases: 47 Reference: Vandaele, W. (1978) Participation in illegitimate activities: Ehrlich revisited. In Deterrence and incapacitation, Blumstein, A., Cohen, J. and Nagin, D., eds., Washington, D.C.: National Academy of Sciences, 270-335. Also found in: Hand, D.J., et al. (1994) A Handbook of Small Data Sets, London: Chapman & Hall, 101-103.
[Collinearity , Correlation , Causation , Lurking variable , Regression]
|R|| Crime rate: number of offenses reported to police per million population |
|Age|| The number of males of age 14-24 per 1000 population |
|S|| Indicator variable for Southern states (0 = No, 1 = Yes) |
|Ed|| Mean # of years of schooling x 10 for persons of age 25 or older |
|Ex0|| 1960 per capita expenditure on police by state and local government |
|Ex1|| 1959 per capita expenditure on police by state and local government |
|LF|| Labor force participation rate per 1000 civilian urban males age 14-24 |
|M||The number of males per 1000 females |
|N||State population size in hundred thousands |
|NW|| The number of non-whites per 1000 population |
|U1||Unemployment rate of urban males per 1000 of age 14-24 |
|U2|| Unemployment rate of urban males per 1000 of age 35-39 |
|W|| Median value of transferable goods and assets or family income in tens of $ |
|X|| The number of families per 1000 earning below 1/2 the median income ||yes||ds-231%sc||Aug 11, 2008||2KB||456|
The following data set represents the pulse rates for 24 randomly selected individuals.
|yes||ds-231%sc||Aug 11, 2008||116B||20|
|Hospital Data||yes||ds-231%sc||Aug 11, 2008||392B||16|
|secretary data||yes||ds-231%sc||Aug 11, 2008||48B||13|
|Beers Data||yes||ds-231%sc||Aug 11, 2008||218B||36|
|Alcohol Consumption||yes||ds-231%sc||Aug 11, 2008||181B||29|
|CEO Golf and Stock Data|
Data from New York Times (31 May 1998, Section 3, p 1) reporting
correlation between CEOs' golf handicaps and performance of their
|yes||ds-231%sc||Aug 11, 2008||3KB||58|
|Magazine Ads Readability|
Thirty magazines were ranked by educational level of their readers. Three magazines were randomly selected from each of the first, second, and third groups of ten magazines. Six advertisements were randomly selected from each of the nine selected magazines. The magazines were:
Group 1, highest educational level: 1. Scientific American 2. Fortune 3. The New Yorker
Group 2, medium educational level: 4. Sports Illustrated 5. Newsweek 6. People
Group 3, lowest educational level: 7. National Enquirer 8. Grit 9. True Confessions
For each advertisement, the data below were observed.
Number of cases: 54
Reference: F.K. Shuptrine and D.D. McVicker, "Readability Levels of Magazine Ads," Journal of Advertising Research, 21:5 (October 1981), p 47.
|WDS || number of words in advertisement copy |
|SEN || number of sentences in advertising copy |
|3SYL || number of 3+ syllable words in advertising copy |
|MAG || magazine (1 through 9 as above) |
|GRP || educational level (as above) ||yes||ds-231%sc||Aug 11, 2008||801B||115|
|Wages and Hours|
The data are from a national sample of 6000 households with a male head earning less than $15,000 annually in 1966. The data were classified into 39 demographic groups for analysis. The study was undertaken in the context of proposals for a guaranteed annual wage (negative income tax). At issue was the response of labor supply (average hours) to increasing hourly wages. The study was undertaken to estimate this response from available data.
[Regression , Outlier , Collinearity , Assumptions, regression]
|HRS||Average hours worked during the year |
|WAGE|| Average hourly wage ($) |
|ERSP|| Average yearly earnings of spouse ($) |
|ERNO|| Average yearly earnings of other family members ($) |
|NEIN|| Average yearly non-earned income |
|ASSET|| Average family asset holdings (Bank account, etc.) ($) |
|AGE|| Average age of respondent |
|DEP|| Average number of dependents |
|RACE||Percent of white respondents |
|SCHOOL|| Average highest grade of school completed ||yes||ds-231%sc||Aug 11, 2008||2KB||226|
| Ice Cream Consumption|
Ice cream consumption was measured over 30 four-week periods from March 18, 1951 to July 11, 1953. The purpose of the study was to determine if ice cream consumption depends on the variables price, income, or temperature. The variables Lag-temp and Year have been added to the original data.
Number of cases: 30
Reference: Koteswara Rao Kadiyala (1970) Testing for the independence of regression disturbances. Econometrica, 38, 97-117. Also found in: Hand, D.J., et al. (1994) A Handbook of Small Data Sets, London: Chapman & Hall, 214.
[Regression , Time series ]
|Date|| Time period (1-30) of the study (from 3/18/51 to 7/11/53) |
|IC|| Ice cream consumption in pints per capita |
|Price|| Price of ice cream per pint in dollars |
|Income|| Weekly family income in dollars |
|Temp|| Mean temperature in degrees F. |
|Lag-temp|| Temp variable lagged by one time period |
|Year|| Year within the study (0 = 1951, 1 = 1952, 2 = 1953) ||yes||ds-231%sc||Aug 11, 2008||802B||207|
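The Lag-temp variable is described as Temp lagged by one time period, so each period carries the previous period's mean temperature. A minimal sketch of that construction follows; the temperatures are invented, and because the listing does not say how the first period's lag was filled, the sketch leaves it as `None`.

```python
def lag_by_one(values):
    """Shift a series forward one period; the first period has no predecessor."""
    return [None] + list(values[:-1])

# Hypothetical Temp values for four consecutive periods (not from the data set).
temp = [41, 56, 63, 68]
lag_temp = lag_by_one(temp)  # aligned with temp: lag_temp[i] is temp[i - 1]
```

Keeping the lagged series the same length as the original means the two columns stay aligned row by row, at the cost of one undefined entry at the start.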
|Agricultural Economics Studies|
Price and consumption per capita of beef and pork annually from 1925 to 1941 together with other variables relevant to an economic analysis of price and/or consumption of beef and pork over the period.
Number of cases: 17
Reference: F.B. Waugh, Graphic Analysis in Agricultural Economics, Agricultural Handbook No. 128, U.S. Department of Agriculture, 1957.
[Regression , Multivariate regression , Time series]
|PBE || Price of beef (cents/lb) |
|CBE || Consumption of beef per capita (lbs) |
|PPO || Price of pork (cents/lb) |
|CPO || Consumption of pork per capita (lbs) |
|PFO || Retail food price index (1947-1949 = 100) |
|DINC || Disposable income per capita index (1947-1949 = 100) |
|CFO || Food consumption per capita index (1947-1949 = 100) |
|RDINC || Index of real disposable income per capita (1947-1949 = 100) |
|RFP || Retail food price index adjusted by the CPI (1947-1949 = 100)||yes||ds-231%sc||Aug 11, 2008||982B||173|
|Predicting Appliance Sales|
The file gives unit shipments of dishwashers, disposers, refrigerators, and washers in the United States from 1960 to 1985. This and other data are published currently in the Department of Commerce's Survey of Current Business, and are summarized from time to time in their publication, Business Statistics. Also included in the file are durable goods expenditures and private residential investment in the United States.
Number of cases: 26
Reference: Business Statistics, U.S. Department of Commerce
[ Regression , Time series ]
|YEAR: ||1960 to 1985|
|DISH:|| Factory shipments (domestic) of dishwashers (thousands) |
|DISP:|| Factory shipments (domestic) of disposers (thousands) |
|FRIG:|| Factory shipments (domestic) of refrigerators (thousands) |
|WASH:|| Factory shipments (domestic) of washing machines (thousands) |
|DUR:|| Durable goods expenditures (billions of 1972 dollars) |
|RES:|| Private residential investment (billions of 1972 dollars) ||yes||ds-231%sc||Aug 11, 2008||1.003B||36|
The data are a random sample of records of resales of homes from Feb 15 to Apr 30, 1993 from the files maintained by the Albuquerque Board of Realtors. This type of data is collected by multiple listing agencies in many cities and is used by realtors as an information base.
Number of cases: 117
Reference: Albuquerque Board of Realtors
[Diagnostics , Dummy variable , Interaction , Regression]
|PRICE || Selling price ($hundreds) |
|SQFT || Square feet of living space |
|AGE || Age of home (years) |
|FEATS || Number out of 11 features (dishwasher, refrigerator, microwave, disposer, washer, intercom, skylight(s), compactor, dryer, handicap fit, cable TV access) |
|NE || Located in northeast sector of city (1) or not (0) |
|COR || Corner location (1) or not (0) |
|TAX || Annual taxes ($) ||yes||ds-231%sc||Aug 11, 2008||3KB||74|
Data on several variables for different brands of cereal.
A value of -1 for nutrients indicates a missing observation.
Number of cases: 77
Reference: Data available at many grocery stores
[ Histogram , Scatterplot , Regression]
|Name|| Name of cereal |
|mfr||Manufacturer of cereal where A = American Home Food Products; G = General Mills; K = Kelloggs; N = Nabisco; P = Post; Q = Quaker Oats; R = Ralston Purina |
|type|| cold or hot |
|calories|| calories per serving|
|protein||grams of protein |
|fat|| grams of fat |
|sodium||milligrams of sodium |
|fiber||grams of dietary fiber |
|carbo|| grams of complex carbohydrates |
|sugars||grams of sugars |
|potass|| milligrams of potassium |
|vitamins|| vitamins and minerals - 0, 25, or 100, indicating the typical percentage of FDA recommended |
|shelf|| display shelf (1, 2, or 3, counting from the floor) |
|weight|| weight in ounces of one serving |
|cups||number of cups in one serving |
|rating|| a rating of the cereals ||yes||ds-231%sc||Aug 11, 2008||5KB||179|
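Since a value of -1 marks a missing nutrient observation, those entries need masking before any summary is computed, or they will drag averages down. The sketch below shows one way to do that; the choice of which columns count as "nutrients", the helper names, and the sample records are all assumptions for illustration.

```python
# Columns treated as nutrients here (an assumption; the listing does not enumerate them).
NUTRIENT_COLUMNS = ("calories", "protein", "fat", "sodium",
                    "fiber", "carbo", "sugars", "potass")

def mask_missing(record):
    """Return a copy of the record with -1 nutrient values replaced by None."""
    cleaned = dict(record)
    for column in NUTRIENT_COLUMNS:
        if cleaned.get(column) == -1:
            cleaned[column] = None
    return cleaned

def mean_ignoring_missing(records, column):
    """Average a column over the records where it is present."""
    values = [r[column] for r in records if r.get(column) is not None]
    return sum(values) / len(values) if values else None

# Hypothetical records, not taken from the data set.
raw = [
    {"Name": "Cereal A", "sugars": 6, "potass": 280},
    {"Name": "Cereal B", "sugars": -1, "potass": 35},
    {"Name": "Cereal C", "sugars": 10, "potass": -1},
]
cleaned = [mask_missing(r) for r in raw]
mean_sugars = mean_ignoring_missing(cleaned, "sugars")
mean_potass = mean_ignoring_missing(cleaned, "potass")
```

Masking only the nutrient columns avoids touching fields such as shelf or rating, where the listing does not say that -1 carries the same meaning.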
Small companies were defined as those with annual sales greater than $5 million and less than $350 million. Companies were ranked according to 5-year average return on investment. These data cover the first 60 ranked firms.
Reference: Forbes, November 8, 1993, "America's Best Small Companies."
[Outlier , Histogram , Mean , Median , Boxplot , Distribution]
|Age: ||Age of chief executive officer |
|Sal:|| Salary of chief executive officer (including bonuses), $thousands ||yes||ds-231%sc||Aug 11, 2008||681B||181|
| Hot dogs|
Results of a laboratory analysis of calories and sodium content of major hot dog brands. Researchers for Consumer Reports analyzed three types of hot dog: beef, poultry, and meat (mostly pork and beef, but up to 15% poultry meat).
|Type||Type of hot dog (beef, meat, or poultry)|
|Calories||Calorie content |
|Sodium||sodium content ||yes||ds-231%sc||Aug 11, 2008||832B||104|
| Brain Size and Intelligence|
Willerman et al. (1991) collected a sample of 40 right-handed Anglo introductory psychology students at a large southwestern university. Subjects took four subtests (Vocabulary, Similarities, Block Design, and Picture Completion) of the Wechsler (1981) Adult Intelligence Scale-Revised. The researchers used Magnetic Resonance Imaging (MRI) to determine the brain size of the subjects. Information about gender and body size (height and weight) are also included. The researchers withheld the weights of two subjects and the height of one subject for reasons of confidentiality.
Reference: Willerman, L., Schultz, R., Rutledge, J. N., and Bigler, E. (1991), "In Vivo Brain Size and Intelligence," Intelligence, 15, 223-228.
[Correlation , Regression , Scatterplot]
|Gender|| Male or Female |
|FSIQ|| Full Scale IQ scores based on the four Wechsler (1981) subtests |
|VIQ|| Verbal IQ scores based on the four Wechsler (1981) subtests|
|PIQ|| Performance IQ scores based on the four Wechsler (1981) subtests|
|Weight||body weight in pounds |
|Height||height in inches |
|MRI_Count|| total pixel Count from the 18 MRI scans ||yes||ds-231%sc||Aug 11, 2008||1KB||599|