CEO salaries
The age and annual salary of the chief executive officers for 60 small highly ranked firms.
AGE age of chief executive officer
SALARY salary of chief executive officer (including bonuses), in $thousands
sampleuser May 25, 2007 428B 1654
Yale hair/eye color data
Twenty people were asked to list their eye color and hair color.
hair hair color
eye eye color
sampleuser May 25, 2007 242B 254
US Crime data from 1960
These data are crime-related and demographic statistics for 47 US states in 1960. The data were collected from the FBI's Uniform Crime Report and other government agencies to determine how the variable crime rate depends on the other variables measured in the study.
1. Vandaele, W. (1978) Participation in illegitimate activities: Ehrlich revisited. In Deterrence and incapacitation, Blumstein, A., Cohen, J. and Nagin, D., eds., Washington, D.C.: National Academy of Sciences, 270-335.
2. Methods: A Primer, New York: Chapman & Hall, 11.
3. Hand, D.J., et al. (1994) A Handbook of Small Data Sets, London: Chapman & Hall, 101-103.
The data set was copied from
R Crime rate: # of offenses reported to police per million population
Age The number of males of age 14-24 per 1000 population
S Indicator variable for Southern states (0 = No, 1 = Yes)
Ed Mean # of years of schooling times 10 for persons of age 25 or older
Ex0 1960 per capita expenditure on police by state and local government
Ex1 1959 per capita expenditure on police by state and local government
LF Labor force participation rate per 1000 civilian urban males age 14-24
M The number of males per 1000 females
N State population size in hundred thousands
NW The number of non-whites per 1000 population
U1 Unemployment rate of urban males per 1000 of age 14-24
U2 Unemployment rate of urban males per 1000 of age 35-39
W Median value of transferable goods and assets or family income in tens of dollars
X The number of families per 1000 earning below 1/2 the median income
sampleuser May 25, 2007 2KB 517
Twin weights
Weights for 19 newborn twins born to members of the Greater Columbia South Carolina Area Mothers of Twins Club from September 2000 to December 2001.
TwinA weight of the first born twin in pounds
TwinB weight of the second born twin in pounds
Type gender combination for the twins. For boy/girl sets, BG means the boy was born first and the girl was born second, and GB means the girl was born first and the boy was born second, etc.
sampleuser May 25, 2007 317B 665
TV and life expectancy
Table of life expectancies and number of people per television set in each of a number of countries. Reference: The 1993 World Almanac and Book of Facts, pp. 727-817
Country Name of the country
Life_exp life expectancy in years for residents
Per_TV Number of residents per television
sampleuser May 25, 2007 363B 193
State statistics from 1996
Various 1996 statistics for all 50 U.S. states and the District of Columbia. See Tables 761, 337, 307, 42 and 744 of the 1998 Statistical Abstract of the United States.
STATE state abbreviation
POVERTY percentage of the state population living in poverty
CRIME violent crime rate per 100,000 population
COLLEGE percentage of the state's population in a certain age range who are enrolled full time in college
METRO percentage of the state population living in a metropolitan area
INCOME median household income in 1996 dollars
sampleuser May 25, 2007 1KB 430
STAT 509 class data
Real data collected from students taking STAT 509 (Statistics for Engineers and Scientists) at the University of South Carolina. A * indicates a missing (unavailable) observation.
shoe shoe size
gender (0 = Male, 1 = Female)
state (0 = out of state, 1 = in state)
field 1 = computer science / information systems
2 = math
3 = engineering
4 = other
major USC code for major
heart heartbeat (per minute)
sampleuser May 25, 2007 580B 185
STAT 110 class data
Data collected from a STAT 110 course at the University of South Carolina in the Spring of 2002.
SEX sex of the student
HAND right-handed or left-handed student
EYES student eye color (BL = BLUE, GR = GREEN, BR = BROWN, OT = OTHER)
GRADE grade on Exam1
STUDY time student studied for exam 1 (in minutes)
sampleuser May 25, 2007 684B 198
Smoking and cancer
Per capita numbers of cigarettes smoked (sold) by 43 states and the District of Columbia in 1960 and death rates per thousand population from various forms of cancer. Reference: J.F. Fraumeni, Cigarette Smoking and Cancers of the Urinary Tract: Geographic Variations in the United States, Journal of the National Cancer Institute, 41, 1205-1211.
STATE state abbreviation
CIG Number of cigarettes smoked (hundreds per capita)
BLAD Deaths per 100K population from bladder cancer
LUNG Deaths per 100K population from lung cancer
KID Deaths per 100K population from kidney cancer
LEUK Deaths per 100K population from leukemia
sampleuser May 25, 2007 1KB 600
Reading data
Results of an experiment to test whether directed reading activities in the classroom help elementary school students improve aspects of their reading ability. A treatment class of 21 third-grade students participated in these activities for eight weeks, and a control class of 23 third-graders followed the same curriculum without the activities. After the eight-week period, students in both classes took a Degree of Reading Power (DRP) test which measures the aspects of reading ability that the treatment is designed to improve.
Reference: Moore, David S., and George P. McCabe (1989). Introduction to the Practice of Statistics.
Original source: Schmitt, Maribeth C., The Effects of an Elaborated Directed Reading Activity on the Metacomprehension Skills of Third Graders, Ph.D. dissertation, Purdue University, 1987.
Control Control group values
Treatment Treatment group values
sampleuser May 25, 2007 156B 372
Random digits
1000 pseudorandom digits (0-9) generated using a random number generator.
digit Integer between 0 and 9
sampleuser May 25, 2007 2KB 95
Each day, a sample of 100 units is selected, and the number of defective units in each sample is recorded. Reference: Johnson, R. A. (1994) Miller and Freund's Probability and Statistics for Engineers, Fifth Edition. Prentice Hall: Englewood Cliffs, New Jersey.
defects number of defective units recorded for each sample
sampleuser May 25, 2007 98B 151
Each hour, a sample of four bearings is chosen from a manufacturing process. The observation is (x - 0.9750)/0.0001, where x is the diameter of the bearing in inches.
Reference: Johnson, R. A. (1994) Miller and Freund's Probability and Statistics for Engineers, Fifth Edition. Prentice Hall: Englewood Cliffs, New Jersey.
obs1 first observation
obs2 second observation
obs3 third observation
obs4 fourth observation
sampleuser May 25, 2007 323B 348
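The coding described for the bearing observations, (x - 0.9750)/0.0001, can be inverted to recover diameters in inches. The following is a minimal sketch of that arithmetic; the function names and the sample values are hypothetical and are not part of the data set.

```python
# Constants taken from the data set description: obs = (x - 0.9750) / 0.0001
CENTER_INCHES = 0.9750
SCALE_INCHES = 0.0001


def encode_diameter(x_inches: float) -> float:
    """Convert a bearing diameter in inches to the coded observation."""
    return (x_inches - CENTER_INCHES) / SCALE_INCHES


def decode_observation(obs: float) -> float:
    """Convert a coded observation back to a diameter in inches."""
    return CENTER_INCHES + SCALE_INCHES * obs


if __name__ == "__main__":
    # Hypothetical coded values for one hourly sample (not from the data set).
    sample = [12.0, -3.0, 0.0, 5.0]
    diameters = [decode_observation(v) for v in sample]
    print([round(d, 4) for d in diameters])
```

A coded value of 0 corresponds to a diameter of exactly 0.9750 inches, and each unit of the coded value corresponds to one ten-thousandth of an inch.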
Pi digits
This is a constructed/fabricated data set to test accuracy in summary statistic calculations. The numbers are the first 5000 digits of the mathematical constant pi (= 3.1415926535897932384...).
digit successive digits of pi
sampleuser May 25, 2007 10KB 57
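As a small illustration of checking summary statistic calculations against known digits, the sketch below uses only the 20 digits quoted in the description (3.1415926535897932384), not the full file. The `summarize` helper is hypothetical, and whether the file itself starts with the leading 3 is an assumption.

```python
import statistics

# The 20 digits quoted in the description (3.1415926535897932384).
# Assumption: the leading 3 counts as the first digit.
PI_PREFIX = "31415926535897932384"
digits = [int(ch) for ch in PI_PREFIX]


def summarize(values):
    """Return basic summary statistics for a list of integers."""
    return {
        "n": len(values),
        "sum": sum(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


summary = summarize(digits)
print(summary)
```

Because the digits are fixed, the same summary can be recomputed in any statistics package and compared value by value.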
PACT data
Summary information for the performance of South Carolina school districts on the PACT grade three exam in 2000.
DISTRICT SC school district
YEAR Year administered (2000 for all)
ADM Number of students the exam was administered to
LUNCH Percent of students on free/reduced lunch, measure of poverty
ETESTED Number tested in English
EngMean Mean score for English
BB_E Percent scoring "Below Basic" in English
BA_E Percent scoring "Basic" in English
PF_E Percent scoring "Proficient" in English
AD_E Percent scoring "Advanced" in English
MTESTED Number tested in Math
MathMean Mean score for Math
BB_M Percent scoring "Below Basic" in Math
BA_M Percent scoring "Basic" in Math
PF_M Percent scoring "Proficient" in Math
AD_M Percent scoring "Advanced" in Math
sampleuser May 25, 2007 7KB 134
Old Faithful
Data for eruptions of the Old Faithful Geyser in Yellowstone National Park.
day day of the eruption
interval time interval (in minutes) since last eruption
duration length (in minutes) of the eruption
sampleuser May 25, 2007 959B 1000
Oil changes
Data for oil changes from the 2nd edition of Interactive Statistics by Aliaga and Gunderson, page 240.
OilChanges Number of oil changes per year
Cost Cost of repairs per year
sampleuser May 25, 2007 76B 81
Nursing homes
The data were collected by the Department of Health and Social Services of the state of New Mexico and covered 52 of the 60 licensed nursing facilities in New Mexico in 1988.
BED number of beds in home
RURAL Indicator variable for location: rural (1) and non-rural (0)
sampleuser May 25, 2007 293B 550
Mammals sleeping time
Measurements made for 62 mammals. Reference: Sleep in Mammals: Ecological and Constitutional Correlates, by Allison, T. and Cicchetti, D. (1976), Science, November 12, vol. 194, pp. 732-734. Missing values denoted by -999.0.
species species of animal
bodyweight(kg) body weight in kilograms
brainweight(g) brain weight in grams
nondreaming slow wave (nondreaming) sleep in hours per day
dreaming paradoxical (dreaming) sleep in hours per day
total total sleep in hours per day (sum of slow wave and paradoxical sleep)
life_span(yrs) maximum life span in years
gestation(days) gestation time in days
predation predation index (1-5)
1 = minimum (least likely to be preyed upon)
5 = maximum (most likely to be preyed upon)
exposure sleep exposure index (1-5)
1 = least exposed (e.g. animal sleeps in a well-protected den)
5 = most exposed
danger overall danger index (1-5)
(based on the above two indices and other information)
1 = least danger (from other animals)
5 = most danger (from other animals)
sampleuser May 25, 2007 3KB 362
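Since missing values in the mammals data are coded as -999.0, they should be excluded before computing any summary. The sketch below shows one way to do that in plain Python; the helper names and the sample values are hypothetical and are not taken from the data set.

```python
import math

MISSING = -999.0  # sentinel stated in the data set description


def clean(value: float) -> float:
    """Return NaN for the missing-value sentinel, otherwise the value unchanged."""
    return math.nan if value == MISSING else value


def mean_ignoring_missing(values):
    """Average the values that are not missing; return NaN if none remain."""
    kept = [v for v in map(clean, values) if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


# Hypothetical hours of dreaming sleep for four animals (illustration only).
dreaming = [2.0, -999.0, 1.0, 3.0]
print(mean_ignoring_missing(dreaming))
```

Averaging the raw column without this step would treat -999.0 as a real measurement and badly distort the result.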
Magazine ads
Thirty magazines were divided by educational level of their readers into three groups. Three magazines were randomly selected from each of the three groups. Six advertisements were randomly selected from each of the nine selected magazines. For each advertisement, the data below were observed.
WDS number of words in advertisement copy
SEN number of sentences in advertising copy
3SYL number of 3+ syllable words in advertising copy
MAG magazine (1 through 9 as shown below)
Group 1 Highest educational level:
1. Scientific American
2. Fortune
3. The New Yorker
Group 2 Medium educational level:
4. Sports Illustrated
5. Newsweek
6. People
Group 3 Lowest educational level:
7. National Enquirer
8. Grit
9. True Confessions
GRP educational level of magazine (as above)
sampleuser May 25, 2007 714B 741
Lottery data
This is a real data set consisting of 218 lottery values from September 3, 1989 to April 14, 1990 (32 weeks). One 3-digit random number (from 000 to 999) is drawn per day, 7 days per week for most weeks, but fewer days per week for some weeks.
number three digit number selected
sampleuser May 25, 2007 856B 91
Leaf measurements
The following dataset was collected by a STAT 110 class in the Fall of 2001 at the University of South Carolina. The students split into 3 groups. Each group chose a different type of leaf and then selected a simple random sample of 20 leaves of that type. The height and width of each leaf were recorded in centimeters.
Gp1H height of group 1 leaves
Gp1W width of group 1 leaves
Gp2H height of group 2 leaves
Gp2W width of group 2 leaves
Gp3H height of group 3 leaves
Gp3W width of group 3 leaves
sampleuser May 25, 2007 440B 69
Labor force
Labor force participation rate of women for 19 cities and two years: 1968 and 1972.
City city name
1972 labor force participation rate of women in 1972
1968 labor force participation rate of women in 1968
sampleuser May 25, 2007 395B 94
Iris data
This is a dataset made famous by Fisher, who used it to illustrate principles of discriminant analysis. Reference: Fisher, R. A. (1936). The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics 7, 179-188.
sepall Sepal length
sepalw Sepal width
petall Petal length
petalw Petal width
sampleuser May 25, 2007 2KB 126
Ice cream
Ice cream consumption was measured over 30 four-week periods from March 18, 1951 to July 11, 1953. The purpose of the study was to determine if ice cream consumption depends on the variables price, income, or temperature. Reference: Koteswara Rao Kadiyala (1970) Testing for the independence of regression disturbances. Econometrica, 38, 97-117.
month time period (1-30) of the study (from 3/18/51 to 7/11/53)
IC ice cream consumption in pints per capita
Price price of ice cream per pint in dollars
Income weekly family income in dollars
Temp mean temperature in degrees F
sampleuser May 25, 2007 642B 623
Home runs
The number of home runs hit by Babe Ruth, Roger Maris, Mark McGwire and Sammy Sosa in each of their major league seasons (through the 1998-1999 season) where they played at least half of the games for the season.
Ruth Ruth home runs
Maris Maris home runs
McGwire McGwire home runs
Sosa Sosa home runs
sampleuser May 25, 2007 226B 168
Home prices in Albuquerque
The data are a random sample of 117 records of resales of homes from Feb 15 to Apr 30, 1993 from the files maintained by the Albuquerque Board of Realtors. This type of data is collected by multiple listing agencies in many cities and is used by realtors as an information base.
PRICE Selling price in hundreds of dollars
SQFT Square feet of living space
AGE Age of home in years
FEATS Number out of 11 features (dishwasher, refrigerator, microwave, disposer, washer, intercom, skylight(s), compactor, dryer, handicap fit, cable TV access)
NE Located in northeast sector of city (1) or not (0)
COR Corner location (1) or not (0)
TAX Annual taxes in dollars
sampleuser May 25, 2007 3KB 12657
Highway deaths
The number of United States and New Mexico highway fatalities per million vehicle-miles over 40 consecutive years (1945-1984).
NM New Mexico fatalities
US United States fatalities
sampleuser May 25, 2007 327B 348
Fortune billionaires in 1992
Fortune magazine publishes a list of billionaires annually. The 1992 list included 233 individuals or families. Their wealth, age and geographic location are reported.
wealth wealth of family or individual in billions of dollars
age age in years (for families it is the maximum age of family members)
region region of the World (Asia, Europe, Middle East, United States and Other)
sampleuser May 25, 2007 3KB 240
L.H.C. Tippett (1902-1985) was one of the pioneers in the field of statistical quality control. These data on the lengths of cuckoo eggs found in the nests of other birds (drawn from the work of O.M. Latter in 1902) are used by Tippett in his fundamental text. Cuckoos are known to lay their eggs in the nests of other (host) birds. The eggs are then adopted and hatched by the host birds. The data are all in millimeters.
M_Pipit Meadow pipit lengths
T_Pipit Tree pipit lengths
Sparrow Hedge sparrow lengths
Robin Robin lengths
Wagtail Pied wagtail lengths
Wren Wren lengths
sampleuser May 25, 2007 1KB 384
CPS wage data from 1985
These data consist of a random sample of 534 persons from the Current Population Survey, with information on wages and other characteristics of the workers. Source: Berndt, ER. The Practice of Econometrics. 1991. NY: Addison-Wesley.
Education Number of years of education
South Indicator variable for Southern Region (1 = Person lives in South, 0 = Person lives elsewhere)
Sex Indicator variable for sex (1 = Female, 0 = Male)
Experience Number of years of work experience
Union Indicator variable for union membership (1 = Union member, 0 = Not union member)
Wage Wage in dollars per hour
Age Age in years
Race Categorical variable for race (1 = Other, 2 = Hispanic, 3 = White)
Occupation Categorical variable for occupation (1 = Management, 2 = Sales, 3 = Clerical, 4 = Service, 5 = Professional, 6 = Other)
Sector Categorical variable for sector (0 = Other, 1 = Manufacturing, 2 = Construction)
Marr Indicator variable for marital status (0 = Unmarried, 1 = Married)
sampleuser May 25, 2007 14KB 1354
Cereal data
Data on several different brands of cereal.
name Name of cereal
mfr Manufacturer of cereal where A = American Home Food Products; G = General Mills; K = Kelloggs; N = Nabisco; P = Post; Q = Quaker Oats; R = Ralston Purina
type cold or hot
calories calories per serving
protein grams of protein
fat grams of fat
sodium milligrams of sodium
fiber grams of dietary fiber
carbo grams of complex carbohydrates
sugars grams of sugars
potass milligrams of potassium
vitamins vitamins and minerals - 0, 25, or 100, indicating the typical percentage of FDA recommended
shelf display shelf (1, 2, or 3, counting from the floor)
weight weight in ounces of one serving
cups number of cups in one serving
rating a rating of the cereals
sampleuser May 25, 2007 5KB 3240
Body measurements of sparrows
Body measurements of 48 female sparrows are shown. The data are from Multivariate Statistical Methods, 2nd edition, by Bryan F.J. Manly.
total total length of the bird
beak/head length of beak and head
humerus length of humerus
sampleuser May 25, 2007 688B 349


