Fitchburg State University
Public profile for anderson_instructor

Showing 1 to 29 of 29 data sets
Data Set/Description | Owner | Last edited | Size | Views
S19-Percent of 20 pennies before 1983
24 random samples, with replacement, of 20 pennies from PENNYAGES data set of 800 pennies.
anderson_instructor | Mar 18, 2019 | 121B | 4
Fictitious data used in example report.
anderson_instructor | Feb 26, 2019 | 687B | 83
Results of anonymous survey given to students in a succession of semesters
anderson_instructor | Feb 6, 2019 | 3KB | 515
Final gradebook for 2 F14 Applied Statistics classes, with student identification information removed.
anderson_instructor | Dec 24, 2018 | 2KB | 88
F18-AHA-FinalResults.xlsx
anderson_instructor | Dec 17, 2018 | 1KB | 2
Year of 800 pennies from a local bank, sampled in 2011 (which is why frequency for 2011 is low).
anderson_instructor | Oct 29, 2018 | 4KB | 1440
10 highest peaks in Rockies and Appalachians
Is there more variation in the heights of the highest peaks in the Rockies (newer) or in the Appalachians (older)? Use coefficient of variation to find out!
anderson_instructor | Aug 18, 2018 | 174B | 153
This data set contains data regarding gun ownership and gun deaths in various categories for 73 different countries. The data were obtained on 8/28/16 from Wikipedia. The Wikipedia pages have more information about the sources for the data values for each country and the dates on which the original data were collected.
A. Variables obtained from
Guns/100: total number of guns per 100 population
B. Variables obtained from
The dates on which data was obtained for the various countries range from 1995 to 2016.
Country: name of country
Total gun deaths/100,000: total number of gun deaths in one year per 100,000 population (sum of gun homicides/100,000, gun suicides/100,000, unintentional gun deaths/100,000, and undetermined gun deaths/100,000).
Gun homicides/100,000: number of gun homicides in one year per 100,000 population. Includes justifiable gun homicides as well as unjustified gun homicides.
Gun suicides/100,000: number of gun suicides in one year per 100,000 population.
Unintentional gun deaths/100,000: number of unintentional gun deaths in one year per 100,000 population.
Undetermined gun deaths/100,000: number of gun deaths in one year per 100,000 population that could not be categorized as homicide, suicide, or unintentional.
C. Categorical variables with values calculated from the variables above:
Relative guns per person
  higher – Guns/100 is greater than the median of 10.7 guns/100 population
  lower – Guns/100 is less than or equal to the median of 10.7 guns/100 population
Relative total gun death rate
  higher – Total gun deaths/100,000 is greater than the median of 1.83 total gun deaths/100,000 population
  lower – Total gun deaths/100,000 is less than or equal to the median of 1.83 total gun deaths/100,000 population
Relative gun homicide rate
  higher – Gun homicides/100,000 is greater than the median of 0.36 gun homicides/100,000 population
  lower – Gun homicides/100,000 is less than or equal to the median of 0.36 gun homicides/100,000 population
Relative gun suicide rate
  higher – Gun suicides/100,000 is greater than the median of 0.81 gun suicides/100,000 population
  lower – Gun suicides/100,000 is less than or equal to the median of 0.81 gun suicides/100,000 population
Relative unintentional gun death rate
  higher – Unintentional gun deaths/100,000 is greater than the median of 0.06 unintentional gun deaths/100,000 population
  lower – Unintentional gun deaths/100,000 is less than or equal to the median of 0.06 unintentional gun deaths/100,000 population
anderson_instructor | Sep 1, 2017 | 5KB | 3190
This data set describes the survival status of individual passengers on the Titanic. The principal source for data about Titanic passengers is Encyclopedia Titanica. The data set used here was begun by a variety of researchers. One of the original sources is Eaton & Haas (1994) Titanic: Triumph and Tragedy, Patrick Stephens Ltd, which includes a passenger list created by many researchers and edited by Michael A. Findlay.
anderson_instructor | Sep 1, 2017 | 66KB | 490
The data in this table were collected from some schools that award four-year degrees for use in comparing the student-to-teacher ratio between private colleges and public colleges. The data are for the 2004-2005 academic year; 85 private colleges and 57 state-supported (public) colleges were sampled. Each ratio was rounded to the nearest whole number for simplicity (from description on pp. 35-36 of Introductory Statistics: exploring the world through data). There is no assumption that the numbers of public and private schools in the sample are proportionate with the numbers in the population. The data are from the 2006 World Almanac and Book of Facts, and are assumed to have been copied into the data set by the textbook authors, Robert Gould and Colleen Ryan, or by the publisher.
anderson_instructor | Sep 1, 2017 | 8KB | 323
This data set includes data on 671 infants who had a very low birth weight, defined as being less than 1600 grams. The data were collected between 1981 and 1987 at Duke University Medical Center by Dr. Michael O'Shea, now of Bowman Gray Medical Center. This project was funded by a Clinical Epidemiology Grant from the Mellon Foundation. Note that ALL these infants had very low birth weights, so if you are comparing infants with a relatively high birth weight to infants with a relatively low birth weight, even the relatively high birth weights are very low compared to normal infant birth weights. Results of the study were published in M. O'Shea, D.A. Savitz, M.L. Hage, K.A. Feinstein: Prenatal events and the risk of subependymal / intraventricular haemorrhage in very low birth weight neonates. Paediatric and Perinatal Epidemiology 1992;6:352-362.
anderson_instructor | Sep 1, 2017 | 43KB | 552
These are data on 501 patients having either acute viral or acute bacterial meningitis, from a study done by A. Spanos, F.E. Harrell, and D.T. Durack at Duke University Medical Center, published in Differential diagnosis of acute meningitis: An analysis of the predictive value of initial observations 1989, JAMA 262: 2700-2707.
anderson_instructor | Sep 1, 2017 | 29KB | 254
This data set contains information about 391 subjects who were interviewed and examined in a study to understand the “prevalence of obesity, diabetes, and other cardiovascular risk factors in central Virginia for African Americans.” (J.B. Schorling) Only those subjects whose glycosylated hemoglobin level was measured are included. The data were used in a study by J.P. Willems, J.T. Saunders, D.E. Hunt, and J.B. Schorling, published in “Prevalence of coronary heart disease risk factors among rural blacks: A community-based study”, Southern Medical Journal 90:814-820; 1997, and in another study by J.B. Schorling, J. Roach, M. Siegel, N. Baturka, D. E. Hunt, T. M. Guterbock, and H. L. Stewart, published in “A trial of church-based smoking cessation interventions for rural African Americans”, Preventive Medicine 26:92-101; 1997. The data were collected by some of the researchers for these studies.
anderson_instructor | Sep 1, 2017 | 36KB | 1268
This data set records data about 189 mothers of newborns, along with data about their infant. The goal of this study was to identify risk factors associated with giving birth to a low birth weight baby (weighing less than 2500 grams). The data were collected at Baystate Medical Center, Springfield, MA, in 1986. [quoted from description at You must include the following citation in your report as the source of the data, or you will be in violation of copyright laws: Hosmer and Lemeshow (2000) Applied Logistic Regression: Second Edition. These data are copyrighted by John Wiley & Sons Inc. and must be acknowledged and used accordingly. Data were collected at Baystate Medical Center, Springfield, Massachusetts during 1986.
anderson_instructor | Sep 1, 2017 | 8KB | 1135
These are data on 82 passenger vehicles, from a study done by R.M. Heavenrich, J.D. Murrell, and K.H. Hellman, and published in Light Duty Automotive Technology and Fuel Economy Trends Through 1991, U.S. Environmental Protection Agency, 1991 (EPA/AA/CTAB/91-02). Based on data provided by the U.S. Environmental Protection Agency.
anderson_instructor | Sep 1, 2017 | 6KB | 480
Steven Broad, a mathematics faculty member at Saint Mary’s College in Notre Dame, IN, collected this data in 2012 after the mass shooting in Aurora, CO, to help inform discussions about gun control. He used the web site, which contains data about homicides from a number of countries, as his source of information to compile this data set for 38 countries. That web site references many other sources from which its data were obtained. There is no assumption that the 38 countries are selected at random, and says that gun-related information tends to be unreliable. Additional data on guns per 100 population in 2014 were added by Anne Anderson from
anderson_instructor | Aug 28, 2016 | 3KB | 572
Data on 312 patients with primary biliary cirrhosis, an autoimmune disease that slowly destroys the liver. Patients were randomly assigned to a treatment group that received the drug D-penicillamine or to a control group that received a placebo to see if D-penicillamine would increase their survival time or decrease the level of bilirubin (causes jaundice) in the blood. The liver normally controls the level of bilirubin, but a damaged liver is not able to maintain a healthy level of bilirubin in the blood. Cirrhosis of the liver can also be caused by alcoholism, but primary biliary cirrhosis is an autoimmune disease not caused by alcoholism. This data was published in Counting Processes & Survival Analysis, by T.R. Fleming and D.P. Harrington, 1991, New York: Wiley; Appendix D. The data were collected at the Mayo Clinic in Rochester, Minnesota.
anderson_instructor | Jan 28, 2015 | 23KB | 862
This data set describes some of the nutritional properties of 37 brands of hot dogs. The data set was collected by Craig Slinkman in 2011.
anderson_instructor | Jan 10, 2015 | 2KB | 145
This dataset is a random sample of 1000 seriously ill hospitalized patients from a famous study called “SUPPORT” (Study to Understand Prognoses Preferences Outcomes and Risks of Treatment). The original study included more than 10,000 patients from 5 U.S. hospitals. As the name suggests, the purpose of the study was to determine what factors affected or predicted outcomes, such as whether the patient died in the hospital or how long they remained in the hospital. The study was funded by the Robert Wood Johnson Foundation, and is still the largest study of end-of-life care available. It helped lead to the growth of hospice and palliative care for seriously ill hospitalized adults. The study ended in 1995. Many publications report results from this study, including the following: W.A. Knaus, F.E. Harrell, J Lynn, et al. (1995): The SUPPORT prognostic model: Objective estimates of survival for seriously ill hospitalized adults. Annals of Internal Medicine 122:191-203.
anderson_instructor | Jan 10, 2015 | 87KB | 194
This data set records information from 480 patients from the Worcester Heart Attack Community Surveillance Study; the subjects are patients who had suffered a heart attack (myocardial infarction) and were treated in one of the 16 acute general hospitals in the Worcester, Massachusetts Standard Metropolitan Statistical Area. The patients included here are from the start of the study in December 1975 until 1988, although the study continued until 2006. Results based on this study have been published in many reports, including Goldberg RJ, Gore JM, Alpert JS, Dalen JE. Recent changes in attack and survival rates of acute myocardial infarction (1975 through 1981). The Worcester Heart Attack Study. JAMA. 1986 May 23-30;255(20):2774-9. The data were published in Hosmer D.W. and Lemeshow, S. (1998) Applied Survival Analysis: Regression Modeling of Time to Event Data, John Wiley and Sons Inc., New York, NY
anderson_instructor | Jan 9, 2015 | 19KB | 1329
The data consist of 200 subjects selected at random from a larger study done in 1988 on the survival of patients following admission to an adult intensive care unit (ICU). Data were collected at Baystate Medical Center in Springfield, MA. The purpose of the study was to develop a model to predict the probability of survival until hospital discharge and to study the risk factors associated with ICU mortality. Results have been published in a number of articles, one of which is by S. Lemeshow, D. Teres, J.S. Avrunin, and H. Pastides, “Predicting the Outcome of Intensive Care Unit Patients”, Journal of American Statistical Association, 83, 348-356 (1988). You must include the following citation in your report as the source of the data set or you will be in violation of copyright laws: Hosmer, D.W., Lemeshow, S. and Sturdivant, R.X. (2013), Applied Logistic Regression: Third Edition, Section 1.6.1, p. 23. These data are copyrighted by John Wiley & Sons Inc., and must be acknowledged and used accordingly.
anderson_instructor | Jan 9, 2015 | 10KB | 518
This data set records data related to respiratory function and smoking for 654 children ages 3 to 19. “Forced Expiratory Volume” (FEV) is the amount of air a person can exhale in the first second of a forceful breath, and indicates the person’s level of respiratory function. The data is a subset of data collected by I. Tager, S. Weiss, A. Munoz, B. Rosner, and F. Speizer, and published in • Tager, I., Weiss, S., Munoz, A., Rosner, B., and Speizer, F. (1983), “Longitudinal Study of the Effects of Maternal Smoking on Pulmonary Function,” New England Journal of Medicine, 309(12), 699-703. • Tager, I., Weiss, S., Rosner, B., and Speizer, F. (1979), "Effect of Parental Cigarette Smoking on the Pulmonary Function of Children," American Journal of Epidemiology, 110(1), 15-26. These studies were among the first to show clear evidence of the impact of smoking and exposure to second-hand smoke on respiratory health in children. One of the authors, Bernard Rosner, included this particular subset of the data in his book (1999), Fundamentals of Biostatistics, 5th ed., Pacific Grove, CA: Duxbury.
anderson_instructor | Jan 9, 2015 | 31KB | 125
This data was collected by Dr. Waldon Garris, of the University of Virginia School of Medicine, while he was working in the Dominican Republic in 1997. The subjects are people who visited medical clinics in several villages.
anderson_instructor | Jan 9, 2015 | 11KB | 66
This dataset is from the Duke University Cardiovascular Disease Databank and consists of 3501 patients who were referred to Duke University Medical Center for chest pain.
anderson_instructor | Jan 9, 2015 | 199KB | 112
This data set contains data collected on 113 patients who were treated for shock by the Shock Research Unit at the University of Southern California, Los Angeles, California. The data set used here contains only the measurements made at the time the patient was admitted. The data set was published in A.A. Afifi, S.P. Azen, Statistical Analysis: A Computer Oriented Approach, 2nd edition, 1979, Academic Press, New York.
anderson_instructor | Jan 1, 2015 | 8KB | 192
This data set contains data on 502 male patients with prostate cancer. These data were collected by D.P. Byar and S.B. Green in 1980 and published in Bulletin Cancer, Paris 67:477-488.
anderson_instructor | Jan 1, 2015 | 68KB | 91
Female relatives of boys with Duchenne Muscular Dystrophy (DMD) are at risk of having a child with DMD. Today, it is relatively easy to do genetic testing for this gene, but 20 years ago, such testing was extremely expensive, and researchers were looking for some inexpensive test that would at least indicate whether a woman was at higher than normal risk, which would justify paying for the expensive genetic test. This dataset contains data on 208 women who are female relatives of boys with Duchenne Muscular Dystrophy. The researchers were trying to develop a screening test based on enzymes in blood serum. They measured 4 enzymes: creatine kinase and hemopexin are inexpensive to obtain, while the last two, pyruvate kinase and lactate dehydrogenase, are more expensive, although still much cheaper than genetic testing was at the time. Then the expensive genetic test was done to determine whether the woman was actually a carrier of the DMD gene.
anderson_instructor | Dec 29, 2014 | 6KB | 116
This data set contains facts about 77 companies selected from the Forbes 500 list for 1986. This is a 1/10 systematic sample from the alphabetical list of companies. The Forbes 500 includes all companies in the top 500 on any of the criteria, and thus has almost 800 companies in the list.
anderson_instructor | Dec 29, 2014 | 7KB | 53
This is data on 77 common breakfast cereals, collected from the nutrition labels on cereals at a large grocery store.
anderson_instructor | Dec 28, 2014 | 9KB | 258
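
The description of the "10 highest peaks in Rockies and Appalachians" data set above suggests comparing variation with the coefficient of variation, which is the standard deviation divided by the mean. A minimal sketch of that calculation follows, assuming the sample standard deviation is intended. The function name and the two lists are hypothetical placeholders, not values from the data set.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical placeholder numbers, NOT heights from the peaks data set.
group_a = [10.0, 12.0, 14.0]
group_b = [100.0, 102.0, 104.0]

cv_a = coefficient_of_variation(group_a)
cv_b = coefficient_of_variation(group_b)

# Both groups have the same standard deviation, but group_a varies more
# relative to its mean, so its coefficient of variation is larger.
print(f"CV of group_a: {cv_a:.4f}")
print(f"CV of group_b: {cv_b:.4f}")
```

Because the coefficient of variation is unitless, it allows a comparison of spread between groups whose typical values differ greatly, which is the situation the peaks description raises.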

