
Public profile for gailllange
Showing 1 to 23 of 23 data sets
Data Set/Description  Owner  Last edited  Size  Views
TamhaneEx11_2.txt
We want to predict college GPA of matriculating freshmen based on their college entrance verbal and mathematics test scores. Data for a random sample of 40 graduating seniors on their college GPA along with their college entrance verbal test score and mathematics test score expressed as percentiles are shown below.
gailllange  Dec 4, 2018  1.005B  12

UMFNormalCommuteWeeklyMiles
Took a random sample of 30 students; repeated 40 times. Assumed normal with mean = 150 miles and sigma = 25 miles. Weekly commute distance in miles for commuter students.
gailllange  Oct 8, 2018  22KB  16

TamhaneOlympicEx105.txt
gailllange  Nov 16, 2016  323B  34

UMFDimensions.txt
A psychology grad student with a background in marketing was interested in consumers’ perception of the volume of food packages. He noticed, for example, that the height of a container was a vital dimension that consumers appear to use as a simplifying visual heuristic to make a judgement about the volume of the container. As part of his project (the part you will work with), he wanted to know whether one could predict the maximum width of a food container based on its minimum width. He collected 27 typical food products (Skippy Peanut Butter, Coca Cola, Centrum Vitamins, for example). The dataset UMFDimensions.txt has the variables of interest, maxwidth and minwidth, both measured in cm. You are to build a linear regression model predicting the maxwidth of a food container from its minwidth.
gailllange  Nov 14, 2016  331B  130

UMFAlps.txt
We wish to relate the barometric pressure (in inches of mercury) to the boiling point (in degrees Fahrenheit) of water in the Alps. We have a dataset called UMFAlps.txt and it has 17 rows. The variables are Pres and Temp. You are to build a linear regression model predicting Temp from Pres.
gailllange  Nov 14, 2016  267B  51

UMFUnivGradRates.txt
Your dataset is called UMFUnivGradRates.txt. The variable names are medianSAT and GradRate. The data give the 6-year graduation rate (%) and median SAT score for a random sample of the primarily undergraduate public universities and colleges in the United States with enrollments between 10,000 and 20,000. We took 15 rows of data. You are to investigate the relationship between graduation rate and the median SAT score. You are to build a linear regression model predicting graduation rate from the median SAT score.
gailllange  Nov 14, 2016  326B  111

UMFSucrase.txt
Researchers measured the specific activity of the enzyme sucrase extracted from portions of the intestines of 24 patients who underwent an intestinal bypass. After the sections were extracted, they were homogenized and analyzed for enzyme activity. Two different methods can be used to measure the activity of sucrase: the homogenate method and the pellet method. The dataset UMFSucrase.txt has the variables Pellet and Homogenate. You are to build a linear regression model predicting Homogenate from Pellet.
gailllange  Nov 14, 2016  666B  46

UMFTransport.txt
The management science staff of a grocery products manufacturer is developing a linear programming model for the production and distribution of its cereal products. The model requires transportation costs for a very large number of origins and destinations. It is impractical to do the detailed tariff analysis for every possible combination, so a sample of 48 routes is selected. For each route, the mileage and the shipping rate (in dollars per 100 pounds) are found. The dataset UMFTransport.txt has variables Mileage and Rate. You are to build a linear regression model predicting Rate from Mileage.
gailllange  Nov 14, 2016  673B  85

UMFCarbon.txt
Your dataset is called UMFCarbon.txt. Carbon aerosols have been identified as a contributing factor in a number of air quality problems. A number of efforts have been made to develop rapid and simple analytical methods to determine the elemental, organic, and total carbon content of aerosols. These techniques include solvent extraction and optical methods. Our study compares these two methods. Diesel vehicle exhaust particulate samples were collected on filters. We present our dataset of 25 rows that gives the amount of elemental carbon measured. The “x” variable is the extractable mass of elemental carbon as determined by solvent extraction. We simply call x = mass and the units are μg/cm². The “y” variable, called elemCarbon, is the amount of elemental carbon as determined by the thermal optical carbon analyzer, with units μg/cm². You are to build a linear regression model predicting elemCarbon from mass.
gailllange  Nov 14, 2016  268B  110

UMFShark.txt
Physical characteristics of sharks are of interest to surfers, scuba divers, and to marine researchers. We would like to see if it is possible to estimate jaw width from body length. Jaw width is hard to measure whereas body length is easy to measure. Our dataset is UMFShark.txt and has results from 44 sharks. The variables are BodyLength and JawWidth. We have shark body length (feet) and shark jaw width (inches). You are to build a linear regression model predicting shark JawWidth from shark BodyLength.
gailllange  Nov 14, 2016  957B  190

UMFPeanut.txt
Peanuts are especially susceptible to a mold that produces a mycotoxin called aflatoxin. We are given the dataset UMFPeanut.txt which has 34 rows and has variables Aflatoxin and NCPeanuts. The first variable, Aflatoxin, is the average level of aflatoxin, in ppb, in a minilot sample of 120 pounds of peanuts. The unit ppb means parts per billion. The second variable, NCPeanuts, is the percentage of noncontaminated peanuts in the batch. You are to build a linear regression model predicting NCPeanuts from Aflatoxin.
gailllange  Nov 14, 2016  783B  94

UMFEColi.txt
The Petrifilm HEC test is a new microbial method for detection of E. coli. Up until now the HGMF (Hydrophobic Grid Membrane Filtration) method was used. HGMF is an elaborate laboratory-based procedure. HEC is easier and safer to use and it can be used in the field. Before using the HEC procedure, one has to compare readings from the HEC test to those from the HGMF test to obtain an equation that would relate the HEC reading to the HGMF reading. If the HEC test results were unrelated to the HGMF results, then the HEC procedure could not be used in the field. A study was conducted that would apply both procedures to artificially contaminated beef. Portions of beef trim were obtained from three Holstein cows that had tested negative for E. coli. Eighteen portions of beef trim were obtained from the cows and then contaminated with E. coli. The HEC and HGMF procedures were applied to a portion of each of the 18 samples. The two procedures yielded E. coli concentrations as given in the dataset UMFEColi.txt. The variables are HEC and HGMF. The units are log10 CFU/ml. Note that CFU means colony forming units. You are to build a linear regression model predicting HEC from HGMF.
gailllange  Nov 14, 2016  294B  50

UMFOldFaithful.txt
The time between eruptions of the Old Faithful geyser in Yellowstone National Park is random but is related to the duration of the last eruption. Refer to the dataset UMFOldFaithful.txt which has 21 rows. The variables are LAST (duration of the last eruption) and NEXT (time between eruptions). The time units of both are minutes. You are to build a linear regression model predicting NEXT from LAST.
gailllange  Nov 14, 2016  310B  92

UMFGrowthpH.txt
Forest scientists are concerned with the decline in forest growth throughout the world. One aspect of this decline is the possible effect of emissions from coal-fired power plants. In particular, the scientists are interested in the pH level of the soil and the resulting impact on tree growth retardation. The scientists measure various aspects of growth associated with trees in a specified region and the soil pH in the same region. The scientists then want to determine the impact on tree growth as the soil becomes more acidic. An index of growth retardation is constructed from various measurements taken on the trees, with a high value indicating greater retardation in tree growth. A lower value of soil pH indicates a more acidic soil. Twenty tree stands which are exposed to a particular power plant's emissions are selected for study. The dataset UMFGrowthpH.txt contains the variables SoilpH and GrowRet. You are to build a linear regression model predicting GrowRet from SoilpH.
gailllange  Nov 14, 2016  391B  188

UMFTractors.txt
We want to study the relationship of the cost of the maintenance of shipping tractors with the age of the tractor. Cost is in dollars and age is in years. The dataset UMFTractors.txt contains 17 rows and the variables are Age and SixMonthCost. You are to build a linear regression model predicting SixMonthCost from Age.
gailllange  Nov 14, 2016  306B  41

UMFSteam.txt
We have a dataset called UMFSteam.txt that has 25 observations. These observations were taken at intervals from a steam plant at a large industrial concern. The variables are Steam and AtmTemp. Here Steam is pounds of steam used monthly. The variable AtmTemp is the average atmospheric temperature during the month in degrees Fahrenheit. You are to build a linear regression model predicting Steam from AtmTemp.
gailllange  Nov 14, 2016  580B  25

TamhaneOlympicEx10_5.txt
gailllange  Nov 13, 2016  324B  21

RegressionExample1
gailllange  Nov 4, 2016  49B  415

GailDataMidrangeFranchise.txt
gailllange  Aug 17, 2016  304B  177

GailDataBudgetFranchise.txt
gailllange  Aug 17, 2016  326B  180

GailDataFirstClassFranchise.txt
gailllange  Aug 17, 2016  184B  151

GailDataAluminumContamination.txt
gailllange  Aug 17, 2016  384B  248

TamhaneEx4_12.txt
Daily rainfall in mm over a 47-year period in Turramurra, Sydney, Australia. For each year, the day with the greatest rainfall was recorded.
gailllange  Jul 27, 2016  295B  69
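
Most of the UMF data sets listed above ask for a simple linear regression of one variable on another. A minimal sketch of such a fit follows. It assumes, without confirmation from this listing, that each .txt file is whitespace-delimited with a header row of variable names. The function names `read_columns` and `fit_line` are hypothetical, and the numbers in the usage lines are made up for illustration; they are not taken from any of the files.

```python
def read_columns(text, x_name, y_name):
    """Parse whitespace-delimited text with a header row (an assumed layout).

    Returns two lists of floats: the x column and the y column.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    xi, yi = header.index(x_name), header.index(y_name)
    xs, ys = [], []
    for ln in lines[1:]:
        parts = ln.split()
        xs.append(float(parts[xi]))
        ys.append(float(parts[yi]))
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x.

    Returns (b0, b1, r_squared).
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x has no variation; slope is undefined")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # r_squared is undefined when y is constant; report NaN in that case.
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else float("nan")
    return b0, b1, r_squared


# Illustration with made-up values (NOT data from any listed file).
sample_text = "x y\n1 3\n2 5\n3 7\n"
xs, ys = read_columns(sample_text, "x", "y")
b0, b1, r_squared = fit_line(xs, ys)
print(b0, b1, r_squared)
```

For a real file, one would read its contents into a string and pass the variable names from the description, for example `read_columns(open("UMFAlps.txt").read(), "Pres", "Temp")`, provided the file really follows the assumed layout.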

