
StatCrunch ID: gailllange
Occupation: instructor
Organization: UNIV OF MAINE-FARMINGTON
Public profile for gailllange
 

Showing 1 to 23 of 23 data sets
Data Set/Description | Owner | Last edited | Size | Views
TamhaneEx11_2.txt
We want to predict college GPA of matriculating freshmen based on their college entrance verbal and mathematics test scores. Data for a random sample of 40 graduating seniors on their college GPA along with their college entrance verbal test score and mathematics test score expressed as percentiles are shown below.
gailllange | Dec 4, 2018 | 1.005B | 12
UMFNormalCommuteWeeklyMiles
Took a random sample of 30 students; repeated 40 times. Assumed normal with mean = 150 miles and sigma = 25 miles. Weekly commute distance in miles for commuter students.
gailllange | Oct 8, 2018 | 22KB | 16
TamhaneOlympicEx105.txt
gailllange | Nov 16, 2016 | 323B | 34
UMFDimensions.txt
A psychology grad student with a background in marketing was interested in the perception of consumers’ volume judgements of food packages. He noticed, for example, that the height of a container was a vital dimension that consumers appear to use as a simplifying visual heuristic to make a judgement about the volume of the container. As part of his project (the part you will work with), he wanted to know whether one could predict the maximum width of a food container based on its minimum width. He collected 27 typical food products (Skippy Peanut Butter, Coca Cola, Centrum Vitamins, for example). The dataset called UMFDimensions.txt has the variables we are interested in with names maxwidth and minwidth and the units for both are cm. You will build a linear regression model that would allow prediction of maxwidth of a food container from its minwidth.
gailllange | Nov 14, 2016 | 331B | 130
UMFAlps.txt
We wish to relate the barometric pressure (in inches of mercury) to the boiling point (in degrees Fahrenheit) of water in the Alps. We have a dataset called UMFAlps.txt and it has 17 rows. The variables are Pres and Temp. You are to build a linear regression model predicting Temp from Pres.
gailllange | Nov 14, 2016 | 267B | 51
UMFUnivGradRates.txt
Your dataset is called UMFUnivGradRates.txt. The variable names are medianSAT and GradRate. This data has 6-year graduation rate (%) and median SAT score for a random sample of the primarily undergraduate public universities and colleges in the United States with enrollments between 10,000 and 20,000. We took 15 rows of data. You are to investigate the relationship between graduation rate and the median SAT score. You are to build a linear regression model predicting graduation rate from the median SAT score.
gailllange | Nov 14, 2016 | 326B | 111
UMFSucrase.txt
Researchers measured the specific activity of the enzyme sucrase extracted from portions of the intestines of 24 patients who underwent an intestinal bypass. After the sections were extracted, they were homogenized and analyzed for enzyme activity. Two different methods can be used to measure the activity of sucrase: the homogenate method and the pellet method. The dataset UMFSucrase.txt has your variables named Pellet and Homogenate. You are to build a linear regression model predicting Homogenate from Pellet.
gailllange | Nov 14, 2016 | 666B | 46
UMFTransport.txt
The management science staff of a grocery products manufacturer is developing a linear programming model for the production and distribution of its cereal products. The model requires transportation costs for a very large number of origins and destinations. It is impractical to do the detailed tariff analysis for every possible combination, so a sample of 48 routes is selected. For each route, the mileage and the shipping rate (in dollars per 100 pounds) are found. The dataset UMFTransport.txt has variables Mileage and Rate. You are to build a linear regression model predicting Rate from Mileage.
gailllange | Nov 14, 2016 | 673B | 85
UMFCarbon.txt
Your dataset is called UMFCarbon.txt. Carbon aerosols have been identified as a contributing factor in a number of air quality problems. A number of efforts have been made to develop rapid and simple analytical methods to determine the elemental, organic, and total carbon content of aerosols. These techniques include solvent extraction and optical methods. Our study compares these two methods. Diesel vehicle exhaust particulate samples were collected on filters. We present our dataset of 25 rows that gives the amount of elemental carbon measured. The “x” variable is the extractable mass of elemental carbon as determined by solvent extraction. We simply call x = mass and the units are μg/cm². The “y” variable is called elemCarbon and is the amount of elemental carbon as determined by the thermal optical carbon analyzer, with units μg/cm². You are to build a linear regression model predicting elemCarbon from mass.
gailllange | Nov 14, 2016 | 268B | 110
UMFShark.txt
Physical characteristics of sharks are of interest to surfers, scuba divers, and to marine researchers. We would like to see if it is possible to estimate jaw width from body length. Jaw width is hard to measure whereas body length is easy to measure. Our dataset is UMFShark.txt and has results from 44 sharks. The variables are BodyLength and JawWidth. We have shark body length (feet) and shark jaw width (inches). You are to build a linear regression model predicting shark JawWidth from shark BodyLength.
gailllange | Nov 14, 2016 | 957B | 190
UMFPeanut.txt
Peanuts are especially susceptible to a mold that produces a mycotoxin called aflatoxin. We are given the dataset UMFPeanut.txt which has 34 rows and has variables Aflatoxin and NCPeanuts. The first variable, Aflatoxin, is the average level of aflatoxin in a mini-lot sample of 120 pounds of peanuts, ppb. The unit ppb means parts per billion. The second variable, NCPeanuts, is the percentage of noncontaminated peanuts in the batch. You are to build a linear regression model predicting NCPeanuts from Aflatoxin.
gailllange | Nov 14, 2016 | 783B | 94
UMFEColi.txt
The Petrifilm HEC test is a new microbial method for detection of E. coli. Up until now the HGMF (Hydrophobic Grid Membrane Filtration) method was used. HGMF is an elaborate laboratory-based procedure. HEC is easier and safer to use and it can be used in the field. Before using the HEC procedure, one has to compare readings from the HEC test to those from the HGMF test to obtain an equation that would relate the HEC reading to the HGMF reading. If the HEC test results were unrelated to the HGMF results, then the HEC procedure could not be used in the field. A study was conducted that would apply both procedures to artificially contaminated beef. Portions of beef trim were obtained from three Holstein cows that had tested negative for E. coli. Eighteen portions of beef trim were obtained from the cows and then contaminated with E. coli. The HEC and HGMF procedures were applied to a portion of each of the 18 samples. The two procedures yielded E. coli concentrations as given in the dataset UMFEColi.txt. The variables are HEC and HGMF. The units are log10 CFU/ml. Note that CFU means colony forming units. You are to build a linear regression model predicting HEC from HGMF.
gailllange | Nov 14, 2016 | 294B | 50
UMFOldFaithful.txt
The time between eruptions of the Old Faithful geyser in Yellowstone National Park is random but is related to the duration of the last eruption. Refer to the dataset UMFOldFaithful.txt which has 21 rows. The variables are LAST (duration of the last eruption) and NEXT (time between eruptions). The time units of both are minutes. You are to build a linear regression model predicting NEXT from LAST.
gailllange | Nov 14, 2016 | 310B | 92
UMFGrowthpH.txt
Forest scientists are concerned with the decline in forest growth throughout the world. One aspect of this decline is the possible effect of emissions from coal-fired power plants. In particular, the scientists are interested in the pH level of the soil and the resulting impact on tree growth retardation. The scientists measure various aspects of growth associated with trees in a specified region and the soil pH in the same region. The scientists then want to determine the impact on tree growth as the soil becomes more acidic. An index of growth retardation is constructed from various measurements taken on the trees, with a high value indicating greater retardation in tree growth. A lower value of soil pH indicates a more acidic soil. Twenty tree stands that are exposed to the emissions of a particular power plant are selected for study. The dataset UMFGrowthpH.txt contains the variables SoilpH and GrowRet. You are to build a linear regression model predicting GrowRet from SoilpH.
gailllange | Nov 14, 2016 | 391B | 188
UMFTractors.txt
We want to study the relationship of the cost of the maintenance of shipping tractors with the age of the tractor. Cost is in dollars and age is in years. The dataset UMFTractors.txt contains 17 rows and the variables are Age and SixMonthCost. You are to build a linear regression model predicting SixMonthCost from Age.
gailllange | Nov 14, 2016 | 306B | 41
UMFSteam.txt
We have a dataset called UMFSteam.txt that has 25 observations. These observations were taken at intervals from a steam plant at a large industrial concern. The variables are Steam and AtmTemp. Here Steam is pounds of steam used monthly. The variable AtmTemp is the average atmospheric temperature during the month in degrees Fahrenheit. You are to build a linear regression model predicting Steam from AtmTemp.
gailllange | Nov 14, 2016 | 580B | 25
TamhaneOlympicEx10_5.txt
gailllange | Nov 13, 2016 | 324B | 21
RegressionExample1
gailllange | Nov 4, 2016 | 49B | 415
GailDataMidrangeFranchise.txt
gailllange | Aug 17, 2016 | 304B | 177
GailDataBudgetFranchise.txt
gailllange | Aug 17, 2016 | 326B | 180
GailDataFirstClassFranchise.txt
gailllange | Aug 17, 2016 | 184B | 151
GailDataAluminumContamination.txt
gailllange | Aug 17, 2016 | 384B | 248
TamhaneEx4_12.txt
Daily rainfall in mm. over a 47-year period in Turramurra, Sydney, Australia. For each year, the day with the greatest rainfall was recorded.
gailllange | Jul 27, 2016 | 295B | 69
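
Many of the UMF data sets above ask for a linear regression model predicting one variable from another (for example, Temp from Pres, or JawWidth from BodyLength). A minimal sketch of such a fit by ordinary least squares is given below in Python. It is only an illustration, not the StatCrunch procedure itself: the function names fit_line and predict are hypothetical, and the demo values are made up and are not taken from any data set listed here.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    x and y are equal-length sequences of numbers (one predictor,
    one response), as in the two-column data sets described above.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted response at x_new from the fitted line."""
    return intercept + slope * x_new


# Made-up placeholder values (NOT from any listed data set).
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_line(x_demo, y_demo)
y_hat = predict(b0, b1, 5.0)
```

To use it with one of the listed files, the two columns (say Pres and Temp) would first have to be read into two lists; the file layout is not stated on this page, so that step is left out.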

 
