Data sets shared by StatCrunch members
Showing 1 to 15 of 89 data sets matching Sampling
Data Set/Description Owner Last edited Size Views
US Counties and Presidential Voting Dataset
Sampling unit: county. 3141 observations and 19 variables, maximum # NAs: 2956.
Name
county -- County
state -- State
msa -- Metropolitan Statistical Area
pmsa -- Primary Metropolitan Statistical Area
pop.density -- 1992 pop per 1990 miles^2
pop -- 1990 population
pop.change -- Percent population change 1980-1992
age6574 -- Percent age 65-74, 1990
age75 -- Percent age >= 75, 1990
crime -- serious crimes per 100,000 1991
college -- Percent with bachelor's degree or higher of those age>=25
income -- median family income, 1989 dollars
farm -- farm population, % of total, 1990
democrat -- Percent votes cast for democratic president
republican -- Percent votes cast for republican president
Perot -- Percent votes cast for Ross Perot
white -- Percent white, 1990
black -- Percent black, 1990
turnout -- 1992 votes for president / 1990 pop x 100
craig_slinkman Apr 12, 2011 755KB 2469
Woodbury Sampling
76 student responses to ... 1) Are you a smoker? 2) Do you own an iPhone? 3) How much did you spend on books and supplies for your courses this semester?
georgew49 Aug 15, 2012 1.024B 2029
Violent Crimes by State
State Rankings -- Statistical Abstract of the United States. VIOLENT CRIMES PER 100,000 POPULATION -- 2006 [When states share the same rank, the next lower rank is omitted. Because of rounded data, states may have identical values shown, but different ranks.]
Cautionary note about rankings: The ranks in some tables are based on estimates derived from a sample(s). Because of sampling and nonsampling errors associated with the estimates, the ranking of the estimates does not necessarily reflect the correct ranking of the unknown true values. Thus, caution should be used when making inferences or statements about the states' true values based on a ranking of the estimates. As an example, the estimated total (average, percent, ratio, etc.) for State A may be larger than the estimates for all other states. This does not necessarily mean that the true total (average, percent, ratio, etc.) for State A is larger than those for all other states. Such an inference typically depends on, among other factors, the size of the difference(s) between the estimates in question and the size of their associated standard errors.
In other tables, the ranks are based on a complete enumeration of the target population, or on complete administrative reporting from the population. In such cases, sampling is not used, and there is no sampling error component in the estimates. Even so, care should be taken when making inferences or statements based on the rankings. The table values may still exhibit nonsampling error originating from such sources as coverage problems (missing units or duplicates), nonresponse, misreporting, and others.
Last Revised: September 27, 2011 at 09:43:17 AM
phil_larson Jan 16, 2013 881B 3469
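The cautionary note's point about differences and standard errors can be made concrete with a rough sketch. All numbers below are hypothetical (they are not taken from this data set), and the calculation assumes the two state estimates are independent, which the source does not state.

```python
import math

# HYPOTHETICAL estimates and standard errors for two states (illustration only).
est_a, se_a = 520.0, 15.0
est_b, se_b = 500.0, 12.0

# Standard error of the difference, ASSUMING the two estimates are independent.
diff = est_a - est_b
se_diff = math.sqrt(se_a**2 + se_b**2)

# Ratio of the difference to its standard error; a small ratio means the
# observed ordering of A and B could easily be due to sampling error.
z = diff / se_diff
print(f"difference = {diff:.1f}, SE = {se_diff:.2f}, z = {z:.2f}")
```

Here State A ranks above State B on the estimates, but the difference is only about one standard error, so the ranking alone would not support a claim about the true values.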
Annual Movie Data 2008 Random Sampling.txt
This data is a random sampling of movies that played in theaters in 2008. It includes movies released in previous years that earned money during 2008. For example, a movie released over Thanksgiving in 2007 will most likely earn money in 2007 and 2008. Each box office year ends on the first Sunday of the following year. The next year starts the following day (Monday). For example, the "2004 box office year" ended on Sunday, January 2, 2005. Inflation-adjusted figures are based on ticket sale estimates, and may not be precise due to rounding errors.
wikipeterson Oct 7, 2009 8KB 512
Arlington Gasoline Retailers Sampling Frame.xls
This is a sampling frame of all gasoline retailers in Arlington, Texas, collected in the Spring semester of 2010. Note that you may need to drag the column lines in order to see the entire data fields.
craig_slinkman Apr 8, 2010 22KB 491
Annual Movie Data 2008 Random Sampling.txt
This chart ranks movies by the amount they earned during 2008.
wikipeterson Oct 14, 2009 6KB 436
Maria's Data Analysis
Maria's classroom data, to be used for the Sampling Variability Project
ninibb1 Jun 14, 2016 2KB 369
UTA Cola
This data set consists of 10,000 observations. The variable of interest is the actual number of fluid ounces in a 16 ounce bottle of UTA Cola. The population mean is 16 and the standard deviation is 0.1. This data is useful for demonstrating the concept of random sampling, sampling distributions, confidence intervals, and hypothesis tests.
craig_slinkman Mar 31, 2011 137KB 238
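As a rough illustration of how a data set like UTA Cola can demonstrate sampling distributions, the sketch below simulates a stand-in population instead of loading the actual file. The normal shape, the sample size of 25, and the 2000 repetitions are assumptions made for this example; only the population size, mean, and standard deviation come from the description.

```python
import random
import statistics

rng = random.Random(0)

# Stand-in population: 10,000 fill volumes with mean 16 and SD 0.1.
# The normal shape is an ASSUMPTION; the description does not state it.
population = [rng.gauss(16.0, 0.1) for _ in range(10_000)]

def sample_means(pop, n, reps, rng):
    """Draw `reps` simple random samples of size n and return their means."""
    return [statistics.fmean(rng.sample(pop, n)) for _ in range(reps)]

# n=25 and reps=2000 are arbitrary choices for illustration.
means = sample_means(population, n=25, reps=2000, rng=rng)

print("mean of sample means:", statistics.fmean(means))
print("SD of sample means:  ", statistics.stdev(means))
print("sigma / sqrt(n):     ", 0.1 / 25 ** 0.5)
```

The sample means should cluster around 16, with a spread close to 0.1 / sqrt(25) = 0.02, which is the behaviour confidence intervals and hypothesis tests rely on.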
Dividing City into blocks
Numbers shown represent house address numbers
niarah.brown0116 Jun 1, 2019 871B 43
Sampling Senators 115th
Name, State, Affiliation, & Age of members of the 115th Congress
pmontegary Mar 27, 2019 3KB 288
Simple Random Sample of n=30
Using the Age column, take a Simple Random Sample (SRS) of n=30. Show the sample and explain how you took the sample. Use StatCrunch sampling, Excel Analysis ToolPak sampling, or the Excel function RANDBETWEEN (using row numbers). Taking the first 30 rows, every 5th row, or any similar scheme is NOT random.
lethamfrancis Mar 31, 2014 631B 404
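For readers working outside StatCrunch or Excel, a simple random sample of row numbers can be sketched in Python as below. The function name and the row count of 100 are hypothetical (the listing does not give the number of rows); this is one possible approach, not the method the assignment prescribes.

```python
import random

def simple_random_sample(num_rows, n=30, seed=None):
    """Return n distinct 1-based row numbers, each subset equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no row is picked twice.
    return sorted(rng.sample(range(1, num_rows + 1), n))

# num_rows=100 is a HYPOTHETICAL size; use the actual row count of the data.
rows = simple_random_sample(num_rows=100, n=30, seed=1)
print(rows)
```

Unlike repeated RANDBETWEEN calls, which can return the same row number more than once, drawing without replacement guarantees 30 distinct rows.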
BSTAT 3321 Final Averages
Used to illustrate random sampling and the concept of sampling error.
craig_slinkman Mar 19, 2011 270B 144
Have COC Students Decided Their Major?
Data collected by method of convenience sampling.
aldusdean Oct 22, 2017 289B 68
This table lists the number of wins from playing the Let's Make a Deal applet 50 times with the strategy "stay with door 1 no matter what." These data were generated during Lab 2 (i.e., 1 entry per group) in STAT 215 at WVU and will be reused later in this course to illustrate sampling distribution concepts.
kjryan Sep 6, 2017 110B 50
Wolf River Pollution
Jaffe, Parker and Wilson (1982) have investigated the concentration of several hydrophobic organic substances (such as hexachlorobenzene, chlordane, heptachlor, aldrin, dieldrin, endrin) in the Wolf River in Tennessee. Measurements were taken downstream of an abandoned dump site that had previously been used by the pesticide industry to dispose of its waste products. It was expected that these hydrophobic substances might have a nonhomogeneous vertical distribution in the river because of differences in density between these compounds and water and because of the adsorption of these compounds on sediments, which could lead to higher concentrations on the bottom. It is important to check this hypothesis because the standard procedure of sampling at six-tenths of the depth could miss the bulk of these pollutants if the distribution were not uniform. Grab samples were taken with a La Motte-Vandorn water sampler of 1 litre capacity at various depths of the river. This sampler consists of a horizontal plexiglas tube of 7 centimetres diameter and a plunger on each side which shuts the sampler when the sampler is at the desired depth. Ten surface, 10 mid-depth and 10 bottom samples were collected, all within a relatively short period. Until they were analysed the samples were stored in 1-quart mason jars at low temperature. In the analysis of the samples, a 250-millilitre water sample was taken from each mason jar and was extracted with 1 millilitre of either hexanes or petroleum ether. A sample of the extract was then injected into a gas chromatograph and the output was compared against standards of known concentrations. The test procedure was repeated two more times, injecting different samples of the extract in the gas chromatograph. The average aldrin and hexachlorobenzene (HCB) concentrations (in nanograms per liter) in these 30 samples are given in the data.
jmanthey Apr 17, 2014 569B 444