Data sets shared by StatCrunch members
Showing 1 to 15 of 65 data sets matching factor
Data Set/Description Owner Last edited Size Views
Top Rated Jobs 2014
This data is gathered from careercast.com and is available in its original form at the source listed above. The dataset was originally created by Keisha Brown from Georgia Perimeter College.

Column: Description
Ranking: Ranking from 0 to 200 based on the combined “Overall Rating”.
Job: Title for the job.
Median Annual Income: Based on Bureau of Labor Statistics.
Overall Rating: Combined rating based on income, stress, hiring outlook, and work environment. The lower the rating, the better rated the job.
Stress Rating: A rating from 1 to 200 estimating the overall stress level from the job. This essentially is a ranking with 1 being the least stressful job and 200 being the most stressful job.
Hiring Outlook Rating: A rating from 1 to 200 estimating the hiring outlook for the job. This essentially is a ranking with 1 being the best hiring outlook and 200 being the worst hiring outlook.
Work Environment Rating: A rating from 1 to 200 estimating the quality of the work environment for the job. This essentially is a ranking with 1 being the best work environment and 200 being the worst work environment.
statcrunchhelp Mar 14, 2016 9KB 2986
Guns Ownership and Deaths by Firearms by State

The composite of two datasets from StateMaster.com, this dataset shows the percentage of survey respondents* who indicated that there is a firearm in the home [http://www.washingtonpost.com/wp-srv/health/interactives/guns/ownership.html] and the number of deaths by firearms per 100,000 population (most recent) [http://www.statemaster.com/graph/cri_mur_wit_fir-death-rate-per-100-000].

*In 2001 the Behavioral Risk Factor Surveillance System (BRFSS) in North Carolina surveyed 201,881 respondents nationwide, asking them, "Are any firearms now kept in or around your home? Include those kept in a garage, outdoor storage area, car, truck, or other motor vehicle."

mshelly33702 Aug 29, 2010 2KB 4281
Diamond Ring Prices.xls
The source of the data is a full page advertisement placed in the Straits Times newspaper issue of February 29, 1992, by a Singapore-based retailer of diamond jewelry. The advertisement contained pictures of diamond rings and listed their prices, diamond content, and gold purity. Only 20K ladies' rings, each mounted with a single diamond stone, were considered for this study. 20K rings are made with gold of 20 carat purity. (Pure gold is rated as 24K.) There were 48 such rings of varying designs. The weights of the diamond stones ranged from 0.12 to 0.35 carats (a one carat diamond stone weighs 0.2 gram) and were priced between $223 and $1086. The jewelry store adopted a fixed-price policy.

How Is Jewelry Priced?
In Singapore, the pricing of gold jewelry is simple. The price equals the current market value of the gold content (i.e., weight times the going rate per gram of gold) plus a craftsmanship fee. However, the pricing of other jewelry like diamond rings is more complicated because they are not as standardized as gold jewelry. The price of diamond jewelry depends on the four C's: caratage, cut, colour, and clarity of the diamond stone. A good cut gives a diamond more sparkle. Colourless diamonds are the most prized. A flawless diamond has maximum clarity because the passage of light is unimpeded through the stone. Cut, colour, and clarity are subjective factors and are very hard for the layman to gauge.
craig_slinkman Apr 22, 2010 586B 2842
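
The gold-jewelry pricing rule described in the Diamond Ring Prices entry above (weight times the going rate per gram, plus a craftsmanship fee) and its carat-to-gram conversion can be sketched as follows. This is only a minimal illustration: the function names and the example numbers are hypothetical and do not come from the data set.

```python
CARAT_IN_GRAMS = 0.2  # stated in the description: a one carat stone weighs 0.2 gram


def gold_jewelry_price(weight_grams, rate_per_gram, craftsmanship_fee):
    """Price = market value of the gold content + craftsmanship fee."""
    return weight_grams * rate_per_gram + craftsmanship_fee


def carats_to_grams(carats):
    """Convert a diamond weight in carats to grams."""
    return carats * CARAT_IN_GRAMS


# Hypothetical inputs, chosen only to show the arithmetic.
print(gold_jewelry_price(weight_grams=5.0, rate_per_gram=20.0, craftsmanship_fee=30.0))  # 130.0
print(carats_to_grams(0.35))  # roughly 0.07 g, the heaviest stone in the study
```

No comparable closed-form rule is given for diamond rings, which is why the description treats their pricing as a modelling question.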
Seating Choice versus GPA (For 3 rows, with Text and Indicator Columns)
This dataset contains hypothetical (I believe) data on GPA for students who sit in the front, middle, and back rows of a classroom, as well as a hypothetical gender variable. The data are shown using both text variables (e.g., "front" and "middle") and 0/1 indicator variables for the row and gender variables. This dataset is useful for demonstrating the different ways that StatCrunch can compare means based on two factors: (a) the text factor columns can be used in a two-way ANOVA; and (b) the 0/1 indicator columns can be used in multiple regression. (Because of StatCrunch's current limitation on equal cells, the 0/1 variables only use the first and middle rows.) Both procedures give the same p-value and the same conclusion (as long as the interaction term is centered), thus highlighting the similarity of statistical procedures and StatCrunch's flexibility.
bartonpoulson Apr 7, 2010 1KB 5738
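
The link between comparing group means and regressing on 0/1 indicator columns, mentioned in the Seating Choice entry above, can be illustrated in a simplified one-factor case: with a single 0/1 indicator, the least-squares slope equals the difference between the two group means, and the intercept equals the mean of the group coded 0. The sketch below uses made-up GPA values and a hypothetical helper, `ols_slope_intercept`; it is not StatCrunch's procedure and it does not cover the two-factor case with interaction.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Simple least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data: 1 = sits in the front row, 0 = sits in the middle row.
front = [1, 1, 1, 0, 0, 0]
gpa = [3.6, 3.4, 3.8, 3.0, 2.8, 3.2]

slope, intercept = ols_slope_intercept(front, gpa)
front_mean = mean([g for g, f in zip(gpa, front) if f == 1])
middle_mean = mean([g for g, f in zip(gpa, front) if f == 0])

# The slope matches the difference in group means (up to rounding error).
print(slope, front_mean - middle_mean)
print(intercept, middle_mean)
```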
Final Stats Project
I set out to see if age or gender were factors in the average amount of exercise people participated in per week.
haleymcsweeney May 15, 2010 195B 1392
diamonds.csv
This is a very large data set showing various factors of over 50,000 diamonds including price, cut, color, clarity, etc.
price: price in US dollars ($326–$18,823)
carat: weight of the diamond (0.2–5.01)
cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)
color: diamond colour, from J (worst) to D (best)
clarity: a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x: length in mm (0–10.74)
y: width in mm (0–58.9)
z: depth in mm (0–31.8)
depth: total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)
table: width of top of diamond relative to widest point (43–95)
hbarker2 Feb 19, 2016 3MB 2189
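
The depth formula quoted in the diamonds.csv entry above can be sketched as a small function. The factor of 100 is an assumption inferred from the stated range (43–79), since the formula as written gives a ratio rather than a percentage. The zero check is also an addition, prompted by the listed minimum of 0 for x and y. The function name and the sample dimensions are hypothetical.

```python
def depth_percentage(x_mm, y_mm, z_mm):
    """Total depth percentage: 100 * z / mean(x, y) = 100 * 2 * z / (x + y)."""
    if x_mm + y_mm == 0:
        # The description lists 0 as the minimum for x and y, so guard the division.
        raise ValueError("x + y must be non-zero to compute depth")
    return 100.0 * 2.0 * z_mm / (x_mm + y_mm)


# Hypothetical stone: 4.0 mm long, 4.0 mm wide, 2.5 mm deep.
print(depth_percentage(4.0, 4.0, 2.5))  # 62.5
```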
Low Birth Weight Study
SOURCE: Hosmer and Lemeshow (2000) Applied Logistic Regression: Second Edition. Data were collected at Baystate Medical Center, Springfield, Massachusetts during 1986.

DESCRIPTIVE ABSTRACT: The goal of this study was to identify risk factors associated with giving birth to a low birth weight baby (weighing less than 2500 grams). Data were collected on 189 women, 59 of which had low birth weight babies and 130 of which had normal birth weight babies. Four variables which were thought to be of importance were age, weight of the subject at her last menstrual period, race, and the number of physician visits during the first trimester of pregnancy.

LIST OF VARIABLES:

Columns   Variable                                                Abbreviation
-----------------------------------------------------------------------------
2-4       Identification Code                                     ID
10        Low Birth Weight (0 = Birth Weight >= 2500g,            LOW
          1 = Birth Weight < 2500g)
17-18     Age of the Mother in Years                              AGE
23-25     Weight in Pounds at the Last Menstrual Period           LWT
32        Race (1 = White, 2 = Black, 3 = Other)                  RACE
40        Smoking Status During Pregnancy (1 = Yes, 0 = No)       SMOKE
48        History of Premature Labor (0 = None 1 = One, etc.)     PTL
55        History of Hypertension (1 = Yes, 0 = No)               HT
61        Presence of Uterine Irritability (1 = Yes, 0 = No)      UI
67        Number of Physician Visits During the First Trimester   FTV
          (0 = None, 1 = One, 2 = Two, etc.)
73-76     Birth Weight in Grams                                   BWT
-----------------------------------------------------------------------------

PEDAGOGICAL NOTES: These data have been used as an example of fitting a multiple logistic regression model.

STORY BEHIND THE DATA: Low birth weight is an outcome that has been of concern to physicians for years. This is due to the fact that infant mortality rates and birth defect rates are very high for low birth weight babies.
A woman's behavior during pregnancy (including diet, smoking habits, and receiving prenatal care) can greatly alter the chances of carrying the baby to term and, consequently, of delivering a baby of normal birth weight. The variables identified in the code sheet given in the table have been shown to be associated with low birth weight in the obstetrical literature. The goal of the current study was to ascertain if these variables were important in the population being served by the medical center where the data were collected. References: 1. Hosmer and Lemeshow, Applied Logistic Regression, Wiley, (1989).
wikipeterson Jul 23, 2012 6KB 7829
Violent Crimes by State
http://www.census.gov/statab/ranks/rank21.html
State Rankings -- Statistical Abstract of the United States
VIOLENT CRIMES PER 100,000 POPULATION -- 2006
[When states share the same rank, the next lower rank is omitted. Because of rounded data, states may have identical values shown, but different ranks.]

Cautionary note about rankings
The ranks in some tables are based on estimates derived from a sample(s). Because of sampling and nonsampling errors associated with the estimates, the ranking of the estimates does not necessarily reflect the correct ranking of the unknown true values. Thus, caution should be used when making inferences or statements about the states' true values based on a ranking of the estimates. As an example, the estimated total (average, percent, ratio, etc.) for State A may be larger than the estimates for all other states. This does not necessarily mean that the true total (average, percent, ratio, etc.) for State A is larger than those for all other states. Such an inference typically depends on, among other factors, the size of the difference(s) between the estimates in question, and the size of their associated standard errors.

In other tables, the ranks are based on a complete enumeration of the target population, or on complete administrative reporting from the population. In such cases, sampling is not used, and there is no sampling error component in the estimates. Care should still be taken when making inferences or statements based on the rankings. The table values may still exhibit nonsampling error originating from such sources as coverage problems (missing units or duplicates), nonresponse, misreporting, and others.

Last Revised: September 27, 2011 at 09:43:17 AM
phil_larson Jan 16, 2013 881B 3446
D1.6
Dataset: airline_costs.dat
Source: J.W. Proctor and J.S. Duncan (1954). "A Regression Analysis of Airline Costs," Journal of Air Law and Commerce, Vol.21, #3, pp.282-292.
Description: Regression relating Operating Costs per revenue ton-mile to 7 factors: length of flight, speed of plane, daily flight time per aircraft, population served, ton-mile load factor, available tons per aircraft mile, and firm's net assets. Regression based on natural logarithms of all factors, except load factor. Load factor and available tons (capacity) for Northeast Airlines were imputed from summary calculations.

Variables/columns
Airline 1-20
Length of flight (miles) 22-28
L_Group (inserted) Long (>175), Med (>60), Short (<69)
Speed of Plane (miles per hour) 30-36
Daily Flight Time per plane (hours) 38-44
Population served (1000s) 46-52
Total Operating Cost (cents per revenue ton-mile) 54-60
Revenue Tons per Aircraft mile 62-68
Ton-Mile load factor (proportion) 70-76
Available Capacity (Tons per mile) 78-84
Total Assets ($100,000s) 86-92
Investments and Special Funds ($100,000s) 94-100
Adjusted Assets ($100,000s) 102-108
housew1 Jul 3, 2019 2KB 75
Diamonds
This is a very large data set showing various factors of over 50,000 diamonds including price, cut, color, clarity, etc.
price: price in US dollars
carat: weight of the diamond
cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)
color: diamond colour, from J (worst) to D (best)
clarity: a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x: length in mm
y: width in mm
z: depth in mm
depth: total depth percentage = z / mean(x, y) = 2 * z / (x + y)
table: width of top of diamond relative to widest point
gjohnson151515 Jan 12, 2017 3MB 727
Toothless Residents in US by State
Adults aged 65+ who have had all their natural teeth extracted. NOTE: Data for Hawaii is not available. SOURCE: Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Data. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2004. Retrieved from StateMaster.com: Health Statistics > Oral health > Loss of natural teeth (most recent) by state.
bartonpoulson Mar 10, 2011 1KB 550
Marriage vs The Economy
Comparing numbers of marriages in the last 30 years to the following factors of the economy: GDP Growth, Unemployment rate, Median Hourly Wages, and Total National Student Aid and Loans
sma25908 Oct 24, 2018 1KB 573
Gun Ownership by State (2001)
In 2001 the Behavioral Risk Factor Surveillance System (BRFSS) in North Carolina surveyed 201,881 respondents nationwide, asking them, "Are any firearms now kept in or around your home? Include those kept in a garage, outdoor storage area, car, truck, or other motor vehicle."
mshelly33702 Aug 24, 2010 2KB 436
Power Plant Emissions by State
The Carbon Dioxide Emissions Factor is measured in pounds of carbon dioxide emissions per megawatthour. The composition columns show the percentage of the state’s total generation in 2004 by source. Carbon is measured in tons per capita (not carbon dioxide). The data are based on production, not consumption.
cdcummings12 May 13, 2009 1KB 313
statistics.ods
I began this survey to identify factors that may have been linked to students withdrawing from college courses during the Fall 2010 semester at Ventura College, once the withdrawal deadline had passed. Factors include gender, number of units, mode of transportation (whether public or personal), and employment status. My hypothesis was that students who meet one or more of the following criteria are more likely to withdraw from classes: students who are employed, who rely on public transportation, or who are full-time students. I collected the data for this research by randomly interviewing students from Ventura College.
vanessa.hofer978 Jan 6, 2011 1KB 363