Data sets shared by StatCrunch members
Showing 1 to 15 of 198 data sets matching regression
Data on all residential home sales in Ames, Iowa between 2006 and 2010. The data set contains many explanatory variables on the quality and quantity of physical attributes of residential homes in Iowa sold between 2006 and 2010. Most of the variables describe information a typical home buyer would like to know about a property (square footage, number of bedrooms and bathrooms, size of lot, etc.). A detailed discussion of variables can be found in the original paper referenced below.
Source: De Cock D. 2011. Ames, Iowa: Alternative to the Boston Housing Data as an End of Semester Regression Project. Journal of Statistics Education; 19(3).
A data frame with 2930 observations and 82 variables. A description of all variables can be found at amstat.org.
|yes||lmcmath34||Mar 14, 2017||1MB||26|
Movie Budgets and Box Office Earnings (Updated Fall 2016)|
This data all comes from the following website, which tracks the financial performance of movies:
The “Budget”, “Domestic Gross”, and “Worldwide Gross” columns are each in millions of dollars.
|yes||ntorno8||Mar 14, 2017||266KB||1925|
Diamond Data MAT123 Lab|
Data to be used for MAT 123 Regression Lab. h/t Heather Barker (email@example.com)
|firstname.lastname@example.org||Mar 1, 2017||3MB||55|
Lung Capacity - Multiple Regression||email@example.com||Jan 13, 2017||490B||233|
Section E Cleaned Regression Data (cleaned again 11/30 10:57 PM)||yes||math.ralph||Nov 30, 2016||5KB||279|
Section N Cleaned Regression Data (cleaned 4th time, 11:30 AM 11/30)||yes||math.ralph||Nov 30, 2016||5KB||232|
stat regression greg danica||firstname.lastname@example.org||Nov 20, 2016||11KB||159|
Cleansed Regression Data D (Fall 2016)||email@example.com||Nov 19, 2016||5KB||347|
Roller Coasters Data|
This dataset looks at some of the roller coasters across the US and various other countries.
|Name||Name of roller coaster|
|Park||Amusement park for roller coaster|
|City||City for amusement park|
|Country||Country of the roller coaster. US: United States, MX: Mexico, CR: Costa Rica, GT: Guatemala, CO: Colombia, VE: Venezuela, BR: Brazil, AR: Argentina, CL: Chile, EQ: Ecuador, PE: Peru, F: France, D: Germany|
|Type||S: Steel, W: Wood|
|Constructor||Type of build for the roller coaster|
|Height||Height in meters|
|Speed||Speed in miles per hour (mph)|
|Length||Length in meters|
|Inversions||Yes if there are inversions, no if not|
|Duration||Duration of ride in seconds|
|Opened||Year it opened|
|Region||Geographic region for the roller coaster|
|yes||ntorno8||Sep 15, 2016||48KB||11436|
MLB Home Attendance vs. Runs Scored 2015|
This data comes from the 2015 baseball season. For each team, it tracks the number of home games, the total attendance at home games, the number of runs scored by the team, the number of runs scored against the team, the league the team plays in, and the number of wins the team recorded in the regular season.
|yes||frompearsonbooks||Jun 14, 2016||1KB||927|
Times World University Rankings (2011-2016)|
This data comes from the annual Times Higher Education rankings of universities across the world. The webpage for the Times 2016 rankings is listed above in the source.
The formula for the 2016 rankings is as follows:
30% for Teaching Rating
7.5% for International Outlook Rating
30% for Research Rating
30% for Citations Rating
2.5% for Industry Income Rating.
The “Total Score” from 2016 can be recreated using this formula.
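As a rough illustration of the formula above, the weighted sum can be sketched in Python. The dictionary keys follow the column names in the variable list for this data set; the function name `total_score` and the example ratings are hypothetical and are not taken from the data. Recreated scores may differ slightly from the published Total_Score if the published ratings were rounded.

```python
# Weights for the 2016 rankings, as listed in the formula above
# (expressed as fractions of the total score).
WEIGHTS = {
    "Teaching_Rating": 0.30,
    "Inter_Outlook_Rating": 0.075,
    "Research_Rating": 0.30,
    "Citations_Rating": 0.30,
    "Industry_Income_Rating": 0.025,
}


def total_score(ratings):
    """Return the weighted sum of the five component ratings (each 0-100)."""
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Hypothetical ratings for illustration only; not a row from the data set.
example = {
    "Teaching_Rating": 80.0,
    "Inter_Outlook_Rating": 60.0,
    "Research_Rating": 90.0,
    "Citations_Rating": 100.0,
    "Industry_Income_Rating": 40.0,
}

print(round(total_score(example), 1))  # 24 + 4.5 + 27 + 30 + 1 = 86.5
```

Because the five weights add up to 100%, the result stays on the same 0-100 scale as the component ratings.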
|World_Rank||University rank for a given year|
|University_Name||The name of the university|
|Country||Location of university|
|Teaching_Rating||Rating on a 0-100 scale of the quality of teaching at the university. This rating is based on the institution’s reputation for teaching, its student/staff ratio, its ratio of PhDs to undergraduate degrees awarded, and its ratio of institutional income to academic staff.|
|Inter_Outlook_Rating||Rating on a 0-100 scale of the international makeup of a university. This rating is based on the international student percentage, the international staff percentage, and the percentage of research papers from the university that include at least one international author.|
|Research_Rating||Rating on a 0-100 scale of the quality of research at the university. This rating is based on the university’s reputation, its ratio of research income to academic staff, and its production of scholarly papers.|
|Citations_Rating||Rating on a 0-100 scale based on the normalized average number of citations per paper from the university (how often research from the university is cited by other papers).|
|Industry_Income_Rating||Rating on a 0-100 scale grading how much companies are willing to invest in the university’s research. The rating is calculated from the research income from businesses per academic staff member.|
|Total_Score||The final score used to determine the university ranking, based on Teaching_Rating, Inter_Outlook_Rating, Research_Rating, Citations_Rating, and Industry_Income_Rating.|
|Num_Students||Total number of students in a given year|
|Student/Staff_Ratio||Number of students per academic staff member|
|%_Inter_Students||Percentage of the student body that comes from a foreign country|
|%_Female_Students||Percentage of the student body that is female|
|Year||Academic year that the ranking was released. For example, 2016 denotes the 2015-2016 academic year.|
|yes||statcrunchhelp||Apr 5, 2016||254KB||1590|
Top 100 Retailers 2015|
This dataset comes from the National Retail Federation and tracks the top retail chains in the US for 2015 based on their 2014 sales. The original data can be found at the webpage listed as the source. Note that the sales for these retailers include all sorts of avenues, including internet sales.
|yes||statcrunchhelp||Mar 14, 2016||7KB||2246|
California Home Prices, 2009|
This dataset is a collection of 2009 real estate listings from San Luis Obispo County, California, and some locations around it. The prices are the list prices at the time this dataset was created. For more information about this data, go to the website source listed above.
|yes||statcrunchhelp||Mar 11, 2016||46KB||1217|
National Longitudinal Youth Survey|
The Youth survey consists of a nationally representative sample of youths who were 14 to 20 years old as of December 31, 1999.
This dataset tracks the Age, Height (in inches), Weight (in pounds), Gender, and the self-reported multiple choice answers to "How would you describe your weight?" for the individuals.
|yes||statcrunchhelp||Mar 8, 2016||330KB||786|
Data taken from the Journal of Statistics Education online data archive. That archive in turn took the data from an article in the Journal of the American Medical Association (Mackowiak et al., "A Critical Appraisal of 98.6 Degrees F …", vol. 268, pp. 1578-80, 1992).
"Body Temp" is measured in degrees Fahrenheit.
"Heart rate" is the resting heart rate in beats per minute.
|yes||statcrunchhelp||Mar 8, 2016||2KB||2247|