Report Properties
Owner: mworkm86
Created: Mar 3, 2013
Share: yes
Views: 2403
Zestimate vs Final Selling Prices of Single Family Homes in Bolingbrook, IL/ Meagan Workman
This report compares the final selling prices with the “Zestimate” prices of four-bedroom single-family homes in the Bolingbrook, IL area that were sold between December 12, 2012 and January 15, 2013. The data shown below were extracted from www.zillow.com; according to Zillow, the January 15th sale was the most recent sale of a single-family home in the Bolingbrook area. The data include two outliers, which affected the linear regression results drastically.

Data set 1. Bolingbrook Single-Family Homes Sold Price Vs. Zes

Simple linear regression of the price sold on the Zestimate gives a linear correlation coefficient of r = -0.0636, whose absolute value is 0.0636. This is less than the critical value of 0.423 for a sample size of 22, so these numbers provide no evidence of a linear relation between the two variables. The coefficient of determination (R²) is 0.004045041, so only about 0.4% of the variability in selling price can be explained by the linear relation between the Zestimate and the final selling price, which is expected given that no linear relation was found. As stated above, this is due to the presence of outliers. The y-intercept is not interpreted, since a Zestimate of x = 0 does not make sense.
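The comparison of r with a critical value can be checked by hand. The sketch below is a rough check, not part of the StatCrunch output. It assumes the critical value 0.423 comes from a two-tailed test at the 5% level (the report does not state the level), and it hard-codes the standard t-table value 2.086 for 20 degrees of freedom. The function names are made up for illustration.

```python
import math

def t_stat_from_r(r, n):
    """Test statistic for H0: no linear correlation, given r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def critical_r(t_crit, df):
    """Convert a critical t value into the equivalent critical value of |r|."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

r, n = -0.0636, 22          # values reported in Result 1
T_CRIT_DF20 = 2.086         # ASSUMPTION: two-tailed 5% critical t for 20 df (t table)

t = t_stat_from_r(r, n)
r_crit = critical_r(T_CRIT_DF20, n - 2)
r_squared_percent = 0.004045041 * 100   # R-sq expressed as a percentage

print(f"t statistic:        {t:.4f}")
print(f"critical |r|:       {r_crit:.3f}")
print(f"|r| exceeds it:     {abs(r) > r_crit}")
print(f"R-sq as a percent:  {r_squared_percent:.2f}%")
```

Under these assumptions the t statistic agrees with the slope T-Stat in the parameter table, and the critical value rounds to the 0.423 quoted above.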

Result 1: Simple Linear Regression Selling Price Vs. Zestimate
Simple linear regression results:
Dependent Variable: Price Sold
Independent Variable: Zestimate
Price Sold = 293511.7 - 0.18534933 Zestimate
Sample size: 22
R (correlation coefficient) = -0.0636
R-sq = 0.004045041
Estimate of error standard deviation: 144607.61

Parameter estimates:
Parameter   Estimate      Std. Err.    Alternative   DF   T-Stat        P-Value
Intercept   293511.7      155140.75    ≠ 0           20   1.8919059     0.0731
Slope       -0.18534933   0.65033096   ≠ 0           20   -0.28500772   0.7786


Analysis of variance table for regression model:
Source   DF   SS              MS             F-stat        P-value
Model    1    1.69861709E9    1.69861709E9   0.081229396   0.7786
Error    20   4.18227192E11   2.091136E10
Total    21   4.19925819E11


Residuals stored in new column, Residuals.
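The entries in the analysis of variance table are tied together by simple arithmetic, which the following sketch reproduces from the sums of squares reported above. It is only a consistency check on the printed output; the variable names are chosen here for illustration.

```python
# Values copied from the ANOVA table in Result 1
ss_model = 1.69861709e9
ss_error = 4.18227192e11
ss_total = 4.19925819e11
df_model, df_error = 1, 20
slope_t = -0.28500772            # T-Stat for the slope, from the parameter table

ms_model = ss_model / df_model   # mean square = sum of squares / degrees of freedom
ms_error = ss_error / df_error
f_stat = ms_model / ms_error     # F statistic for the regression
r_sq = ss_model / ss_total       # share of total variation explained by the model

print(f"MS error: {ms_error:.6e}")
print(f"F-stat:   {f_stat:.6f}")
print(f"R-sq:     {r_sq:.9f}")
print(f"t^2:      {slope_t ** 2:.6f}")   # equals F in simple linear regression
```

In simple linear regression the F statistic is the square of the slope's t statistic, which is why both share the P-value 0.7786.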

The scatter plot of Zestimate vs selling price, however, shows a positive linear pattern with apparent outliers, which suggests that the outliers are the reason the regression results show no linear correlation. The boxplot of Zestimate and selling price confirms the presence of outliers, the most noticeable of which is a home that sold for $840,000 even though its Zestimate was only $148,904! Data entry error? One would think so...

Result 2: Scatter Plot Bolingbrook Single Family Homes Zestimate vs. Final Selling Price

Result 3: Boxplot Bolingbrook Zillow Data Zestimate and Final Selling Price

The following data set is the original data set minus the one major outlier, where the selling price was $840,000 and the Zestimate was $148,904.

Data set 2. Bolingbrook Single-Family Homes Sold Price Vs. Zes

With the outlier removed, simple linear regression of the price sold on the Zestimate gives a linear correlation coefficient of r = 0.8981, which is greater than the critical value of 0.433 for a sample size of 21, so according to these numbers, a positive linear relation exists between the two variables. The coefficient of determination (R²) is 0.80655575, so 80.7% of the variability in selling price can be explained by the linear relation between the Zestimate and the final selling price; a BIG difference compared to the first set of results with the outlier! The scatter plot and least-squares regression line show a discernible pattern of positive correlation between the Zestimate and Price Sold, as would be expected from the results mentioned above.

Result 4: Simple Linear Regression Zestimate vs Price Sold #2
Simple linear regression results:
Dependent Variable: Price Sold
Independent Variable: Zestimate
Price Sold = -23693.863 + 1.0333966 Zestimate
Sample size: 21
R (correlation coefficient) = 0.8981
R-sq = 0.80655575
Estimate of error standard deviation: 23764.312

Parameter estimates:
Parameter   Estimate     Std. Err.    Alternative   DF   T-Stat     P-Value
Intercept   -23693.863   28097.266    ≠ 0           19   -0.84328   0.4096
Slope       1.0333966    0.11610501   ≠ 0           19   8.900535   <0.0001


Analysis of variance table for regression model:
Source   DF   SS              MS             F-stat     P-value
Model    1    4.473863E10     4.473863E10    79.21951   <0.0001
Error    19   1.07301079E10   5.6474253E8
Total    20   5.5468737E10


Residuals stored in new column, Residuals.
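To see how far the removed home falls from this second fitted line, the reported equation can be applied to its Zestimate. The sketch below uses only numbers quoted in this report; the function name predict_price is made up for illustration.

```python
# Coefficients and error standard deviation reported in Result 4
INTERCEPT = -23693.863
SLOPE = 1.0333966
ERROR_SD = 23764.312

def predict_price(zestimate):
    """Predicted selling price from the fitted line (outlier removed)."""
    return INTERCEPT + SLOPE * zestimate

# The home removed from the data set: Zestimate $148,904, sold for $840,000
outlier_zestimate = 148904
outlier_sold = 840000

predicted = predict_price(outlier_zestimate)
residual = outlier_sold - predicted      # observed minus predicted
sd_units = residual / ERROR_SD           # residual in units of the error SD

print(f"Predicted price: ${predicted:,.0f}")
print(f"Residual:        ${residual:,.0f}")
print(f"Residual / SD:   {sd_units:.1f}")
```

The recorded selling price sits roughly 30 error standard deviations above what the line predicts, which is consistent with treating that record as a likely data entry error.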

Result 5: Scatter Plot Zestimate vs Price Sold 2

Result 6: Simple Linear Regression Fitted Line Zestimate vs Price sold 2

The plot of the residuals against the Zestimate had no discernible pattern once the major outlier was removed, which supports the use of a linear model.
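For readers who want to see what a residual column contains, here is a minimal sketch of a least-squares fit and its residuals. The numbers are hypothetical values invented for illustration, not the Bolingbrook data, and the function names are not StatCrunch's.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residuals(xs, ys, intercept, slope):
    """Observed minus predicted value for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# HYPOTHETICAL data in dollars, made up for this example only
zestimates = [180000, 210000, 240000, 265000, 300000, 330000]
sold = [172000, 205000, 231000, 270000, 289000, 335000]

intercept, slope = fit_line(zestimates, sold)
res = residuals(zestimates, sold, intercept, slope)

for z, e in zip(zestimates, res):
    print(f"{z:>8,d} {e:>12,.0f}")
```

Plotting these residuals against the Zestimate and finding no curve or funnel shape is the informal check described above. By construction, least-squares residuals sum to zero, so the plot is centered on the horizontal axis.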

Result 7: Scatter Plot Zestimate vs Residuals   [Info]
Right click to copy

The data included in this report, and the statistics derived from them, show just how greatly one or two outliers can affect the entire linear model. While the "Zestimate" seems to work fairly well in a linear model when no outliers are present, in today's market, with the presence of short sales, foreclosures, and bank-owned properties, the Zestimate should only be used as a tool for evaluating the current market, and should not be relied upon too heavily.
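The effect of a single extreme point on the correlation coefficient can be reproduced on a small scale. The sketch below uses hypothetical numbers (in thousands of dollars) that only mimic the pattern in this report: a tight positive trend plus one record with a low estimate and a very high selling price. None of these values come from the Zillow data.

```python
import math

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# HYPOTHETICAL estimates and selling prices, $ thousands
clean_x = [150, 200, 250, 300, 350]
clean_y = [155, 195, 255, 290, 360]

r_clean = pearson_r(clean_x, clean_y)

# Add one made-up outlier: low estimate, very high recorded selling price
r_with_outlier = pearson_r(clean_x + [149], clean_y + [840])

print(f"r without outlier: {r_clean:.4f}")
print(f"r with outlier:    {r_with_outlier:.4f}")
```

With so few points, one extreme record is enough to wipe out a strong positive correlation, which is the same behavior seen between Result 1 and Result 4.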

 

 

 

 

 

 

 

Comments
By msullivan13803
Mar 11, 2013

Nicely done. Way to pick up on the likely data entry error.
By mworkm86
Mar 3, 2013

Thanks, Miranda and Allison! Miranda, I have no idea what that house was supposed to say, because there ARE no million dollar houses in Bolingbrook, and that one CLEARLY isn't! Allison, the Zestimate was for $140s, and it says it SOLD for $840k... not very likely around here! Terrible about those people w/the foreclosure, but apparently it happens quite often. As you said, it's sad! Thanks for the comments, ladies!
Meagan
By msoren77
Mar 3, 2013

What a great report, Meagen! While reading your report I wondered what the correlation would be without the outlier and there you have it! I cannot believe the Zestimate for the outlier home was so off. I, too, assume it must be an error in data entry. Maybe it's supposed to be $1.48 million, instead? Your report is a great illustration of how significantly outliers (and more specifically influential data points) can greatly affect the correlation. Way to go!- Miranda Sorensen
By ychick27
Mar 3, 2013

HELLO MEAGEN.
What a fantastic report! I only wish that I was in the market when the home for $840k sold for $145K! Wow! I am so interested to know if this was an error or a lucky purchase! I know home values are depreciating very quickly... but that is a steal! My friend had a home sell in her neighborhood for half of its value. The previous homeowners lost the home due to foreclosure. In anger, they turned and left the water on the last day they lived there. Consequently, a month later, a meter reader discovered the "watering hole" left from the previous homeowner! Sad! Thanks for sharing. Allison Lawson
