Report Properties
Owner: ntorno8
Created: Mar 11, 2015
Transformations in Simple Linear Regression

StatCrunch's simple linear regression procedure now supports data transformations. This report demonstrates the feature with a regression of how well an individual's Age predicts the number of Facebook friends, using the survey data shown below.

Data set 1. Responses to Sullivan Statistics Survey

To begin, select the Stat > Regression > Simple Linear menu option.  In the resulting dialog window, select the Age column as the X variable, and the Facebook friends column as the Y variable.  Click Compute! to see the initial results shown below.

Result 1: Results before transformation
Simple linear regression results:
Dependent Variable: Facebook friends
Independent Variable: Age
Facebook friends = 537.17278 - 8.4196831 Age
Sample size: 153
R (correlation coefficient) = -0.42868834
R-sq = 0.18377369
Estimate of error standard deviation: 203.56936

Parameter estimates:
Parameter  Estimate    Std. Err.  Alternative  DF   T-Stat      P-Value
Intercept  537.17278   50.530318  ≠ 0          151  10.630703   <0.0001
Slope      -8.4196831  1.4440123  ≠ 0          151  -5.8307557  <0.0001

Analysis of variance table for regression model:
Source  DF   SS         MS         F-stat     P-value
Model   1    1408881.6  1408881.6  33.997712  <0.0001
Error   151  6257512.9  41440.483
Total   152  7666394.5
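
The summary values in Result 1 are linked by a few standard identities, so they can be cross-checked by hand. The short Python sketch below is not part of StatCrunch; it simply copies numbers from the output above and recomputes R-sq, the error standard deviation, the F statistic, and the slope's t statistic. The variable names are chosen here for illustration.

```python
import math

# Values copied from Result 1 (StatCrunch output above).
ss_model, ss_error, ss_total = 1408881.6, 6257512.9, 7666394.5
df_model, df_error = 1, 151
slope, slope_se = -8.4196831, 1.4440123

r_squared = ss_model / ss_total                  # share of variation explained by Age
ms_error = ss_error / df_error                   # mean squared error
error_sd = math.sqrt(ms_error)                   # estimate of error standard deviation
f_stat = (ss_model / df_model) / ms_error        # ANOVA F statistic
t_stat = slope / slope_se                        # t statistic for the slope
r = math.copysign(math.sqrt(r_squared), slope)   # r carries the sign of the slope

print(f"R-sq     = {r_squared:.6f}")
print(f"error sd = {error_sd:.4f}")
print(f"F        = {f_stat:.4f}")
print(f"t        = {t_stat:.4f} (t squared = {t_stat ** 2:.4f})")
print(f"R        = {r:.6f}")
```

In simple linear regression with one predictor, the F statistic equals the square of the slope's t statistic, which is why the two tests report the same p-value.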

For this regression the slope is negative with a p-value <0.0001 and an R-squared of 0.18377369.  This tells us that there is a negative linear association between an individual's age and the number of Facebook friends: each additional year of age lowers the predicted count by about 8.4 friends, and Age accounts for roughly 18% of the variation.  Click on the arrow in the bottom right corner to see the scatterplot shown below.

Result 2: Graph before transformation

Notice how this scatterplot displays a "fan" pattern around the fitted line: at age 20 there is a much wider range in the number of Facebook friends than at age 50 or age 60.  This unequal variance (heteroscedasticity) violates an assumption of the regression model.  One way to correct the problem is to transform the data.
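
To see why a log transformation can help with a fan pattern, here is a minimal Python sketch. It uses synthetic data, not the survey responses: the ages, the coefficients 6.8 and -0.05, and the noise level 0.5 are all assumptions made up for illustration. The noise is multiplicative on the original scale, so the spread of the residuals shrinks as the predicted count shrinks, while on the log scale the spread is roughly constant.

```python
import math
import random
import statistics

random.seed(0)

# Synthetic data (NOT the survey data): multiplicative noise produces a fan shape.
ages = [random.uniform(18, 65) for _ in range(2000)]
friends = [math.exp(6.8 - 0.05 * a + random.gauss(0, 0.5)) for a in ages]

def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def residual_spread(x, y, cutoff=40):
    """Standard deviation of residuals for x < cutoff and for x >= cutoff."""
    b0, b1 = fit_line(x, y)
    residuals = [(xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y)]
    younger = [r for xi, r in residuals if xi < cutoff]
    older = [r for xi, r in residuals if xi >= cutoff]
    return statistics.pstdev(younger), statistics.pstdev(older)

raw_young, raw_old = residual_spread(ages, friends)
log_young, log_old = residual_spread(ages, [math.log(f) for f in friends])

print(f"original scale: younger/older spread ratio = {raw_young / raw_old:.2f}")
print(f"log scale:      younger/older spread ratio = {log_young / log_old:.2f}")
```

A ratio well above 1 on the original scale and close to 1 on the log scale is the numerical counterpart of the fan pattern disappearing in the scatterplot.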

Under the Options menu at the top left, click Edit to change the regression results.  Under the Transformation header select Natural logarithm: log(y) to compress the larger values of Facebook friends and stabilize the variance. Click Compute! to see the results shown below.

Result 3: Results after transformation
Simple linear regression results (w/ transformation):
Dependent Variable: log(Facebook friends)
Independent Variable: Age
log(Facebook friends) = 6.8115594 - 0.052285775 Age
Sample size: 153
R (correlation coefficient) = -0.52926331
R-sq = 0.28011965
Estimate of error standard deviation: 0.96160109

Parameter estimates:
Parameter  Estimate      Std. Err.     Alternative  DF   T-Stat      P-Value
Intercept  6.8115594     0.2386902     ≠ 0          151  28.53724    <0.0001
Slope      -0.052285775  0.0068210848  ≠ 0          151  -7.6653167  <0.0001

Analysis of variance table for regression model:
Source  DF   SS         MS          F-stat    P-value
Model   1    54.3313    54.3313     58.75708  <0.0001
Error   151  139.62618  0.92467666
Total   152  193.95748
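
Because the response in Result 3 is on the natural-log scale, the slope describes a multiplicative change rather than an additive one. The Python sketch below, which only reuses numbers from the output above, converts the slope into a per-year factor and rechecks R-sq from the ANOVA table. The variable names are illustrative.

```python
import math

# Coefficients copied from Result 3; "log" here is the natural logarithm.
intercept, slope = 6.8115594, -0.052285775

# One additional year of age multiplies the predicted count by exp(slope).
yearly_factor = math.exp(slope)
percent_change = (yearly_factor - 1) * 100

# R-sq on the log scale is SS(Model) / SS(Total) from the ANOVA table.
r_squared = 54.3313 / 193.95748

print(f"factor per year of age:  {yearly_factor:.4f}")
print(f"percent change per year: {percent_change:.1f}%")
print(f"R-sq (log scale):        {r_squared:.6f}")
```

In other words, the transformed model predicts roughly 5% fewer Facebook friends for each additional year of age.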

The new regression equation has log(Facebook friends) instead of Facebook friends as the dependent variable.  The slope remains negative with a low p-value, and the R-squared rises to 0.28011965, indicating a stronger linear fit on the log scale.  Click on the arrow in the bottom right corner to see the scatterplot of this regression model.

Result 4: Graph after transformation

The variance problem has been mitigated.  As Age increases, the points stay within a roughly consistent range above and below the red fitted line, so the transformed model appears to be homoscedastic.


The transformation feature offers one more option. Under the Options menu at the top left, click Edit. Under the Transformation header select the Use original units in graphs check box. Click Compute! and then click on the arrow in the bottom right corner to see the new graph shown below.

Result 5: Untransformed graph

This graph shows the same transformed equation, but with the Facebook friends variable in its original units.  Likewise, the red fitted line is back-transformed to match the original units, so it appears as a curve rather than a straight line.
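
The back-transformed line can be reproduced by exponentiating the fitted log-scale equation from Result 3. The following Python sketch is an illustration, not StatCrunch's own code; the function name predicted_friends and the list of ages are chosen here for the example.

```python
import math

# Coefficients copied from Result 3 (natural-log scale).
intercept, slope = 6.8115594, -0.052285775

def predicted_friends(age):
    """Back-transform the log-scale fitted value to a number of Facebook friends."""
    return math.exp(intercept + slope * age)

for age in (20, 30, 40, 50, 60):
    print(f"age {age}: about {predicted_friends(age):.0f} friends")

# Equal steps in age give equal ratios, not equal differences, so the line bends.
drop_20_to_30 = predicted_friends(20) - predicted_friends(30)
drop_50_to_60 = predicted_friends(50) - predicted_friends(60)
print(f"drop from 20 to 30: {drop_20_to_30:.0f}")
print(f"drop from 50 to 60: {drop_50_to_60:.0f}")
```

The predicted drop over ten years is larger at younger ages than at older ages, which is what gives the fitted line its curved shape in original units.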

 

HTML link:
<A href="https://www.statcrunch.com/5.0/viewreport.php?reportid=38449">Transformations in Simple Linear Regression</A>

Comments
By kellykarns
Jul 5, 2018

Thank you very much!! it was very helpful in explaining how to do this in statcrunch!!!!!!
