StatCrunch's simple linear regression procedure recently gained the ability to transform data. This example demonstrates the feature with a regression of how well an individual's Age predicts the number of Facebook friends, using the survey data shown below.
To begin, select the Stat > Regression > Simple Linear menu option. In the resulting dialog window, select the Age column as the X variable, and the Facebook friends column as the Y variable. Click Compute! to see the initial results shown below.
Simple linear regression results:
Dependent Variable: Facebook friends
Independent Variable: Age
Facebook friends = 537.17278 - 8.4196831 Age
Sample size: 153
R (correlation coefficient) = -0.42868834
R-sq = 0.18377369
Estimate of error standard deviation: 203.56936
Parameter estimates:
Analysis of variance table for regression model:

For this regression the slope is negative with a p-value < 0.0001 and an R-squared of 0.18377369. This tells us that there is a statistically significant negative linear relationship between an individual’s age and the number of Facebook friends, although age explains only about 18% of the variation. Click on the arrow in the bottom right corner to see the scatterplot shown below.
Notice how this scatterplot displays a “fan” pattern around the fitted line: at age 20 there is a much wider range in the number of Facebook friends than at age 50 or age 60. One way to correct this problem of unequal variance (heteroscedasticity) is to transform the data.
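To make the untransformed fit concrete, the short Python sketch below evaluates the fitted equation at the three ages mentioned above and checks that R-squared is the square of the correlation coefficient. The coefficients are copied from the StatCrunch output; the function name `predict_friends` is only illustrative, and the slope and correlation are entered as negative because the text states that the slope is negative.

```python
# Coefficients copied from the StatCrunch output above.
INTERCEPT = 537.17278
SLOPE = -8.4196831  # negative, as stated in the discussion of the results

def predict_friends(age):
    """Predicted number of Facebook friends from the untransformed fit."""
    return INTERCEPT + SLOPE * age

for age in (20, 50, 60):
    print(age, round(predict_friends(age), 1))

# R-squared is the square of the correlation coefficient.
r = -0.42868834
print(round(r ** 2, 4))  # agrees with the reported 0.18377369 after rounding
```

Note that a single straight line gives one predicted value per age; it says nothing about how the spread around that prediction changes with age, which is the problem the transformation addresses.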
From the Options menu at the top left, click Edit to change the regression settings. Under the Transformation header, select Natural logarithm: log(y) to stabilize the variance of the number of Facebook friends. Click Compute! to see the results shown below.
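For readers curious about what this option amounts to outside StatCrunch, here is a minimal Python sketch: take the natural log of the response, then fit an ordinary least squares line as usual. This is not StatCrunch's code, the helper `fit_line` is a hypothetical name, and the six data points are made-up illustrative values, NOT the survey data used in this example.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical values for illustration only, NOT the survey data.
ages = [18, 22, 30, 41, 55, 63]
friends = [650, 410, 380, 150, 90, 35]

# Transform the response with the natural log, then fit a straight line.
log_friends = [math.log(y) for y in friends]
b0, b1 = fit_line(ages, log_friends)
print(f"log(friends) = {b0:.4f} + ({b1:.4f}) * age")
```

The X variable is left alone; only Y is transformed, which matches the log(y) choice in the dialog.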
Simple linear regression results (w/ transformation):
Dependent Variable: log(Facebook friends)
Independent Variable: Age
log(Facebook friends) = 6.8115594 - 0.052285775 Age
Sample size: 153
R (correlation coefficient) = -0.52926331
R-sq = 0.28011965
Estimate of error standard deviation: 0.96160109
Parameter estimates:
Analysis of variance table for regression model:

The new regression equation has log(Facebook friends) instead of Facebook friends as the dependent variable. The slope remains negative with a low p-value, and the R-squared rises to 0.28011965, suggesting a better fit on the log scale. Click on the arrow in the bottom right corner to see the scatterplot of this regression model.
The variance problem has been mitigated. As Age increases, the points fall in a roughly consistent range above and below the red fitted line, so the residuals appear homoscedastic in the transformed model.
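Because the response is now on the log scale, the slope no longer describes a fixed number of friends per year. Instead, each additional year of age multiplies the predicted number of friends by a constant factor. The sketch below works this out from the slope reported in the transformed output; the variable names are illustrative.

```python
import math

LOG_SLOPE = -0.052285775  # slope from the log(y) regression above

# On the original scale, each additional year of age multiplies
# the predicted value by exp(slope).
per_year_factor = math.exp(LOG_SLOPE)
percent_change = (per_year_factor - 1) * 100

print(f"multiplier per year: {per_year_factor:.3f}")
print(f"percent change per year: {percent_change:.1f}%")
```

In other words, the transformed model describes a decline of roughly 5% in predicted Facebook friends for each additional year of age.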
The transformation feature offers one more option. From the Options menu at the top left, click Edit again. Under the Transformation header, select the Use original units in graphs check box. Click Compute! and click on the arrow in the bottom right corner to see the new graph shown below.
This graph shows the same transformed model, but with the Facebook friends variable in its original units. Likewise, the red fitted line is back-transformed to match the original units, so it appears as a curve rather than a straight line.
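The back-transformed curve can be sketched by exponentiating the fitted log-scale equation. The example below assumes the back-transformation is a plain exp() of the fitted line (the natural-log inverse); StatCrunch's exact plotting method is not documented here, and the function names are illustrative.

```python
import math

# Coefficients copied from the transformed StatCrunch output above.
LOG_INTERCEPT = 6.8115594
LOG_SLOPE = -0.052285775  # negative, as the text notes

def predict_log_friends(age):
    """Fitted value on the log scale."""
    return LOG_INTERCEPT + LOG_SLOPE * age

def predict_friends_original_units(age):
    """Back-transform the log-scale fit with exp(), assuming natural log."""
    return math.exp(predict_log_friends(age))

for age in (20, 50, 60):
    print(age, round(predict_friends_original_units(age)))
```

Unlike the untransformed line, these predictions can never go negative, and the drop between consecutive ages shrinks as age increases. Keep in mind that exponentiating a log-scale prediction does not, in general, give the predicted mean on the original scale, so the curve is best read as a typical value rather than an average.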
Jul 5, 2018
Thank you very much!! it was very helpful in explaining how to do this in statcrunch!!!!!!