
Report Properties

Owner: hillardt1
Created: Nov 12, 2015
Share: yes
Views: 1794
Regression Analysis Age vs. Physical Health


We are asked to evaluate the variation in physical health as it relates to age.  Physical health is taken to be a patient's BMI, calculated by obtaining their age (in years), their weight (lbs), and BMI. A total of 200 patients' ages were recorded and the patients were weighed. Health questions were used to determine their health scores, and the data were treated as interval/ratio.


The goal of the analysis is to determine if there is any evidence of a relationship between physical health and age.  For the purpose of this investigation all data obtained will be treated as interval/ratio, though it is noted that one obvious problem is that the questionnaire used to obtain the physical health score was undoubtedly ordinal in nature.  Correlation analysis will determine the form and direction of any linear relationship that may exist between the variables, in addition to the strength of relationship. 

A. The validity conditions for the Pearson correlation are that the data be:

1. Interval/ratio

2. Free of evidence of non-linearity

3. Interpreted only where there exists a p-value or confidence interval

4. Drawn from a reasonably large number of observations

Step 1: Form

A scatterplot will be used to assess the form of the data.  Non-linear data, indicated by a curved pattern in the points, are generally not valid for this analysis.

Result 1: Scatter Plot Test for Evidence of Non-Linear Patterns

The Scatterplot shows no evidence of curvature in the data points. 

Step 2: Strength

The strength of the association will be given by a significant p-value and an R² value, which can be described by the following terms:

· 0.1 – 0.4: Extremely Weak Correlation (Perhaps Clinically Unimportant)

· 0.5 – 0.6: Weak Correlation

· 0.7 – 0.8: Moderate Correlation

· 0.9: Strong Correlation

Result 2: Pearson Correlation Coefficient
Correlation between PHYSICAL and Age is: -0.19815521

The p-value indicates a significant, negative correlation between age and physical health scores (r = -0.1982, p-value = 0.005), suggesting that the average physical health score decreases as age increases.  The strength of the relationship is described by the R² value of 0.04.  Though the relationship is extremely weak, and the significance is in question, we will continue with regression analysis of the variables.
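As a quick check on the figures above, the following minimal sketch squares the reported correlation coefficient to recover the R-sq value. It uses only the number printed in the StatCrunch output; the variable names are illustrative and not part of the original report.

```python
# Correlation coefficient as reported in the StatCrunch output.
r = -0.19815521

# In simple linear regression, R-squared equals the square of Pearson's r.
r_squared = r ** 2

# The sign of r gives the direction of the linear relationship.
direction = "negative" if r < 0 else "positive"

# Share of variability in the response explained by the predictor.
percent_explained = 100 * r_squared

print(f"R-sq = {r_squared:.4f}, {percent_explained:.1f}% of variability explained, "
      f"{direction} direction")
```

Squaring r discards its sign, which is why the direction has to be read from r itself rather than from R-sq.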


This is an observational study since, as researchers, we are doing nothing to the experimental units to cause a reaction or effect.  The response variable (Y) was physical health, while the predictor (X) was age.  Both sets of data are considered interval/ratio. 


Result 3: Validity Conditions Regression

A. The validity conditions for regression analysis are that the observations be:

                1. Linear

                    An extremely weak, negative linear relationship was indicated by the Pearson coefficient.

                    Also, in the fitted-line plot there is no evidence of non-linearity.

                2. Normal

                    As observed in the QQ-plot, the points curve slightly away from the 45-degree line.  

                    This indicates possible non-normality, which may present a limitation in this study.

                3. Independent

                    We must assume the observations are independent as we have no evidence to the contrary.

                4. Homogeneous

                   The residuals vs. predictor values plot shows no evidence of trends in variability.

Result 4: ANOVA Table
Simple linear regression results:
Dependent Variable: PHYSICAL
Independent Variable: Age
PHYSICAL = 56.645076 - 0.32784052 Age
Sample size: 199
R (correlation coefficient) = -0.19815521
R-sq = 0.039265485
Estimate of error standard deviation: 11.275907

Parameter estimates:
Parameter    Estimate    Std. Err.    DF    95% L. Limit    95% U. Limit

Analysis of variance table for regression model:

B. Interpretation of the ANOVA Table is as follows:

                A. The predictor (X) is age, while the response (Y) is physical health score.

                B. The regression equation is identified as being PHYSICAL = 56.645076 - 0.32784052 Age.   

                C.  As discussed, the R² value is 0.04, meaning that only 4% of variability is explained by the model.  

                     This is an extremely weak relationship.

                D. The estimated Y-intercept is 56.65.  The youngest patients have a physical health score of 56.65.  

                     The confidence interval is given as 0.09 - 0.55. 

                E. The F-test for the relationship between the response and predictor gives F = 8.05 with a p-value of 0.005.  

                     The p-value indicates that there is no evidence that physical health scores are related to age of the participants.
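The F statistic and p-value quoted in item E can be reproduced from the reported r and sample size alone. The sketch below is an illustration, not the StatCrunch computation: it relies on the fact that, in simple linear regression, F equals the square of the t statistic for the slope, and it approximates the p-value by numerically integrating the t density. The helper names `t_pdf` and `two_sided_p` are hypothetical.

```python
import math

r = -0.19815521   # reported correlation coefficient
n = 199           # reported sample size
df = n - 2        # error degrees of freedom in simple linear regression

# t statistic for the slope, computed from r; F is its square.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
f_stat = t_stat ** 2

def t_pdf(x, v):
    """Density of Student's t distribution with v degrees of freedom."""
    log_c = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(v * math.pi))
    return math.exp(log_c - (v + 1) / 2 * math.log(1 + x * x / v))

def two_sided_p(t, v, steps=2000):
    """Two-sided p-value using Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, v) + t_pdf(b, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    area = total * h / 3          # P(0 < T < |t|)
    return 1 - 2 * area           # both tails beyond |t|

p_value = two_sided_p(t_stat, df)
print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")
```

The numerical integration is used only to keep the sketch free of third-party libraries; a statistics package would normally supply the p-value directly.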


In general we lack evidence of any relationship between patients’ physical scores and their age.  Had our p-value been significant, at most we could have claimed an “extremely weak” relationship, one that would likely have been clinically unimportant. The scatterplot shows a great deal of variability in the response at each value of the predictor. The regression line is nearly horizontal.


A different method for assessment of physical health is suggested.  It appears that the current method may present data that is slightly skewed.  From the scatterplot we saw that none of the patients who were less than 30 years old had low health scores.  I would recommend a multiple regression analysis grouped by age groups to account for this phenomenon.   


Physical Health Scores/Mental Health Scores:

By nku.katie.waters
Nov 17, 2015

Hi Tawana,
Good work! A couple comments:
1. Your interpretation of the p-value is incorrect. Based on the p-value = 0.005 (F=8.05), there is evidence of a relationship between age and physical health scores.

2. In addition, you should interpret the slope. With 95% confidence, interpreting the slope indicates that a one year increase in age will correspond to a 0.10 to 0.56 decrease in AVERAGE physical health score.

3. You were also asked to assess the relationship between age and mental health scores.
Please review the solutions and ask questions as needed.
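The slope interval quoted in comment 2 can be approximated from the reported output. This is a rough sketch under stated assumptions: the standard error is not shown in the report, so it is derived here from the identity t = slope / standard error, with t computed from the reported r and sample size. The critical value 1.972 is an assumed, approximate two-sided 95% t value for 197 degrees of freedom taken from standard tables, not from the report.

```python
import math

slope = -0.32784052   # reported slope for Age
r = -0.19815521       # reported correlation coefficient
n = 199               # reported sample size
df = n - 2

# t statistic for the slope, computed from r.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Derived (not reported) standard error of the slope: SE = slope / t.
std_err = slope / t_stat

# ASSUMPTION: approximate 97.5th percentile of t with 197 degrees of freedom.
t_crit = 1.972

lower = slope - t_crit * std_err
upper = slope + t_crit * std_err
print(f"95% CI for slope: ({lower:.3f}, {upper:.3f})")
```

Both limits are negative, which is consistent with a decrease of roughly 0.10 to 0.56 in average physical health score per additional year of age.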
By billie.howard14
Nov 14, 2015

Great report. You are always so detailed in your reports. I see that the association is weak and probably would not be very important. I do agree with you that maybe a multiple regression analysis would be helpful. I do not fully understand what this incorporates yet but from what I do know I think it is a good suggestion.
By sarah.sams
Nov 14, 2015

Great job on your report. It was very thorough. I was honestly really shocked that there was no relation between mental health and age as there was with physical health.
By kelli.koors
Nov 14, 2015

Good job on your report. It was very helpful for me. It is very easy to follow and read.
By brittanie.preston
Nov 13, 2015

Good report. It does appear that there is evidence of a negative association between age and physical health score, with a p-value of 0.005. However, it seems to be an extremely weak association, with an R² of 0.04, and would not be clinically important.
By christina.jessee
Nov 13, 2015

I think there is an association between age and the physical health score, as evidenced by the significant p-value and the interval limits for the slope both being negative. My interpretation is that for every one unit increase in X (age), there is a 0.0999 to 0.555 decrease in the physical health score. As you stated above, the association is very weak, which is probably why the prediction intervals were so wide. When I plugged in the minimum age (21), a median age (35) and the maximum age (58), the 95% prediction intervals were 27.21 to 72.31 for 21 years, 22.88 to 67.46 for 35 years, and 14.77 to 60.48 for 58 years. These are very wide and would not be useful for clinical practice. The 95% CIs for the mean were narrower, as would be expected: 46.00 to 53.52 at age 21, 43.58 to 46.76 at age 35, and 32.37 to 42.89 at age 58. However, these would only be the average responses and tell us nothing about the individuals.
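The intervals quoted in this comment can be sanity-checked against the reported regression equation, since both a prediction interval and a mean-response interval are centered on the fitted value. The sketch below is illustrative only; the `fitted` helper and the `reported` dictionary are hypothetical names, and the interval limits are copied from the comment.

```python
# Coefficients from the reported equation: PHYSICAL = 56.645076 - 0.32784052 Age
intercept = 56.645076
slope = -0.32784052

# age -> (95% prediction interval, 95% interval for the mean), as quoted above
reported = {
    21: ((27.21, 72.31), (46.00, 53.52)),
    35: ((22.88, 67.46), (43.58, 46.76)),
    58: ((14.77, 60.48), (32.37, 42.89)),
}

def fitted(age):
    """Predicted average physical health score at a given age."""
    return intercept + slope * age

for age, (pred, mean) in reported.items():
    y_hat = fitted(age)
    pred_mid = sum(pred) / 2    # midpoint of the prediction interval
    mean_mid = sum(mean) / 2    # midpoint of the mean-response interval
    print(f"age {age}: fitted {y_hat:.2f}, "
          f"prediction midpoint {pred_mid:.2f}, mean midpoint {mean_mid:.2f}")
```

Each midpoint should agree with the fitted value up to rounding in the quoted limits.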
