INTRODUCTION
We are asked to evaluate the variation in physical health as it relates to age. Physical health is taken to be a patient's BMI, calculated by obtaining their age (in years), their weight (lbs), and BMI. Ages were recorded for a total of 200 patients, who were also weighed. Health questions were used to determine their health scores, and the data were treated as interval/ratio.
METHODS
The goal of the analysis is to determine if there is any evidence of a relationship between physical health and age. For the purpose of this investigation all data obtained will be treated as interval/ratio, though it is noted that one obvious problem is that the questionnaire used to obtain the physical health score was undoubtedly ordinal in nature. Correlation analysis will determine the form and direction of any linear relationship that may exist between the variables, in addition to the strength of relationship.
A. Validity conditions for use of the Pearson correlation are described as being:
1. Interval/ratio data
2. Not nonlinear
3. Interpreted only where there exists a p-value or confidence interval
4. Resulting from a reasonably large number of observations
Step 1: Form
A scatterplot will be used to assess the form of the data. Generally, nonlinear data are not valid for this analysis; nonlinearity is indicated by a curved pattern in the points.
The scatterplot shows no evidence of curvature in the data points.
Step 2: Strength
The strength of the association will be given by a significant p-value and an R^2 value that can be described by the following descriptive terms:
· 0.1 – 0.4 : Extremely Weak Correlation (Perhaps Clinically Unimportant)
· 0.5 – 0.6 : Weak Correlation
· 0.7 – 0.8 : Moderate Correlation
· 0.9 : Strong Correlation
Correlation between PHYSICAL and Age is: -0.19815521 (0.005)
The p-value indicates a significant, negative correlation between age and physical health scores (r = -0.1982, p-value = 0.005). The strength of the relationship is described by the R^2 value of 0.04, and the negative sign of r suggests that average physical health scores decrease as age increases. Though the relationship is extremely weak, and the significance is in question, we will continue with regression analysis of the variables.
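The figures above can be cross-checked from the summary statistics alone. The following is a minimal sketch in Python, assuming only the values printed in this report (r taken as negative, as the text describes, and the sample size of 199 shown in the regression output); the variable names are illustrative and are not StatCrunch output.

```python
import math

# Values as reported in this report. The sign of r is an assumption,
# taken as negative because the text describes a negative correlation.
r = -0.19815521
n = 199  # sample size shown in the regression output

# In simple linear regression, the coefficient of determination is r squared.
r_squared = r ** 2

# t statistic for testing zero correlation, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

# With a single predictor, the ANOVA F statistic equals t squared.
f_stat = t_stat ** 2

print(f"R-sq = {r_squared:.4f}")
print(f"t = {t_stat:.3f}, F = {f_stat:.2f}")
```

Run as written, this should reproduce an R-sq of about 0.039 and an F statistic close to the 8.05 reported for the regression, which suggests the correlation and regression results are describing the same fit.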
ANALYSIS
This is an observational study since, as researchers, we are doing nothing to the experimental units to cause a reaction or effect. The response variable (Y) was physical health, while the predictor (X) was age. Both sets of data are considered interval/ratio.
A. Validity Conditions for Regression Analysis observations are described as being:
1. Linear
An extremely weak, negative linear relationship was indicated by the Pearson coefficient.
Also, in the fitted-line plot there is no evidence of nonlinearity.
2. Normal
As observed in the Q-Q plot, the data curve slightly away from the 45-degree line.
This indicates possible nonnormality, which may present a limitation in this study.
3. Independent
We must assume the observations are independent as we have no evidence to the contrary.
4. Homogeneous
The residuals vs. predictor values plot shows no evidence of trends in variability.
Simple linear regression results:
Dependent Variable: PHYSICAL
Independent Variable: Age
PHYSICAL = 56.645076 - 0.32784052 Age
Sample size: 199
R (correlation coefficient) = -0.19815521
R-sq = 0.039265485
Estimate of error standard deviation: 11.275907
Parameter estimates:
Analysis of variance table for regression model:

B. Interpretation of the ANOVA Table is as follows:
A. The predictor (X) is age, while the response (Y) is physical health score.
B. The regression equation is identified as being PHYSICAL = 56.645076 - 0.32784052 Age.
C. As discussed, the R^2 value is 0.04, meaning that only 4% of the variability in physical health scores is explained by the model.
This is an extremely weak relationship.
D. The estimated Y-intercept is 56.65. Taken literally, this is the predicted average physical health score at age 0, which is outside the range of ages studied.
The confidence interval is given as 0.09 to 0.55.
E. The F-test for the relationship between the response and predictor is (F = 8.05, p-value = 0.005).
The p-value indicates that there is no evidence that physical health scores are related to age of the participants.
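A confidence interval for the slope can be approximated from the values reported above. The sketch below is in Python and rests on stated assumptions: the slope is taken as negative (consistent with the negative correlation), the standard error is recovered from the identity F = t^2 for a single predictor, and the critical value 1.972 is an approximate 97.5th percentile of the t distribution with 197 degrees of freedom rather than a number from the StatCrunch output.

```python
import math

slope = -0.32784052  # estimated slope; the minus sign is an assumption
f_stat = 8.05        # F statistic reported for the regression
t_crit = 1.972       # approximate t critical value, 197 df (assumed)

# With one predictor F = t^2, so SE(slope) = |slope| / sqrt(F).
se_slope = abs(slope) / math.sqrt(f_stat)

# 95% confidence interval for the slope.
lower = slope - t_crit * se_slope
upper = slope + t_crit * se_slope

print(f"SE(slope) = {se_slope:.4f}")
print(f"95% CI for slope: {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval lies entirely below zero, roughly -0.56 to -0.10 points of physical health score per additional year of age.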
RESULTS/CONCLUSION
In general we lack evidence of any relationship between patients’ physical scores and their age. Had our p-value been significant, at most we could have described the relationship as “extremely weak,” and it would likely have been clinically unimportant. The scatterplot shows a great deal of variability in response for each variable measurement. The regression line is nearly horizontal.
FUTURE STUDIES
A different method for assessment of physical health is suggested. It appears that the current method may present data that is slightly skewed. From the scatterplot we saw that none of the patients who were less than 30 years old had low health scores. I would recommend a multiple regression analysis grouped by age groups to account for this phenomenon.
REFERENCES
Physical Health Scores/Mental Health Scores: Statcrunch.com
Nov 17, 2015
Hi Tawana,
Good work! A couple comments:
1. Your interpretation of the p-value is incorrect. Based on the p-value = 0.005 (F = 8.05), there is evidence of a relationship between age and physical health scores.
2. In addition, you should interpret the slope. With 95% confidence, a one-year increase in age corresponds to a 0.10 to 0.56 decrease in AVERAGE physical health score.
3. You were also asked to assess the relationship between age and mental health scores.
Please review the solutions and ask questions as needed.
Nov 14, 2015
Great report. You are always so detailed in your reports. I see that the association is weak and probably would not be very important. I do agree with you that maybe a multiple regression analysis would be helpful. I do not fully understand what this incorporates yet but from what I do know I think it is a good suggestion.
Nov 14, 2015
Great job on your report. It was very thorough. I was honestly really shocked that there was no relation between mental health and age as there was with physical health.
Nov 14, 2015
Good job on your report. It was very helpful for me. It is very easy to follow and read.
Nov 13, 2015
Good report. It does appear that there is evidence of a negative association between age and physical health score, with a p-value of 0.005. However, it seems to be an extremely weak association, with an R^2 of 0.04, and would not be clinically important.
Nov 13, 2015
I think there is an association between age and the physical health score, as evidenced by the significant p-value and the interval limits for the slope both being negative. My interpretation is that for every one-unit increase in X (age), there is a 0.0999 to 0.555 decrease in the physical health score. As you stated above, the association is very weak, which is probably why the prediction intervals were so wide. When I plugged in the minimum age (21), a median age (35) and the maximum age (58), the 95% prediction intervals were (27.21, 72.31) for 21 years, (22.88, 67.46) for 35 years and (14.77, 60.48) for 58 years. These are very wide and would not be useful for clinical practice. The 95% CIs for the mean response were narrower, as would be expected: (46.00, 53.52) at age 21, (43.58, 46.76) at age 35 and (32.37, 42.89) at age 58. However, these would only be the average responses and tell us nothing about the individuals.
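A quick consistency check on these intervals: each one should be centered on the fitted value from the regression equation. The sketch below is in Python; the minus sign on the slope is an assumption based on the negative association described in the report, and the function and variable names are illustrative.

```python
intercept = 56.645076
slope = -0.32784052  # minus sign assumed from the negative association


def predicted_mean(age):
    """Fitted average physical health score at a given age."""
    return intercept + slope * age


# 95% intervals quoted in the comment above, keyed by age.
reported = {
    21: {"prediction": (27.21, 72.31), "mean": (46.00, 53.52)},
    35: {"prediction": (22.88, 67.46), "mean": (43.58, 46.76)},
    58: {"prediction": (14.77, 60.48), "mean": (32.37, 42.89)},
}

# Largest gap between a fitted value and an interval midpoint.
max_gap = 0.0
for age, intervals in reported.items():
    fit = predicted_mean(age)
    for kind, (low, high) in intervals.items():
        midpoint = (low + high) / 2
        max_gap = max(max_gap, abs(fit - midpoint))
        print(f"age {age}, {kind} interval: fit {fit:.2f}, midpoint {midpoint:.2f}")

print(f"largest gap: {max_gap:.3f}")
```

The midpoints agree with the fitted values to within rounding, so the quoted intervals appear consistent with the regression equation; only their widths differ between the prediction and mean-response cases.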