Report Properties
Owner: hillardt1
Created: Nov 12, 2015
Regression Analysis Age vs. Physical Health

INTRODUCTION

We are asked to evaluate how physical health varies with age.  Physical health is taken to be a patient's BMI, recorded along with their age (in years) and weight (lbs).  A total of 200 patients had their ages recorded and were weighed.  Health questions were used to determine their health scores, and the data were treated as interval/ratio.

METHODS

The goal of the analysis is to determine whether there is any evidence of a relationship between physical health and age.  For the purpose of this investigation all data obtained will be treated as interval/ratio, though one obvious problem is that the questionnaire used to obtain the physical health score was undoubtedly ordinal in nature.  Correlation analysis will determine the form and direction of any linear relationship that may exist between the variables, as well as the strength of that relationship.

A. The validity conditions for Pearson correlation are that the data be:

1.      Interval/ratio data

2.      Linear in form (no evidence of a non-linear pattern)

3.      Interpreted only where there exists a p-value or confidence interval

4.      Resulting from a reasonably large number of observations

Step 1: Form

A scatterplot will be used to assess the form of the data.  Generally, non-linear data, indicated by a curved pattern in the points, are not valid for this analysis.

Result 1: Scatter Plot Test for Evidence of Non-Linear Patterns

The Scatterplot shows no evidence of curvature in the data points.

Step 2: Strength

The strength of the association will be given by a significant p-value and an R² value, described using the following terms:

·         0.1 – 0.4 : Extremely Weak Correlation (Perhaps Clinically Unimportant)

·         0.5 – 0.6 : Weak Correlation

·         0.7 – 0.8 : Moderate Correlation

·         0.9 : Strong Correlation

Result 2: Pearson Correlation Coefficient
Correlation between PHYSICAL and Age is: -0.19815521 (p-value 0.005)

The p-value indicates a significant, negative correlation between age and physical health scores (r = -0.1982, p-value = 0.005).  The negative sign suggests that average physical health scores decrease as age increases, and the strength of the relationship is described by the R² value of 0.04.  Though the relationship is extremely weak, and its practical importance is in question, we will continue with regression analysis of the variables.
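As a rough check on the figures above, the sketch below recomputes R² from the reported r and converts r into a t statistic using the standard formula t = r·sqrt(n − 2) / sqrt(1 − r²). It assumes the sample size of 199 shown in the StatCrunch regression output; the variable names are illustrative only, and no p-value is computed because that would need a t-distribution routine outside the standard library.

```python
import math

# Values reported by StatCrunch in this report
r = -0.19815521   # Pearson correlation between PHYSICAL and Age
n = 199           # sample size from the regression output (assumed to apply here)

r_squared = r ** 2                 # proportion of variability explained
df = n - 2                         # degrees of freedom for the test of r
t_stat = r * math.sqrt(df) / math.sqrt(1 - r_squared)

print(f"R-squared = {r_squared:.4f}")
print(f"t = {t_stat:.4f} on {df} degrees of freedom")
# In simple linear regression, t squared equals the ANOVA F statistic
print(f"t squared = {t_stat ** 2:.4f}")
```

The squared t statistic should agree with the F statistic in the ANOVA table, since both test the same hypothesis when there is a single predictor.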

ANALYSIS

This is an observational study since, as researchers, we are doing nothing to the experimental units to cause a reaction or effect.  The response variable (Y) was physical health, while the predictor (X) was age.  Both sets of data are considered interval/ratio.

Result 3: Validity Conditions Regression

A. The validity conditions for regression analysis are that the observations be:

1. Linear

An extremely weak, negative linear relationship was indicated by the Pearson coefficient.

Also, in the fitted-line plot there is no evidence of non-linearity.

2. Normal

As observed in the QQ-plot, the data show slight curvature relative to the 45-degree line.

This indicates possible non-normality, which may present a limitation in this study.

3. Independent

We must assume the observations are independent as we have no evidence to the contrary.

4. Homogeneous

The residuals vs. predictor values plot shows no evidence of trends in variability.

Result 4: ANOVA Table
Simple linear regression results:
Dependent Variable: PHYSICAL
Independent Variable: Age
PHYSICAL = 56.645076 - 0.32784052 Age
Sample size: 199
R (correlation coefficient) = -0.19815521
R-sq = 0.039265485
Estimate of error standard deviation: 11.275907

Parameter estimates:
Parameter   Estimate      Std. Err.    DF    95% L. Limit   95% U. Limit
Intercept   56.645076     4.2326519    197   48.297952      64.9922
Slope       -0.32784052   0.11553823   197   -0.55569105    -0.09998999

Analysis of variance table for regression model:
Source   DF    SS          MS          F-stat      P-value
Model    1     1023.7097   1023.7097   8.0514445   0.005
Error    197   25047.779   127.14609
Total    198   26071.489
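The entries in the analysis of variance table are tied together by simple arithmetic, which the following sketch reproduces from the reported sums of squares and degrees of freedom. The variable names are illustrative; the numbers are copied from the StatCrunch output above.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table
ss_model, df_model = 1023.7097, 1
ss_error, df_error = 25047.779, 197
ss_total = 26071.489

ms_model = ss_model / df_model      # mean square for the model
ms_error = ss_error / df_error      # mean square for error
f_stat = ms_model / ms_error        # F statistic
r_squared = ss_model / ss_total     # R-sq: share of variability explained
error_sd = math.sqrt(ms_error)      # estimate of error standard deviation

print(f"MS error  = {ms_error:.5f}")
print(f"F-stat    = {f_stat:.4f}")
print(f"R-sq      = {r_squared:.6f}")
print(f"Error SD  = {error_sd:.5f}")
```

Each computed value should match the corresponding figure in the output (F-stat, R-sq, and the estimate of error standard deviation) up to rounding.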

B. Interpretation of the ANOVA Table is as follows:

A. The predictor (X) is age, while the response (Y) is physical health score.

B. The regression equation is PHYSICAL = 56.645076 - 0.32784052 Age.

C.  As discussed, the R² value is 0.04, meaning that only 4% of the variability in physical health scores is explained by the model.

This is an extremely weak relationship.

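D. The estimated Y-intercept is 56.65.  This is the predicted physical health score at an age of 0 years, so it should not be read as the score of the youngest patients.

The estimated slope is -0.33, with a 95% confidence interval of -0.56 to -0.10.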

E. The F-test for the relationship between the response and predictor is (8.05, p-value 0.005).

The p-value indicates evidence, significant at the 0.05 level, that physical health scores are related to the age of the participants, although the relationship is extremely weak.
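To illustrate how the fitted equation and the slope interval in the output can be used, the sketch below predicts scores from the regression equation and backs out the t multiplier implied by the reported 95% limits. The function name predict_physical and the example ages of 30 and 60 are hypothetical choices made for illustration, not part of the original analysis.

```python
# Parameter estimates copied from the StatCrunch output
intercept = 56.645076
slope = -0.32784052
slope_se = 0.11553823
slope_ci = (-0.55569105, -0.09998999)   # reported 95% limits for the slope

def predict_physical(age):
    """Predicted PHYSICAL score from the fitted line (hypothetical helper)."""
    return intercept + slope * age

# Example ages chosen only for illustration
for age in (30, 60):
    print(f"Age {age}: predicted PHYSICAL = {predict_physical(age):.2f}")

# Half-width of the interval divided by the standard error gives the
# t multiplier that StatCrunch used for 197 degrees of freedom
t_multiplier = (slope_ci[1] - slope_ci[0]) / (2 * slope_se)
print(f"Implied t multiplier = {t_multiplier:.3f}")

# An interval that lies entirely below zero excludes a slope of zero
excludes_zero = slope_ci[1] < 0
print(f"95% interval excludes zero: {excludes_zero}")
```

Because the fitted line is only supported by the ages actually sampled, predictions for ages outside that range should be treated with caution.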

RESULTS/CONCLUSION

In general, we found a statistically significant (p-value = 0.005) but extremely weak negative relationship between patients' physical health scores and their age.  With an R² value of 0.04, at most we can claim an "extremely weak" descriptor, and the relationship is likely clinically unimportant.  The scatterplot shows a great deal of variability in the response at each age.  The regression line is nearly horizontal.

FUTURE STUDIES

A different method for assessment of physical health is suggested.  It appears that the current method may present data that is slightly skewed.  From the scatterplot we saw that none of the patients who were less than 30 years old had low health scores.  I would recommend a multiple regression analysis grouped by age groups to account for this phenomenon.

REFERENCES

Physical Health Scores/Mental Health Scores:  Statcrunch.com