|Year : 2016 | Volume
| Issue : 1 | Page : 95-97
Analysis of repeated measures data: A quick primer
Jill C Stoltzfus
The Research Institute, St. Luke's University Health Network, Bethlehem; Temple University School of Medicine, Philadelphia, PA, USA
|Date of Submission||19-Jan-2016|
|Date of Acceptance||16-Feb-2016|
|Date of Web Publication||2-Jun-2016|
Jill C Stoltzfus
St. Luke's University Health Network, 801 Ostrum Street, Bethlehem, PA 18015
Source of Support: None, Conflict of Interest: None
When analyzing data for dependent groups (e.g., before and after intervention), one must use repeated measures statistical tests that account for the correlated observations. For normally distributed data measured on a continuous/interval scale (e.g., fasting glucose) with only two points of measurement (e.g., before and after), one would conduct a paired t-test. For more than two measurement points (e.g., baseline, 3 months, 6 months), repeated measures analysis of variance is appropriate. For skewed continuous/interval data (e.g., body mass index in the general population), or ordinal data (e.g., visual analog pain scores), one could conduct a Wilcoxon signed-rank test (for two measurement points) or a Friedman's test (for more than two measurement points).
The following core competencies are addressed in this article: Medical knowledge.
Keywords: Dependent groups, repeated measures, within-group analysis
|How to cite this article:|
Stoltzfus JC. Analysis of repeated measures data: A quick primer. Int J Acad Med 2016;2:95-7
| Introduction|| |
When analyzing data for dependent groups (e.g., before and after intervention), it is not appropriate to treat each observation as if it were completely separate from the others, since there is a clear correlation between responses. In such cases, one must use repeated measures statistical tests that account for these correlated observations, since failure to do so will produce inaccurate results. The type of repeated measures statistical test depends on the level of data measurement, general distribution of the data, and number of observations.
| Statistical Approaches To Repeated Measures Data|| |
For data measured on a continuous/interval scale (i.e., values exist on a continuum and can be added, subtracted, multiplied, or divided) that are normally distributed (i.e., look like a “bell-shaped curve”), with only two points of paired measurements (e.g., before and after), one would conduct a paired t-test. The paired t-test evaluates the mean difference between the two measurement points using the formula in [Figure 1]:
Where the numerator represents the mean difference between the paired measurements, and the denominator includes the square root of the sample variance divided by the sample size.
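The paired t statistic described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the article's own computation: the function name `paired_t` and the fasting glucose values are hypothetical, and the sketch assumes the usual form in which the mean of the paired differences is divided by the square root of the sample variance of the differences over the sample size.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for paired measurements of equal length."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # One difference per subject (after minus before).
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    var_diff = statistics.variance(diffs)  # sample variance (n - 1 denominator)
    # Numerator: mean difference; denominator: sqrt(sample variance / n).
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Hypothetical fasting glucose values (mg/dL) for five patients.
before = [110, 122, 101, 130, 115]
after = [104, 115, 100, 121, 112]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value would then be compared against a t distribution with n - 1 degrees of freedom to obtain a p-value, a step left to statistical software here.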
For normally distributed continuous/interval data and more than two measurement points (e.g., baseline, 3 months, 6 months), repeated measures analysis of variance (ANOVA) is the appropriate method, with a comparison of (correlated) mean values at each point in time. The basic formula for repeated measures ANOVA (also referred to as “within-groups” analysis) is presented in [Figure 2]:
Where the numerator represents the difference between the means of the different measurement points, and the denominator represents variability in the means due to chance. Therefore, the F statistic will increase as the difference between means gets larger or the variability due to chance gets smaller.
In the above formula, “mean squares” refers to the end result of several different computations that involve calculating how subjects' mean outcomes differ from the grand mean, which is expressed as “sums of squares,” since values are squared before being added together.
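A standard textbook form of this ratio for a one-way repeated measures design, which may differ in notation from the article's figures, is shown below. The symbols are assumptions introduced for illustration: n is the number of subjects, k is the number of measurement points, and the subscripts "time," "subjects," and "error" label the sources of variability.

```latex
\begin{aligned}
SS_{\text{total}} &= SS_{\text{time}} + SS_{\text{subjects}} + SS_{\text{error}} \\
MS_{\text{time}} &= \frac{SS_{\text{time}}}{k-1}, \qquad
MS_{\text{error}} = \frac{SS_{\text{error}}}{(n-1)(k-1)} \\
F &= \frac{MS_{\text{time}}}{MS_{\text{error}}}
\end{aligned}
```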
For interested readers, the basic computations involved in repeated measures ANOVA are presented in [Figure 3] below:
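As a rough companion to these computations, the following Python sketch works through a one-way repeated measures ANOVA by hand. It assumes the common textbook partition of the total sum of squares into time, subject, and error components. The function name `rm_anova_f` and the data are hypothetical, and the sketch may not match the figure's layout step for step.

```python
def rm_anova_f(data):
    """One-way repeated measures ANOVA.

    data: list of rows, one per subject, each holding k measurements.
    Returns (F, df_time, df_error).
    """
    n = len(data)          # number of subjects
    k = len(data[0])       # number of measurement points
    grand = sum(sum(row) for row in data) / (n * k)

    time_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Sums of squares: squared deviations from the grand mean.
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_time - ss_subj

    # Mean squares: sums of squares divided by their degrees of freedom.
    df_time = k - 1
    df_error = (n - 1) * (k - 1)
    f = (ss_time / df_time) / (ss_error / df_error)
    return f, df_time, df_error


# Hypothetical outcomes for four subjects at baseline, 3 months, 6 months.
data = [
    [10, 8, 7],
    [12, 11, 9],
    [9, 8, 8],
    [11, 9, 6],
]
f_stat, df_time, df_error = rm_anova_f(data)
print(f"F({df_time}, {df_error}) = {f_stat:.3f}")
```

Removing the subject-to-subject variability from the error term is what distinguishes this from a between-groups ANOVA on the same numbers. The sketch does not check the sphericity assumption, which full statistical packages typically report.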
Within the scope of repeated measures ANOVA, one might also decide to include between-groups components, or “factors,” such as gender, age groups, different levels of treatment, and so on. When there are two factors (within-groups, between-groups, or both), the ANOVA is referred to as “two-way,” while an ANOVA with more than two factors is called “multifactorial.”
As an example of a between-groups factor for gender, do men's and women's mean body mass indexes (BMIs) differ when measured at baseline, 3 months, 6 months, and 12 months following a diet and exercise intervention? Besides considering gender by itself, one should also examine the interaction between gender and the intervention; specifically, do men respond differently than women to the diet and exercise program in terms of their BMI changes over time? This interactional relationship is important to explore because it may be that neither gender nor the intervention alone contributes to BMI changes as much as the interaction between the two variables.
For skewed continuous/interval data (e.g., BMI in the general population), or ordinal outcomes that are based on ranks or ratings (e.g., visual analog pain scores, patient satisfaction scales), one could conduct a Wilcoxon signed-rank test (for two measurement points) or a Friedman's test (for more than two measurement points). Both of these tests are nonparametric, meaning they should be used when data do not meet the assumptions of parametric tests, for example, when distributions are skewed, sample sizes are small, or data are measured on an ordinal scale. Therefore, the Wilcoxon signed-rank test is the nonparametric alternative to the paired t-test, while the Friedman's test is the nonparametric alternative to the one-way repeated measures ANOVA.
For the Wilcoxon signed-rank test, one would compute the difference between the two measurement point values for each subject, take the absolute value of each difference (i.e., remove the positive and negative signs), rank these absolute differences from lowest to highest, and finally reattach the sign of each original difference to its rank (i.e., “signed-ranks”). The test statistic is then based on the sums of the positive and negative ranks.
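The ranking procedure for the Wilcoxon signed-rank test can be sketched as follows. Two conventions used here are assumptions not stated in the text: pairs with a zero difference are dropped, and tied absolute differences share the average of their ranks. The function name `signed_ranks` and the pain scores are hypothetical.

```python
def signed_ranks(before, after):
    """Return the signed ranks of the nonzero paired differences."""
    # Differences per subject; zero differences are dropped (assumed convention).
    diffs = [a - b for a, b in zip(after, before) if a != b]
    # Indices of the differences, ordered by absolute size.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over any run of tied absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    # Reattach the sign of each original difference to its rank.
    return [r if d > 0 else -r for r, d in zip(ranks, diffs)]


# Hypothetical pain scores for six patients before and after treatment.
before = [7, 5, 8, 6, 9, 4]
after = [4, 5, 6, 7, 5, 2]
signed = signed_ranks(before, after)
w_plus = sum(r for r in signed if r > 0)
w_minus = -sum(r for r in signed if r < 0)
print(signed, w_plus, w_minus)
```

The smaller of the two rank sums is commonly taken as the test statistic and compared against tabulated critical values or a normal approximation.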
For the Friedman's test, which involves more measurement points and is, therefore, a bit more complex to compute, a simple example will illustrate how it works. Suppose a physician wants to assess patient pain scores using a ten-point visual analog scale at baseline, 2 h, and 4 h following a new mindfulness meditation training program. [Table 1] presents the results for the first ten patients.
|Table 1: A set of hypothetical results to accompany the Friedman's test example|
Within each row above (per subject), one would rank-order the three separate values from lowest to highest, add up the ranked scores for each separate column, then square each column total and enter the resulting values into the equation presented in [Figure 4] below:
Where k is the number of separate measurement points, Σ is the symbol for “sum,” and R represents the ranked scores.
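The ranking and summing steps for the Friedman's test can be sketched in Python as below. The sketch assumes the commonly cited form of the statistic, 12 / (n k (k + 1)) times the sum of the squared column rank totals, minus 3 n (k + 1), where n (the number of subjects) is a symbol introduced here. The data are hypothetical and are not the values from Table 1, and no correction for ties is applied to the statistic.

```python
def rank_row(row):
    """Rank one subject's values from lowest (1) to highest, averaging ties."""
    sorted_vals = sorted(row)
    ranks = []
    for x in row:
        first = sorted_vals.index(x) + 1       # lowest 1-based position of x
        count = sorted_vals.count(x)           # how many values tie with x
        ranks.append(first + (count - 1) / 2)  # average rank across the ties
    return ranks


def friedman_statistic(data):
    """data: list of rows, one per subject, each with k measurements.

    Returns (statistic, column rank totals).
    """
    n = len(data)      # number of subjects
    k = len(data[0])   # number of measurement points
    ranked = [rank_row(row) for row in data]
    col_totals = [sum(r[j] for r in ranked) for j in range(k)]
    stat = 12 / (n * k * (k + 1)) * sum(t ** 2 for t in col_totals) - 3 * n * (k + 1)
    return stat, col_totals


# Hypothetical pain scores at baseline, 2 h, and 4 h for five patients.
scores = [
    [8, 6, 4],
    [7, 5, 6],
    [9, 7, 5],
    [6, 4, 3],
    [8, 7, 5],
]
stat, col_totals = friedman_statistic(scores)
print(col_totals, round(stat, 3))
```

The statistic is usually compared against a chi-square distribution with k - 1 degrees of freedom, although exact tables are often preferred for samples this small.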
For both the Wilcoxon signed-rank test and the Friedman's test, one would report medians, which are a more appropriate measure of central tendency than the mean with skewed and ordinal data. In addition, one could include either raw ranges or interquartile ranges (i.e., the difference between the 25th and 75th percentiles) when reporting the median.
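A short sketch of these summary values using Python's standard `statistics` module follows. The pain scores are hypothetical. Quartile definitions differ between software packages, so the 25th and 75th percentiles shown here reflect the module's default ("exclusive") method and may not match other tools exactly.

```python
import statistics

# Hypothetical visual analog pain scores for nine patients.
scores = [2, 3, 3, 4, 5, 5, 6, 8, 9]

median = statistics.median(scores)
# quantiles(..., n=4) returns the three quartile cut points.
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1
raw_range = (min(scores), max(scores))

print(f"median = {median}, IQR = {q1} to {q3} (width {iqr}), range = {raw_range}")
```

Reporting both percentile values, rather than only their difference, lets readers see where the middle half of the data lies.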
| Conclusion|| |
Analyzing data for dependent groups requires the use of special statistical tests to account for the correlated observations. There are a variety of repeated measures approaches that one may apply for such purposes, depending on the data's level of measurement and the number of separate observation points.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
| References|| |
Motulsky H. Intuitive Biostatistics. New York: Oxford University Press; 1995.
Tabachnick B, Fidell L. Experimental Designs Using ANOVA. United States: Thomson Brooks/Cole; 2007.
Hollander M, Wolfe DA. Nonparametric Statistical Methods. 2nd ed. New York: John Wiley & Sons; 1999.