Repeated-measures / within-subjects ANOVA in R

20,105

Solution 1

Try using the lmer function in the lme4 package. The aov function is probably not appropriate here. Look for references from Dougles Bates, e.g. http://lme4.r-forge.r-project.org/book/Ch4.pdf (the other chapters are great too, but that is the repeated measures chapter, this is the intro: http://lme4.r-forge.r-project.org/book/Ch1.pdf). The R code is at the same place and for longitudinal data, it seems to be generally considered wrong these days to just fit OLS instead of a components of variance model like in the lme4 package, or in nlme, which to me seems to have been wildly overtaken by lme4 in popularity recently. You may note Brian Ripley's referenced post in the comments section above just recommends switching to lme also.

By the way, a huge advantage off the jump is you will be able to get estimates for the level of each effect as adjustments to the grand mean with the typical syntax:

lmer(DV ~ 1  +IV1*IV2*IV3 +(IV1*IV2*IV3|Subject), dataset))

Note your random effects will be vector valued.

Solution 2

I know the answer has been chosen for this post. I still wish to point out how to specify a correct error term/random effect when fitting a aov or lmer model to a multi-way repeated-measures data. I assume that both independent variables (IVs) are fixed, and are crossed with each other and with subjects, meaning all subjects are exposed to all combinations of the IVs. I am going to use data taken from Kirk’s Experimental Deisign: Procedures for the Behavioral Sciences (2013).

library(lme4)
library(foreign)
library(lmerTest)
library(dplyr)

file_name <- "http://www.ats.ucla.edu/stat/stata/examples/kirk/rbf33.dta" #1
d <- read.dta(file_name) %>%                                              #2
  mutate(a_f = factor(a), b_f = factor(b), s_f = factor(s))               #3

head(d)
    ##   a b s  y a_f b_f s_f
    ## 1 1 1 1 37   1   1   1
    ## 2 1 2 1 43   1   2   1
    ## 3 1 3 1 48   1   3   1
    ## 4 2 1 1 39   2   1   1
    ## 5 2 2 1 35   2   2   1

In this study 5 subjects (s) are exposed to 2 treatments - type of beat (a) and training duration (b) - with 3 levels each. The outcome variable is the attitude toward minority. In #3 I made a, b, and s into factor variables, a_f, b_f, and s_f. Let p and q be the numbers of levels for a_f and b_f (3 each), and n be the sample size (5).

In this example the degrees of freedom (dfs) for the tests of a_f, b_f, and their interaction should be p-1=2, q-1=2, and (p-1)*(q-1)=4, respectively. The df for the s_f error term is (n-1) = 4, and the df for the Within (s_f:a_f:b_f) error term is (n-1)(pq-1)=32. So the correct model(s) should give you these dfs.

Using aov

Now let’s try different model specifications using aov:

aov(y ~ a_f*b_f + Error(s_f), data=d) %>% summary()         # m1

aov(y ~ a_f*b_f + Error(s_f/a_f:b_f), data=d) %>% summary() # m2

aov(y ~ a_f*b_f + Error(s_f/a_f*b_f), data=d) %>% summary() # m3

Simply specifying the error as Error(s_f) in m1 gives you the correct dfs and F-ratios matching the values in the book. m2 also gives the same value as m1, but also the infamous “Warning: Error() model is singular”. m3 is doing something strange. It is further partitioning Within residuals in m1 (634.9) into residuals for three error terms: s_f:a_f (174.2), s_f:b_f (173.6), and s_f:a_f:b_f (287.1). This is wrong, since we would not get three error terms when we run a 2-way between-subjects ANOVA! Multiple error terms are also contrary to the point of using block factorial designs, which allows us to use the same error term for the test of A, B, and AB, unlike split-plot designs which requires 2 error terms.

Using lmer

How can we get the same dfs and F-values using lmer? If your data is balanced, the Kenward-Roger approximation used in the lmerTest will give you exact dfs and F-ratio.

lmer(y ~ a_f*b_f + (1|s_f), data=d) %>% anova()         # mem1

lmer(y ~ a_f*b_f + (1|s_f/a_f:b_f), data=d) %>% anova() # mem2

lmer(y ~ a_f*b_f + (1|s_f/a_f*b_f), data=d) %>% anova() # mem3

lmer(y ~ a_f*b_f + (1|s_f:a_f:b_f), data=d) %>% anova() # mem4

lmer(y ~ a_f*b_f + (a_f*b_f|s_f), data=d) %>% anova()   # mem5

Again simply specifying the random effect as (1|s_f) give you the correct dfs and F-ratios (mem1). mem2-5 did not even give results, presumably the numbers of random effects it needed to estimate were greater than the sample size.

Share:
20,105
vize
Author by

vize

Updated on August 09, 2020

Comments

  • vize
    vize almost 4 years

    I'm attempting to run a repeated-meaures ANOVA using R. I've gone through various examples on various websites, but they never seem to talk about the error that I'm encountering. I assume I'm misunderstanding something important.

    The ANOVA I'm trying to run is on some data from an experiment using human participants. It has one DV and three IVs. All of the levels of all of the IVs are run on all participants, making it a three-way repeated-measures / within-subjects ANOVA.

    The code I'm running in R is as follows:

    aov.output = aov(DV~ IV1 * IV2 * IV3 + Error(PARTICIPANT_ID / (IV1 * IV2 * IV3)),
                     data=fulldata)
    

    When I run this, I get the following warning:

    Error() model is singular
    

    Any ideas what I might be doing wrong?

  • vize
    vize about 13 years
    Great, thanks! It seems to be working now. Thanks very much for the tip, lmers seem the way to go :)