icc {sjstats}    R Documentation

Intraclass-Correlation Coefficient

Description

This function calculates the intraclass-correlation coefficient (ICC), sometimes also called the variance partition coefficient (VPC), for random intercepts of mixed effects models. Currently, merMod, glmmTMB, stanreg and brmsfit objects are supported.

Usage

icc(x, ..., posterior = FALSE)

Arguments

x

Fitted mixed effects model (of class merMod, glmmTMB, stanreg or brmsfit).

...

More fitted model objects, to compute multiple intraclass-correlation coefficients at once.

posterior

Logical, if TRUE and x is a brmsfit object, ICC values are computed for each sample of the posterior distribution. In this case, a data frame is returned with the same number of rows as samples in x, with one column per random effect ICC.

Details

The ICC is calculated by dividing the between-group-variance (random intercept variance) by the total variance (i.e. sum of between-group-variance and within-group (residual) variance).
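
As a minimal sketch of this ratio, the base R snippet below uses made-up variance components. The helper name icc_from_variances and the numeric values are illustrative assumptions and not part of sjstats.

```r
# Hypothetical helper (not part of sjstats): ICC as the share of
# between-group variance in the total variance.
icc_from_variances <- function(tau00, sigma2) {
  stopifnot(tau00 >= 0, sigma2 > 0)
  tau00 / (tau00 + sigma2)
}

# Illustrative values only: random intercept variance 3, residual variance 1
icc_example <- icc_from_variances(tau00 = 3, sigma2 = 1)
icc_example
```

With these made-up values, three quarters of the total variance lies between groups.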

The calculation of the ICC for generalized linear mixed models with binary outcome is based on Wu et al. (2012). For Poisson multilevel models, please refer to Stryhn et al. (2006). Aly et al. (2014) describe computation of ICC for negative binomial models.
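
For logistic mixed models, one widely used convention treats the level-1 residual variance on the latent logit scale as fixed at pi^2 / 3, the variance of the standard logistic distribution. The sketch below applies that convention to a made-up random intercept variance. Whether it matches every detail of what icc() does internally is an assumption, and the helper name icc_logit_latent is hypothetical.

```r
# Hypothetical helper (not part of sjstats): latent-scale ICC for a
# random intercept logistic model.
icc_logit_latent <- function(tau00) {
  stopifnot(tau00 >= 0)
  sigma2_latent <- pi^2 / 3  # variance of the standard logistic distribution
  tau00 / (tau00 + sigma2_latent)
}

# Illustrative random intercept variance (on the logit scale)
icc_logit <- icc_logit_latent(tau00 = 1.2)
icc_logit
```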

Caution: For models with random slopes and random intercepts, the ICC would differ at each unit of the predictors. Hence, the ICC for these kinds of models cannot be understood simply as a proportion of variance (see Goldstein et al. 2010). For convenience, since the icc() function also extracts the different random effects variances, the ICC for random-slope-intercept models is reported nonetheless, but it is usually not a meaningful summary of the proportion of variances.
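
To illustrate why the ICC depends on the predictor value in a random-slope model, the following sketch evaluates the between-group variance at a value x as tau00 + 2 * x * tau01 + x^2 * tau11 and divides it by the total variance at that value. All variance components are made-up, and the helper icc_at is hypothetical rather than an sjstats function.

```r
# Hypothetical helper (not part of sjstats): ICC as a function of the
# predictor value x in a model with random intercept and random slope.
#   tau00  = random intercept variance
#   tau11  = random slope variance
#   tau01  = intercept-slope covariance
#   sigma2 = residual variance
icc_at <- function(x, tau00, tau11, tau01, sigma2) {
  between <- tau00 + 2 * x * tau01 + x^2 * tau11
  between / (between + sigma2)
}

# Illustrative values only: the ICC changes as x moves from 0 to 3
icc_by_x <- icc_at(x = 0:3, tau00 = 4, tau11 = 1, tau01 = 0.5, sigma2 = 5)
icc_by_x
```

Only at x = 0 does the result reduce to the plain ratio of intercept variance to total variance, which is why a single reported value is hard to interpret for such models.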

If posterior = FALSE, there is a print()-method that prints the variance parameters using the comp-argument set to "var": print(x, comp = "var") (see 'Examples'). The re_var-function is a convenient wrapper. If posterior = TRUE, the print()-method accepts the arguments prob and digits, which indicate the probability of the uncertainty interval for the ICC and variance components, and the digits in the output (see also 'Examples').

The random effect variances indicate the between- and within-group variances as well as random-slope variance and random-slope-intercept correlation. The components are denoted as follows:

Value

If posterior = FALSE (the default), a numeric vector with all random intercept intraclass-correlation coefficients, or a list of numeric vectors when more than one model is passed as argument. Furthermore, between- and within-group variances as well as random-slope variance are returned as attributes.

If posterior = TRUE, icc() returns a data frame with ICC and variance components for each sample of the posterior distribution.
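
As a rough illustration of what a per-sample ICC looks like, the sketch below simulates stand-in posterior draws of two variance components in base R and computes one ICC per draw. The simulated draws, the column names and the equal-tailed 89% interval are assumptions made for illustration. An HDI, as reported by the package's print()-method, can differ from an equal-tailed interval.

```r
set.seed(123)

# Simulated stand-ins for posterior draws (NOT output of a fitted model)
n_draws <- 1000
draws <- data.frame(
  tau00   = rgamma(n_draws, shape = 20, rate = 10),  # between-group variance
  sigma_2 = rgamma(n_draws, shape = 50, rate = 10)   # residual variance
)

# One ICC value per draw, i.e. one row per posterior sample
draws$icc <- draws$tau00 / (draws$tau00 + draws$sigma_2)

# Summaries: median and equal-tailed 89% interval
icc_median <- median(draws$icc)
icc_interval <- quantile(draws$icc, probs = c(0.055, 0.945))
icc_median
icc_interval
```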

Note

Some notes on why the ICC is useful, based on Grace-Martin:

In short, the ICC can be interpreted as “the proportion of the variance explained by the grouping structure in the population” (Hox 2002: 15).

Usually, the ICC is calculated for the null model ("unconditional model"). However, according to Raudenbush and Bryk (2002) or Rabe-Hesketh and Skrondal (2012) it is also feasible to compute the ICC for full models with covariates ("conditional models") and compare how much a level-2 variable explains the portion of variation in the grouping structure (random intercept).

Caution: For three-level-models, depending on the nested structure of the model, the ICC only reports the proportion of variance explained for each grouping level. However, the proportion of variance for specific levels related to each other (e.g., similarity of level-1-units within level-2-units or level-2-units within level-3-units) must be computed manually. Use get_re_var to get the between-group-variances and residual variance of the model, and calculate the ICC for the various level correlations.

For example, for the ICC between level 1 and 2:
sum(get_re_var(fit)) / (sum(get_re_var(fit)) + get_re_var(fit, "sigma_2"))

or for the ICC between level 2 and 3:
get_re_var(fit)[2] / sum(get_re_var(fit))
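
The two formulas above can be sketched with plain numbers. The variances below are made-up stand-ins for what get_re_var would return, and the element order (level-2 variance first, level-3 variance second) is an assumption that mirrors the indexing used in the formulas.

```r
# Made-up variance components for a three-level model (illustration only)
re_var <- c(level2 = 2, level3 = 1)  # between-group variances
sigma_2 <- 5                         # residual (within-group) variance

# ICC between level 1 and 2: all between-group variance over total variance
icc_l1_l2 <- sum(re_var) / (sum(re_var) + sigma_2)

# ICC between level 2 and 3: level-3 variance over total between-group variance
icc_l2_l3 <- unname(re_var[2] / sum(re_var))

icc_l1_l2
icc_l2_l3
```

For a real model, check which grouping factor each element refers to before indexing by position.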

References

Further helpful online resources:

See Also

re_var

Examples

library(lme4)
fit0 <- lmer(Reaction ~ 1 + (1 | Subject), sleepstudy)
icc(fit0)

# note: ICC for random-slope-intercept model usually not
# meaningful - see 'Details'.
fit1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
icc(fit1)

sleepstudy$mygrp <- sample(1:45, size = 180, replace = TRUE)
fit2 <- lmer(Reaction ~ Days + (1 | mygrp) + (1 | Subject), sleepstudy)
icc(fit2)

# return icc for all models at once
icc(fit0, fit1, fit2)

icc1 <- icc(fit1)
icc2 <- icc(fit2)

print(icc1, comp = "var")
print(icc2, comp = "var")

## Not run: 
# compute ICC for Bayesian mixed model, with an ICC for each
# sample of the posterior. The print()-method then shows
# the median ICC as well as 89% HDI for the ICC.
# Change interval with print-method:
# print(icc(m, posterior = TRUE), prob = .5)

if (requireNamespace("brms", quietly = TRUE)) {
  library(dplyr)
  sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
  sleepstudy <- sleepstudy %>%
    group_by(mygrp) %>%
    mutate(mysubgrp = sample(1:30, size = n(), replace = TRUE))
  # requireNamespace() does not attach brms, so call brm() with its namespace
  m <- brms::brm(
    Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject),
    data = sleepstudy
  )

  # by default, 89% interval
  icc(m, posterior = TRUE)

  # show 50% interval
  print(icc(m, posterior = TRUE), prob = .5, digits = 2)
}
## End(Not run)


[Package sjstats version 0.14.2-3 Index]