In multilevel analyses, it is often necessary to estimate a standardised mean difference (SMD) such that the numerator mean effect comes from a different model than the denominator SD. The numerator could be a covariate-adjusted mean difference and the denominator the SD from the sum of residual and random effect variances in a model without covariate adjustment. For example, statistical analysis guidance for EEF evaluations requires this approach, alongside confidence intervals.
This note simulates data with known sample effect size, covariate-outcome association, and random effect variances, and estimates the SMD using the two packages in R I know that directly do the sums:
TL;DR summary: surprisingly, {eefAnalytics} does not follow EEF guidance. I can, however, replicate what it does using {lmeInfo}. Also I can get {lmeInfo} to provide the answer I expect.
(I’m aware of arguments against SMDs, e.g., Tukey, 1969, and arguments for using an SD from a survey of the population of interest.)
quiet_library <- \(...) library(...) |> suppressPackageStartupMessages()
quiet_library(tidyverse)
quiet_library(lme4)
quiet_library(tictoc)
quiet_library(beepr)
quiet_library(lmeInfo)
quiet_library(eefAnalytics)
quiet_library(nlme)
quiet_library(lmeInfo)
n_schools <- 1000
n_pupils <- 100
n_total <- n_schools * n_pupils
n_treat <- n_total / 2
cov_var <- 0.6 # proportion of variance explained by the pupil covariate
school_var <- 0.3 # proportion of variance explained by school
es <- 1.23 # the effect size we want to simulate
stopifnot(n_total %% 2 == 0)
stopifnot(n_schools %% 2 == 0)
The SMD we’re looking for is 1.23 – keep an eye out for that number!
Note that I’m trying to get the sample to have exactly the properties I want, hence all the scaling to get sample SD = 1 and use of residuals to remove sample correlations.
set.seed(42)
my_scale <- function(...) scale(...) |> as.numeric()
school_sim <- tibble(school = 1:n_schools,
noise_school = rnorm(n_schools))
sim_dat <- tibble(
id = 1:n_total,
school = rep(1:n_schools, n_pupils) |> sort(),
treat = (school <= n_schools / 2) + 0
) |>
left_join(school_sim, by = "school") |>
mutate(
ranef_school = my_scale(lm(noise_school ~ treat)$resid) * sqrt(school_var),
noise1 = lm(rnorm(n_total) ~ ranef_school + factor(school))$resid |>
my_scale(),
noise2 = lm(rnorm(n_total) ~ noise1 + ranef_school + factor(school))$resid |>
my_scale(),
resid = noise1 * sqrt(1 - cov_var - school_var),
covar = noise2 * sqrt(cov_var),
y = resid + ranef_school + covar + es * treat
)
sim_dat |>
select(-starts_with("noise")) |>
pivot_longer(cols = everything()) |>
filter(!name %in% c("id", "school")) |>
group_by(name) |>
summarise(
mean = mean(value),
sd = sd(value),
var = sd^2,
min = min(value),
max = max(value)
) |>
mutate(across(where(is.numeric), \(x) round(x, 2)))
The SD for y is greater than 1 since y is a mixture of normals. We can get (nearly) 1 by calculating the SD by group and pooling:
by_group_summary <- sim_dat |>
group_by(treat) |>
summarise(mean_y = mean(y), sd_y = sd(y), n = n())
by_group_summary |> round(3)
by_group_summary$sd_y^2 |> mean() |> sqrt()
[1] 1.000005
Here’s a picture:
sim_dat |>
ggplot(aes(x = y, fill = factor(treat, labels = c("Control", "Treat")))) +
geom_histogram(alpha = 0.5,
position = "identity",
bins = 100) +
labs(fill = "Group", x = "Outcome", y = "Count") +
theme_minimal()
Here are the covariances:
sim_dat |>
select(-starts_with("noise")) |>
select(-c("id", "school")) |>
cov() |>
round(2) |>
as.data.frame()
(In case you too try this approach, crtFREQ wants the random effect ID to be a numeric and doesn’t like factors.)
This first example includes only treatment group and random effect, with no covariate. It should provide the correct answer, though this isn’t generally a good idea as the confidence interval will be too wide if your covariates explain variation in the outcome:
eefa_uncond <- crtFREQ(
y ~ treat,
random = "school",
intervention = "treat",
data = sim_dat
)
Computing profile confidence intervals ...
eefa_uncond
Model Info:
Method: MLM
Design: CRT
Approach: Frequentist
Function: crtFREQ
---------
Result for: Conditional effect size
$treat1
Estimate 95% LB 95% UB
Within 1.46 1.38 1.54
Total 1.23 1.16 1.30
Result for: Unconditional effect size
$treat1
Estimate 95% LB 95% UB
Within 1.46 1.34 1.58
Total 1.05 0.96 1.14
Please use summary to get more results
should_be_correct <- eefa_uncond$ES$treat1["Total",]$Estimate
stopifnot(round(should_be_correct, 2) == round(es, 2))
As expected, the conditional effect size using the total variance is 1.23, matching the target.
The following call with the covariate included yields too large an answer for the conditional total and too small an answer for the unconditional total effect:
eef_wrong <- crtFREQ(
y ~ treat + covar,
random = "school",
intervention = "treat",
data = sim_dat
)
Computing profile confidence intervals ...
eef_wrong
Model Info:
Method: MLM
Design: CRT
Approach: Frequentist
Function: crtFREQ
---------
Result for: Conditional effect size
$treat1
Estimate 95% LB 95% UB
Within 3.87 3.66 4.08
Total 1.94 1.82 2.07
Result for: Unconditional effect size
$treat1
Estimate 95% LB 95% UB
Within 1.46 1.34 1.58
Total 1.05 0.96 1.14
Please use summary to get more results
too_big_es <- eef_wrong$ES$treat1["Total",]$Estimate
too_small_es <- eef_wrong$Unconditional$ES$treat1["Total",]$Estimate
stopifnot(too_big_es > es)
stopifnot(too_small_es < es)
We’re aiming for 1.23. The total conditional effect size is 1.94 and the unconditional is 1.05.
The model behind the conditional effect size shrinks the residuals, as the covariate explains variance in the outcome. The denominator is smaller than we want, so the SMD is too large.
The model behind the unconditional effect size denominator doesn’t take account of the intervention increasing the between-school variance as it pushes the means apart. We can see that if we examine the school random effects in the unconditional model:
uncond_lmer <- lmer(y ~ (1|school), data = sim_dat)
summary(uncond_lmer)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ (1 | school)
Data: sim_dat
REML criterion at convergence: 253693.2
Scaled residuals:
Min 1Q Median 3Q Max
-4.1978 -0.6704 -0.0001 0.6682 4.7211
Random effects:
Groups Name Variance Std.Dev.
school (Intercept) 0.6718 0.8197
Residual 0.7071 0.8409
Number of obs: 100000, groups: school, 1000
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.61500 0.02606 23.6
Here’s a picture of the school random effects showing what’s going on:
plot_ranefs <- function(mod) {
school_ranefs <- ranef(mod)$school |>
rename("intercept" = "(Intercept)") |>
rownames_to_column("school") |>
mutate(school = as.numeric(school),
treat = (school <= n_schools / 2) + 0)
school_ranefs |>
ggplot(aes(x = intercept, fill = factor(treat, labels = c("Control", "Treat")))) +
geom_histogram(alpha = 0.5,
position = "identity",
bins = 21) +
labs(fill = "Group", x = "School random effect", y = "Count") +
theme_minimal()
}
plot_ranefs(uncond_lmer)
Let’s look again, this time using a model that includes the treatment predictor:
treat_lmer <- lmer(y ~ treat + (1|school), data = sim_dat)
summary(treat_lmer)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ treat + (1 | school)
Data: sim_dat
REML criterion at convergence: 252883.2
Scaled residuals:
Min 1Q Median 3Q Max
-4.1922 -0.6704 -0.0005 0.6683 4.7100
Random effects:
Groups Name Variance Std.Dev.
school (Intercept) 0.2935 0.5418
Residual 0.7071 0.8409
Number of obs: 100000, groups: school, 1000
Fixed effects:
Estimate Std. Error t value
(Intercept) 3.093e-13 2.452e-02 0.00
treat 1.230e+00 3.468e-02 35.47
Correlation of Fixed Effects:
(Intr)
treat -0.707
Note how much smaller the school random effect variance is now (almost the 0.3 I was originally aiming for – see the simulation parameters). The picture:
plot_ranefs(treat_lmer)
This is the model we want for the denominator.
Let’s see if {lmeInfo} can produce the SMD we want. First, fit a model for the numerator (effect estimate) and denominator (SD). I’ve also fitted an intercept-only model, for use later when replicating {eefAnalytics}’s results.
lme_num <- lme(
fixed = y ~ treat + covar,
random = ~ 1 | school,
data = sim_dat,
method = "REML",
control = lmeControl(opt = "optim")
)
lme_den <- lme(
fixed = y ~ treat,
random = ~ 1 | school,
data = sim_dat,
method = "REML",
control = lmeControl(opt = "optim")
)
lme_empty <- lme(
fixed = y ~ 1,
random = ~ 1 | school,
data = sim_dat,
method = "REML",
control = lmeControl(opt = "optim")
)
Estimate the SMD:
correct_es_est <- g_mlm(
lme_num,
p_const = c(0,1,0),
mod_denom = lme_den,
r_const = c(1,1),
infotype = "expected",
separate_variances = FALSE
)
correct_es_est
est se
unadjusted effect size 1.23 0.036
adjusted effect size 1.23 0.036
degree of freedom 10484.71
stopifnot(round(correct_es_est$g_AB, 3) == es)
The estimate is as desired, to three decimal places. In full:
correct_es_est$g_AB
[1] 1.229549
We can also get a CI:
CI_g(correct_es_est) |> round(2)
[1] 1.16 1.30
Let’s see if we can replicate {eefAnalytics}. Here’s the conditional effect size:
conditional_eef_style <- g_mlm(
lme_num,
p_const = c(0,1,0),
mod_denom = lme_num,
r_const = c(1,1),
infotype = "expected",
separate_variances = FALSE
)
conditional_eef_style
est se
unadjusted effect size 1.943 0.064
adjusted effect size 1.943 0.064
degree of freedom 1770.482
stopifnot(round(conditional_eef_style$delta_AB, 2) == round(too_big_es, 2))
1.9433481 matches {eefAnalytics}’s 1.94 to two decimal places.
And the unconditional effect size:
unconditional_eef_style <- g_mlm(
lme_num,
p_const = c(0,1,0),
mod_denom = lme_empty,
r_const = c(1,1),
infotype = "expected",
separate_variances = FALSE
)
unconditional_eef_style
est se
unadjusted effect size 1.047 0.032
adjusted effect size 1.047 0.032
degree of freedom 4077.371
stopifnot(round(unconditional_eef_style$g_AB, 2) == round(too_small_es, 2))
1.0472727 again matches {eefAnalytics}’s 1.05 to two decimal places.
The CIs:
CI_g(conditional_eef_style) |> round(2)
[1] 1.82 2.07
CI_g(unconditional_eef_style) |> round(2)
[1] 0.99 1.11
The CI matches for the conditional total but not the unconditional total. Here are the corresponding estimates again from {eefAnalytics}:
eef_wrong$ES$treat1["Total", c("95% LB", "95% UB")]
eef_wrong$Unconditional$ES$treat1["Total", c("95% LB", "95% UB")]
beep(2) # Wakey wakey
sessionInfo()
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.utf8 LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8
time zone: Europe/London
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] nlme_3.1-164 eefAnalytics_1.1.1 lmeInfo_0.3.2 beepr_2.0
[5] tictoc_1.2.1 lme4_1.1-35.5 Matrix_1.7-0 lubridate_1.9.3
[9] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
[13] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1
[17] tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] gridExtra_2.3 inline_0.3.19 rlang_1.1.4 magrittr_2.0.3
[5] matrixStats_1.3.0 compiler_4.4.1 loo_2.8.0 vctrs_0.6.5
[9] reshape2_1.4.4 pkgconfig_2.0.3 fastmap_1.2.0 backports_1.5.0
[13] labeling_0.4.3 utf8_1.2.4 threejs_0.3.3 promises_1.3.0
[17] markdown_1.13 tzdb_0.4.0 nloptr_2.1.1 xfun_0.46
[21] jsonlite_1.8.8 later_1.3.2 parallel_4.4.1 R6_2.5.1
[25] dygraphs_1.1.1.6 stringi_1.8.4 StanHeaders_2.32.10 boot_1.3-30
[29] Rcpp_1.0.13 rstan_2.32.6 knitr_1.48 zoo_1.8-12
[33] audio_0.1-11 base64enc_0.1-3 bayesplot_1.11.1 httpuv_1.6.15
[37] splines_4.4.1 igraph_2.0.3 timechange_0.3.0 tidyselect_1.2.1
[41] rstudioapi_0.16.0 abind_1.4-5 codetools_0.2-20 miniUI_0.1.1.1
[45] curl_5.2.1 pkgbuild_1.4.4 lattice_0.22-6 plyr_1.8.9
[49] shiny_1.9.1 withr_3.0.1 posterior_1.6.0 survival_3.7-0
[53] RcppParallel_5.1.8 xts_0.14.0 pillar_1.9.0 tensorA_0.36.2.1
[57] checkmate_2.3.2 DT_0.33 stats4_4.4.1 shinyjs_2.1.0
[61] distributional_0.4.0 generics_0.1.3 hms_1.1.3 rstantools_2.4.0
[65] munsell_0.5.1 scales_1.3.0 minqa_1.2.8 gtools_3.9.5
[69] xtable_1.8-4 glue_1.7.0 tools_4.4.1 shinystan_2.6.0
[73] colourpicker_1.3.0 mvtnorm_1.2-6 grid_4.4.1 QuickJSR_1.3.1
[77] crosstalk_1.2.1 colorspace_2.1-1 cli_3.6.3 fansi_1.0.6
[81] V8_5.0.0 gtable_0.3.5 digest_0.6.36 farver_2.1.2
[85] htmlwidgets_1.6.4 htmltools_0.5.8.1 lifecycle_1.0.4 mime_0.12
[89] rstanarm_2.32.1 shinythemes_1.2.0 MASS_7.3-60.2