This wrapper function pools effect sizes simultaneously using several meta-analytic approaches.
Usage
runMetaAnalysis(data,
                # Models to run
                which.run = c("overall", "combined",
                              "lowest.highest", "outliers",
                              "influence", "rob", "threelevel",
                              "threelevel.che"),
                # Effect size measure
                es.measure = c("g", "RR"),
                es.type = c("precalculated", "raw"),
                es.var = ifelse(identical(es.measure[1], "g"),
                                ".g", ".log_rr"),
                se.var = ifelse(identical(es.measure[1], "g"),
                                ".g_se", ".log_rr_se"),
                es.binary.raw.vars = c(".event_arm1", ".event_arm2",
                                       ".totaln_arm1", ".totaln_arm2"),
                # Estimator of the heterogeneity variance
                method.tau = "REML",
                method.tau.ci = "Q-Profile",
                i2.ci.boot = FALSE,
                nsim.boot = 5e3,
                hakn = TRUE,
                # Data specifications
                study.var = "study",
                arm.var.1 = "condition_arm1",
                arm.var.2 = "condition_arm2",
                measure.var = "instrument",
                low.rob.filter = "rob > 2",
                round.digits = 2,
                rob.data = NULL,
                # Model specifications
                which.combine = c("arms", "studies"),
                which.combine.var = "multi_arm1",
                which.outliers = c("overall", "combined"),
                which.influence = c("overall", "combined"),
                which.rob = c("overall", "combined"),
                nnt.cer = 0.2,
                rho.within.study = 0.6,
                phi.within.study = 0.9,
                w1.var = ifelse(identical(es.measure[1], "g"),
                                "n_arm1", "totaln_arm1"),
                w2.var = ifelse(identical(es.measure[1], "g"),
                                "n_arm2", "totaln_arm2"),
                time.var = "time_weeks",
                vcov = c("simple", "complex"),
                near.pd = FALSE,
                use.rve = TRUE,
                # Output
                html = TRUE,
                # Additional arguments
                ...)
Arguments
- data: data.frame. Effect size data in the wide format, as created by calculateEffectSizes().
- which.run: character. Selection of models to be calculated. See 'Details'.
- es.measure: character. Should meta-analyses be calculated using the bias-corrected standardized mean difference ("g"; default) or using risk ratios ("RR")? Meta-analyses will only be conducted using comparisons that contain non-NA values in the es.var and se.var columns.
- es.type: character. Should pre-calculated or raw event data (i.e., the Mantel-Haenszel method) be used for meta-analyses of risk ratios? Can be set to "precalculated" (default) or "raw".
- es.var: character. Specifies the name of the variable in data containing the (pre-calculated) effect size data. When es.measure = "g", this is set to ".g" by default; ".log_rr" is used when es.measure = "RR". The default settings correspond with the standard output of calculateEffectSizes().
- se.var: character. Specifies the name of the variable containing the (pre-calculated) standard errors (square root of the variance) of the effect size metric defined in es.var. If es.measure = "g", this is automatically set to ".g_se"; if es.measure = "RR", ".log_rr_se" is used. The default settings correspond with the standard output of calculateEffectSizes().
- es.binary.raw.vars: character vector, defining the column names in which (1) the raw event counts in the experimental group, (2) the raw event counts in the control/reference group, (3) the sample size in the experimental group, and (4) the sample size in the control/reference group are stored. Defaults correspond with the standard output of calculateEffectSizes().
- method.tau: character. A character string indicating which method is used to estimate the between-study variance (tau-squared) and its square root (tau). Either "REML" (default), "DL", "PM", "ML", "HS", "SJ", "HE", or "EB"; can be abbreviated (see metagen). Use "FE" to run a fixed-effect/"common effect" model.
- method.tau.ci: character. A character string indicating which method is used to estimate the confidence interval of the between-study heterogeneity variance \(\tau^2\). Either "Q-Profile" (default and recommended; Viechtbauer, 2017), "BJ", "J", or "PL"; can be abbreviated. See metagen and rma.uni for details.
- i2.ci.boot: logical. Confidence intervals for \(\tau^2\) as calculated by the Q-Profile method are not directly applicable to three-level models, in which two heterogeneity variance components are estimated. By default, this argument is therefore set to FALSE, and no confidence intervals around \(\tau^2\) and \(I^2\) are provided for the "threelevel" and "threelevel.che" models. If this argument is set to TRUE, parametric bootstrapping is used to calculate confidence intervals around the between- and within-study heterogeneity estimates (\(\tau\) and \(I^2\)). Please note that this can take several minutes, depending on the number of effect sizes. If correctPublicationBias() is used and i2.ci.boot is TRUE, bootstrapping will also be used to calculate confidence intervals around the \(G^2\) statistic (Rücker et al., 2011) used in the limit meta-analysis (note that \(G^2\) is printed as \(I^2\) in this package).
- nsim.boot: numeric. Number of bootstrap samples to be drawn when i2.ci.boot is TRUE. Defaults to 5000.
- hakn: logical. Should the Knapp-Hartung adjustment for effect size significance tests be used? Default is TRUE.
- study.var: character. The name of the variable in data in which the study IDs are stored.
- arm.var.1: character. The name of the variable in data in which the condition (e.g., "guided iCBT") of the first arm within a comparison is stored.
- arm.var.2: character. The name of the variable in data in which the condition (e.g., "wlc") of the second arm within a comparison is stored.
- measure.var: character. The name of the variable in data in which the instrument used for the comparison is stored.
- low.rob.filter: character. A filtering statement by which to include studies for the "low RoB only" analysis. Note that the variable used in the statement must be included as a column in data.
- round.digits: numeric. Number of digits to round the (presented) results by. Default is 2.
- rob.data: list. Optional list detailing how risk of bias data should be appended to the model (see 'Details').
- which.combine: character. Should multiple effect sizes within one study be pooled on an "arms" (default) or "studies" level? When a study is a multi-arm trial, setting which.combine = "arms" will aggregate the effect sizes for each trial arm individually before pooling; the which.combine.var argument can be used to control which effects within a study should be aggregated. When which.combine = "studies", one overall aggregated effect is created for each study. This setting is preferable from a statistical perspective, since it ensures that all pooled effects can be assumed to be independent.
- which.combine.var: character. Additional grouping variable within studies to be used for the "combined" analysis when which.combine = "arms". If the specified variable differs within one study (as defined by study.var), effects will be aggregated separately for each unique value in which.combine.var. Defaults to "multi_arm1", the variable which encodes multi-arm intervention conditions in the Metapsy data standard.
- which.outliers: character. Which model should be used to conduct outlier analyses? Must be "overall" or "combined", with "overall" being the default.
- which.influence: character. Which model should be used to conduct influence analyses? Must be "overall" or "combined", with "overall" being the default.
- which.rob: character. Which model should be used to conduct the "low risk of bias only" analyses? Must be "overall" or "combined", with "overall" being the default.
- nnt.cer: numeric. Value between 0 and 1, indicating the assumed control group event rate to be used for calculating NNTs via the Furukawa-Leucht method.
- rho.within.study: numeric. Value between 0 and 1, indicating the assumed correlation of effect sizes within studies. This is needed to combine effect sizes for the "combined" analysis type, and is used to estimate the variance-covariance matrices needed for the correlated and hierarchical effects (CHE) three-level model. Default is 0.6.
- phi.within.study: numeric. Value between 0 and 1, indicating the assumed one-week autocorrelation of effect sizes. This is only used when vcov = "complex" to approximate the variance-covariance matrices needed for the "combined" and "threelevel.che" models. Default is 0.9. See 'Details'.
- w1.var: character. Name of the variable in data in which the sample sizes of the first arm are stored. See 'Details'.
- w2.var: character. Name of the variable in data in which the sample sizes of the second arm are stored. See 'Details'.
- time.var: character. Name of the variable in data in which the assessment time point is stored. Should be expressed as weeks since randomization; other units (e.g., days, months) are also possible if phi.within.study is specified accordingly. See 'Details'.
- vcov: character. For the "combined" and "threelevel.che" models, should variance-covariance matrices (representing the dependency structure of the data) be approximated using a heterogeneous compound symmetry structure ("simple"; default) or an unstructured matrix ("complex")? See 'Details'.
- near.pd: logical. If at least one of the study variance-covariance matrices constructed when vcov = "complex" is not positive definite/invertible, should the nearPD function be used to compute the nearest positive definite matrix? Default is FALSE.
- use.rve: logical. Should robust variance estimation be used to calculate confidence intervals and tests of three-level models? TRUE by default.
- html: logical. Should an HTML table be created for the results? Default is TRUE.
- ...: Additional arguments.
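To illustrate, a call that adjusts a few commonly changed arguments might look like this (a sketch using the defaults documented above; the data object is assumed to follow the Metapsy data standard):

runMetaAnalysis(data,
                which.run = c("overall", "combined"),
                es.measure = "g",
                method.tau = "PM",         # Paule-Mandel estimator
                hakn = TRUE,               # Knapp-Hartung adjustment
                which.combine = "studies", # aggregate effects per study
                nnt.cer = 0.2)             # assumed control group event rate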
Value
Returns an object of class "runMetaAnalysis". Among other things, this object includes a data.frame named summary, in which all results are summarized, including the studies that were removed for some analysis steps. The object also contains the "raw" model objects returned by all selected analysis types. This allows further operations to be conducted on specific models (e.g., running a meta-regression by plugging one of the model objects into meta:::update.meta).
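For example, assuming the result was saved as res (a minimal sketch; the model.overall slot is also used in the 'Examples' below, and update.meta passes arguments on to meta::metagen):

res <- runMetaAnalysis(data)
res$summary                      # data.frame summarizing all model results
m.overall <- res$model.overall   # "raw" model object of the "overall" analysis
meta::update.meta(m.overall,
                  method.tau = "PM")  # e.g., refit with another tau estimator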
Details
The runMetaAnalysis function is a wrapper for several types of meta-analytic models that are typically used. It allows all of these models to be run in one step, generating results that are somewhat closer to being "publication-ready".
By default, the following models are calculated (a subset can be requested via the which.run argument; see the sketch after this list):

- "overall": Runs a generic inverse-variance (random-effects) model. All included effect sizes are treated as independent. When es.measure = "RR" and es.type = "raw", the Mantel-Haenszel method is used for pooling instead.
- "combined": Combines all effect sizes within one study (defined by study.var) before pooling. This ensures that all effect sizes are independent (i.e., unit-of-analysis error and double-counting are avoided). To combine the effects, one has to assume a correlation of effect sizes within studies, empirical estimates of which are typically not available.
- "lowest.highest": Runs meta-analyses with only (i) the lowest and (ii) the highest effect size within each study included.
- "outliers": Runs a meta-analysis without statistical outliers (i.e., effect sizes for which the confidence interval does not overlap with the confidence interval of the overall effect).
- "influence": Runs a meta-analysis without influential cases (see influence.rma.uni for details).
- "rob": Runs a meta-analysis with only low-RoB studies included.
- "threelevel": Runs a multilevel (three-level) meta-analysis model, with effect sizes nested in studies.
- "threelevel.che": Runs a multilevel (three-level) meta-analysis model, with effect sizes nested in studies. Variance-covariance matrices of each study with two or more effect sizes are estimated using rho.within.study as the assumed overall within-study correlation. This imputation allows a "correlated and hierarchical effects" (CHE) model to be fitted, which is typically a good approximation for data sets with unknown and/or complex dependence structures.
Internally, the overall, combined, lowest.highest, outliers, influence, and rob models are fitted by calling the meta::metagen() or meta::metabin() function in {meta} (Balduzzi, Rücker & Schwarzer, 2019). The threelevel and threelevel.che models are implemented using metafor::rma.mv() in {metafor} (Viechtbauer, 2005). Outlier selection is implemented using the dmetar::find.outliers() function, and influence analyses using the dmetar::InfluenceAnalysis() function. The latter function is a wrapper for metafor::influence.rma.uni().
Simple or complex variance-covariance approximation
The vcov argument controls whether the effect size dependencies within the data should be approximated using a "simple" (default) or a more "complex" (but potentially more accurate) method. This argument is only relevant for the "combined" and "threelevel.che" models. The default "simple" method constructs variance-covariance matrices \(\Sigma_k\) for each study using a constant sampling correlation \(\rho\) (defined by rho.within.study), which is identical across all studies, outcomes, and time points. This simplifying assumption is part of the formulation of the CHE model originally provided by Pustejovsky and Tipton (2022).
Naturally, employing a common value of \(\rho\) across all studies may not be reasonable in some analyses, and other information may be available to better approximate the effect size dependencies in the collected data. Setting vcov to "complex" allows correlations between effect sizes to differ conditional on the type of dependency. This means that the variance-covariance matrix \(\Sigma_k\) of some study \(k\) is approximated by an unstructured matrix with varying \(\rho_{ij}\) (instead of a heterogeneous compound symmetry matrix with fixed \(\rho\), as is used when vcov = "simple").
\[\begin{array}{cc}\texttt{vcov="simple"} & \texttt{vcov="complex"} \\ \Sigma_k = \begin{bmatrix} \sigma^2_1 & & & \\ \rho \sigma_2 \sigma_1 & \sigma^2_2 & & \\ \rho \sigma_3 \sigma_1 & \rho \sigma_3 \sigma_2 & \sigma^2_3 & \\ \rho \sigma_4 \sigma_1 & \rho \sigma_4 \sigma_2 & \rho \sigma_4 \sigma_3 & \sigma^2_4 \end{bmatrix} & \Sigma_k = \begin{bmatrix} \sigma^2_1 & & & \\ \rho_{21} \sigma_2 \sigma_1 & \sigma^2_2 & & \\ \rho_{31} \sigma_3 \sigma_1 & \rho_{32} \sigma_3 \sigma_2 & \sigma^2_3 & \\ \rho_{41} \sigma_4 \sigma_1 & \rho_{42} \sigma_4 \sigma_2 & \rho_{43} \sigma_4 \sigma_3 & \sigma^2_4 \end{bmatrix} \end{array}\]
For example, setting vcov = "complex" makes it possible to additionally incorporate assumed correlations specific to repeated testing over time (e.g., correlations between effects at post-test and long-term follow-up). The value provided in phi.within.study represents the (auto-)correlation coefficient \(\phi\), which serves as a rough estimate of the re-test correlation after 1 week. When a vector of follow-up lengths is provided in time.var, this allows the gradual decrease in correlation between measurements over time to be modeled. Furthermore, it is possible to calculate a correlation coefficient \(\rho_w\) for multi-arm trials, which is directly proportional to the size of each individual trial arm. When all trial arms have the same size, meaning that each arm's weight \(w\) is identical, \(\rho_w\) is known to be 0.5. Multi-arm weights \(w\) (and thus \(\rho_w\)) can be derived if the w1.var and w2.var variables, containing the sample size of each study arm, are provided.
Using the complex approximation method increases the risk that at least one study's \(\Sigma_k\) matrix is not positive definite. In this case, the function automatically falls back to the constant sampling correlation approximation.
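As a sketch, the complex approximation could be requested like this (variable names are the package defaults for es.measure = "g"):

runMetaAnalysis(data,
                which.run = c("combined", "threelevel.che"),
                vcov = "complex",
                phi.within.study = 0.9,   # assumed 1-week retest correlation
                time.var = "time_weeks",  # follow-up lengths in weeks
                w1.var = "n_arm1",        # arm sizes used to derive rho_w
                w2.var = "n_arm2",
                near.pd = TRUE)           # repair non-positive definite matrices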
Replacement functions
Once a model has been fitted using runMetaAnalysis, replacement functions are defined for each function argument. These allow one or more analysis settings to be tweaked quickly; the changes are implemented once the rerun function is called. Say that we saved the results of runMetaAnalysis in an object m. If, for example, we want to check the results using a different estimator of \(\tau^2\), leaving all other settings the same, we could run method.tau(m) <- "PM", followed by rerun(m). This would provide results using the Paule-Mandel estimator. A list of all available setting replacement functions is provided in the package documentation.
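Written out, this pattern looks as follows (assuming a fitted object m as above; the hakn replacement function is also shown in the 'Examples' below):

m <- runMetaAnalysis(data)   # fit with default settings
method.tau(m) <- "PM"        # switch to the Paule-Mandel estimator
hakn(m) <- FALSE             # optionally change further settings
rerun(m)                     # refit all selected models with the new settings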
Risk of Bias data
Using the rob.data argument, it is possible to specify which variables in the data set provided in data contain risk of bias (ROB) assessment information. If specified, forest plots with added ROB information can be generated when using plot.runMetaAnalysis(). ROB information added this way can also be used to create ROB summary plots using createRobSummary().
The object provided to rob.data must be a list with these elements:

- domains: A vector of characters, specifying variables that contain ratings for different ROB domains (e.g., allocation concealment, missing data handling, ...).
- domain.names (Optional): A vector of characters with the same length as domains, which provides long-format labels for each included domain.
- overall.rob (Optional): A single character specifying the variable in data that contains the overall ROB rating.
- categories: A vector of characters, specifying how ROB ratings were coded in the selected variables (e.g., "low", "high", "unclear").
- symbols (Optional): A vector of single-letter characters (or symbols) that should be used when plotting ROB ratings in the forest plot.
- colors (Optional): A vector of characters of the same length as categories, specifying colors for each rating code.
For concrete usage examples of this functionality, see "Examples".
For more details, see the Get Started vignette.
Author
Mathias Harrer mathias.h.harrer@gmail.com, Paula Kuper paula.r.kuper@gmail.com, Pim Cuijpers p.cuijpers@vu.nl
Examples
if (FALSE) {
data("depressionPsyCtr")
library(meta)
depressionPsyCtr %>%
checkDataFormat() %>%
checkConflicts() %>%
calculateEffectSizes() %>%
filterPoolingData(condition_arm2 %in%
c("wl", "other ctr")) -> data
# Run the meta-analyses
runMetaAnalysis(data) -> res
# Use replacement function to show results for
# differing settings
method.tau(res) <- "PM"
hakn(res) <- FALSE
rerun(res)
# Show summary
res
# Show forest plot (by default, "overall" is used)
plot(res)
# Show forest plot of specific analysis
plot(res, "outliers")
plot(res, "threelevel")
plot(res, "baujat")
plot(res, "influence")
plot(res, "lowest.highest")
# Extract specific model and do further calculations
# (e.g. meta-regression on 'year')
metaRegression(res$model.overall, ~ scale(year))
# Conduct a subgroup analysis
subgroupAnalysis(res, country)
# Correct for publication bias/small-study effects
correctPublicationBias(res)
# For the combined analysis, set which.combine to
# "studies" here, so that all effects in a study are aggregated
# first before pooling
data %>%
runMetaAnalysis(which.combine = "studies") %>%
plot("combined")
# Define ROB data to be added to the models
robData <- list(
# Names of ROB variables included in 'data'
domains = c("sg", "ac", "ba", "itt"),
# Long-format labels for each ROB domain
domain.names = c("Sequence Generation",
"Allocation Concealment",
"Blinding of Assessors",
"ITT Analyses"),
# Codes used to rate the risk of bias (sr=self-report)
categories = c("0", "1", "sr"),
# Symbols that should be used for these codes in forest plots
symbols = c("-", "+", "s"),
# Colors to be used in forest plots for each of these codes
colors = c("red", "green", "yellow"))
# Re-run model with appended ROB data
res <- runMetaAnalysis(data, rob.data = robData)
# Generate forest plot with ROB data
plot(res, "combined")
# Create a summary plot
createRobSummary(res,
name.low = "1",
name.high = "0",
name.unclear = "sr",
which.run = "combined")
# Run meta-analysis using raw response rate data
data %>%
runMetaAnalysis(es.measure = "RR",
es.type = "raw")
}