In this section, we will discuss a number of meta-analytic techniques. I will demonstrate how to perform each in R as we go along. We will primarily be using the *metafor* package to perform these exercises.

The traditional form of conducting a research synthesis is performing a literature review. However, there are now also a number of procedural and statistical techniques for synthesizing the results of studies. In experiments, this is most commonly done by:

1.) *Replicating* experiments (i.e., conducting the same experiment while holding treatments and outcomes constant) with different subjects, ideally drawn from the same population.

2.) Pooling the findings from these experiments into a single across-study treatment effect. The resulting increase in sample size also yields an increase in precision.

From *Gerber and Green*: “The attraction of meta-analysis is that a series of small experiments may each be unable to speak to a hypothesis with precision, but when pooled together, these experiments may suggest a clear conclusion.”

Meta-analysis began in the 1980s as a way to synthesize educational and psychological research, but has since expanded, particularly in the medical and social sciences.

In medicine, the Cochrane Collaboration was started in 1993, and today contains thousands of systematic reviews of medical interventions. This kind of research synthesis is considered by many to be the gold standard for determining the effectiveness of different health care interventions.

Examples of meta-analysis in political science:

- The Metaketa Initiative: “A collaborative research model aimed at improving the accumulation of knowledge from field experiments on topics where academic researchers and policy practitioners share substantive interests.” The Metaketa Initiative has completed or is in the process of conducting coordinated field experiments conducive to meta-analysis on the topics of: (1) information and accountability, (2) taxation, (3) natural resource governance, (4) community policing, and (5) women’s action committees and local services.
- Costa, Mia. “How responsive are political elites? A meta-analysis of experiments on public officials.” *Journal of Experimental Political Science* 4.3 (2017): 241-254.
- Doucouliagos, Hristos, and Mehmet Ali Ulubaşoğlu. “Democracy and economic growth: A meta-analysis.” *American Journal of Political Science* 52.1 (2008): 61-83.
- Dunning, Thad, et al. “Voter information campaigns and political accountability: Cumulative findings from a preregistered meta-analysis of coordinated trials.” *Science Advances* 5.7 (2019): eaaw2612.
- Kalla, Joshua L., and David E. Broockman. “The minimal persuasive effects of campaign contact in general elections: Evidence from 49 field experiments.” *American Political Science Review* 112.1 (2018): 148-166.
- Lau, Richard R., et al. “The effects of negative political advertisements: A meta-analytic assessment.” *American Political Science Review* 93.4 (1999): 851-875.
- Many more that I’m missing due to my own selection bias.

*Assumptions about subjects*: Are subjects from different studies really participants in the same grand experiment? Meta-analysis is most convincing when (1) subjects are drawn randomly from the same population or (2) we believe there is very little treatment effect heterogeneity.

*Selection bias and publication bias*: Are we confident that we have found all of the relevant studies? What if the studies we are able to find are correlated with the size of the treatment effects? A common example of this is *publication bias*: null results are less likely to be published than studies with large and significant treatment effects. Aggregating results from published studies only can therefore exaggerate our estimate of the “true” treatment effect.

Under *fixed effects* models, each study is assumed to differ from the others only in drawing its own sample of observations from the total population. Observed studies therefore yield different effect sizes only because of sampling error. Fixed-effects models do not assume that the true effects are homogeneous (this is sometimes erroneously stated). In other words, fixed-effects models provide perfectly valid inferences under heterogeneity, as long as we restrict our inferences about the average effect size to the set of studies included in the meta-analysis.

Under *random effects* models, we don’t assume that there is just one population effect size. Instead, a distribution of population effect sizes exists, generated by a distribution of possible study realizations. In other words, observed outcomes in studies would differ from each other not just because of sampling error, but also because they reflect true, underlying population differences. In contrast to the fixed-effects model, random/mixed-effects models provide an unconditional inference about a larger set of studies from which the \(k\) studies included in the meta-analysis are assumed to be a random sample. We typically do not assume that this larger set consists only of studies that have actually been conducted; instead, we envision a hypothetical population of studies comprising studies that have been conducted, that could have been conducted, or that may be conducted in the future.

- Which should you use?
- It depends on the inferences and assumptions you want to make.

- The fixed effects model assumes studies are conducted identically except for using different research participants (that is, except for sampling error).
- The random effects model allows inferences about what the results of the studies might have been had they been conducted with other participants and with changes in other study features, such as the treatment, setting, or outcome measures.
- Which of the above seems more plausible in the social sciences?

**Answer:** Some argue that random effects models are preferred on conceptual grounds because they better reflect the inherent uncertainty in meta-analytic inference, and because they reduce to the fixed-effects model when the variance component is zero.


\[y_i = \theta_i + e_i\] where \(y_i\) denotes the observed effect in the \(i\)th study, \(\theta_i\) is the corresponding (unknown) true effect, \(e_i\) is the sampling error, and \(e_i \sim N(0, v_i)\).

A fixed effect model tells us: how large is the average true effect in the set of \(k\) studies included in the meta-analysis?

This is typically estimated using weighted least squares, where:

\[\hat{\theta} = \frac{\sum_{i=1}^k w_i \, y_i}{\sum_{i=1}^k w_i}\]

where \(w_i\) represents the weight assigned to each study. These weights are typically equal to \(w_i = \frac{1}{v_i}\), the inverse of the within-study variance (the square of the standard error) of the estimated effect. Note that this sampling variance is inversely proportional to the within-study sample size (e.g., the sampling variance of a sample mean is \(\sigma^2 / n\)). Therefore, the larger the sample, the smaller the variance, and the more precise the estimate of the effect size. Hence, larger weights are assigned to effect sizes from studies with larger within-study sample sizes.

When all observed effect sizes (\(y_i\)) estimate a single population parameter, as is hypothesized under a fixed effects model, then \(\hat{\theta}\) is an unbiased estimate of that population parameter.
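
A minimal sketch of this computation in base R, using made-up effect estimates and sampling variances (the numbers are purely illustrative):

```
# hypothetical effect estimates and sampling variances for k = 5 studies
yi <- c(0.30, 0.10, 0.45, 0.25, 0.15)
vi <- c(0.04, 0.02, 0.09, 0.03, 0.05)

wi <- 1 / vi                         # inverse-variance weights
theta_hat <- sum(wi * yi) / sum(wi)  # fixed-effect pooled estimate
se_hat <- sqrt(1 / sum(wi))          # standard error of the pooled estimate
c(estimate = theta_hat, se = se_hat)
```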

Same as above, but now \(\theta_i\) is not fixed, but is itself random and has its own distribution:

\[\theta_i = \mu + u_i\] where \(u_i \sim N(0, \tau^2)\). Here \(u_i\) is the study-specific deviation from the average true effect, so \(\tau^2\) can be thought of as the between-studies variance. Therefore, the true effects are assumed to be normally distributed with mean \(\mu\) and variance \(\tau^2\).

\(\hat{\mu}\) is still estimated as a weighted average:

\[\hat{\mu} = \frac{\sum_{i=1}^k w_i \, y_i}{\sum_{i=1}^k w_i}\]

But now \(w_i = \frac{1}{v_i + \hat{\tau}^2}\), where \(\hat{\tau}^2\) is an estimate of \(\tau^2\).

This implies that random effects models follow a two-stage process: (1) estimate the amount of heterogeneity \(\tau^2\) using one of a number of proposed estimators, and (2) estimate \(\mu\) using WLS.
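
As an illustration, here is a sketch of this two-stage process using the DerSimonian-Laird estimator of \(\tau^2\) (one common choice among the proposed estimators), reusing the toy `yi` and `vi` vectors from the earlier sketch:

```
# stage 1: DerSimonian-Laird estimate of tau^2
wi <- 1 / vi
theta_fe <- sum(wi * yi) / sum(wi)
Q <- sum(wi * (yi - theta_fe)^2)  # Cochran's Q statistic
k <- length(yi)
tau2_hat <- max(0, (Q - (k - 1)) / (sum(wi) - sum(wi^2) / sum(wi)))

# stage 2: re-weight by 1 / (vi + tau2_hat) and estimate mu by WLS
wi_re <- 1 / (vi + tau2_hat)
mu_hat <- sum(wi_re * yi) / sum(wi_re)
c(tau2 = tau2_hat, mu = mu_hat)
```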

The true effects are therefore assumed to be normally distributed with mean \(\mu\) and variance \(\tau^2\). The goal is then to estimate \(\mu\), the average true effect, and \(\tau^2\), the (total) amount of heterogeneity among the true effects. If \(\tau^2 = 0\), this implies homogeneity among the true effects (i.e., \(\theta_1 = \dots = \theta_k \equiv \theta\)), so that \(\mu = \theta\) denotes the true effect.

What does this last statement imply about the difference between fixed and random effects meta-analysis under treatment effect homogeneity?

**Answer:** The two models will provide the same estimate, since \(\mu = \theta\).

Since the variance under random effects incorporates the same sampling error as under fixed effects plus an additional between-studies component, it cannot be less than the variance under the fixed effect model. As long as the between-studies variation is non-zero, the variance, standard error, and confidence interval will therefore always be larger under random effects.
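
To see both models in action in *metafor*, here is a sketch using the BCG vaccine dataset bundled with the package (the dataset, `escalc()`, and `rma()` are metafor's; the side-by-side comparison is just for illustration):

```
library(metafor)

# compute log relative risks and sampling variances from the BCG trials
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)

res_fe <- rma(yi, vi, data = dat, method = "FE")    # fixed-effects model
res_re <- rma(yi, vi, data = dat, method = "REML")  # random-effects model

# compare: the random-effects standard error and CI are at least as wide
summary(res_fe)
summary(res_re)
```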

A robust literature explores how to detect and correct for publication bias in meta-analysis. Which method is the best in each circumstance remains a subject of active debate.

The following tests have been proposed to detect publication bias:

- *p*-curve: The *p*-curve is based on the premise that only “significant” results are typically published, and depicts the distribution of statistically significant *p*-values for a set of published studies. The shape of the *p*-curve indicates whether the results of a set of studies are derived from true effects or from publication bias. If *p*-values cluster just below 0.05 (i.e., the *p*-curve is left skewed), this may be evidence of *p*-hacking, indicating that studies with *p*-values just below 0.05 are selectively reported. If the *p*-curve is right skewed, with more very low *p*-values (e.g., below 0.01), this is evidence of true effects.

- Examination of funnel plot asymmetry: A funnel plot depicts the outcome from each study on the x-axis and its corresponding standard error on the y-axis. The chart is overlaid with an inverted, triangular confidence interval region (i.e., the funnel), which should contain 95% of the studies if there is no bias or between-study heterogeneity. If studies with insignificant results remain unpublished, the funnel plot may be asymmetric.
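
Both diagnostics are easy to inspect in *metafor* (a sketch continuing with the `res_re` model fitted above; `funnel()` and `regtest()` are metafor functions):

```
# funnel plot of the random-effects model fitted earlier
funnel(res_re)

# Egger-type regression test for funnel plot asymmetry
regtest(res_re)
```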

The following estimates have been proposed to correct for publication bias:

- *p*-curve estimation: The *p*-curve can also use the degree of right skew to estimate the average effect size corrected for publication bias.

- The trim and fill method: Trim-and-fill iteratively removes (i.e., trims) observations from one side of the funnel plot until a criterion for symmetry is met, and then fills the funnel plot back in with both the observed studies and imputed studies reflected about the mean. Standard meta-analytic methods can then be applied to the combined set of observed and imputed studies (see the R sketch after the PET-PEESE function below).

- PET, PEESE, and PET-PEESE:
- PET is a weighted-least-squares regression where effect size is regressed on its standard error: \(d_i = b_0 + b_1 se_i + e_i\), where \(b_0\) and \(b_1\) are the intercept and slope describing the linear relationship between the \(i\)th effect size estimate \(d_i\) and its associated standard error \(se_i\). The regression is weighted by the inverse of the variance (i.e., the squared standard error) of the effect size estimates.
- PEESE is the weighted-least-squares regression model where effect size is regressed on the square of the standard error: \(d_i = b_0 + b_1 se^2_i + e_i\).
- Simulation studies suggest that PET outperforms PEESE when the true underlying effect is zero, whereas PEESE outperforms PET when the true underlying effect is non-zero. We therefore now typically use PET-PEESE. PET-PEESE considers the statistical significance of the PET estimate to decide whether PET or PEESE is taken as the final estimate. When the estimate from PET is statistically non-significant, the PET estimate is taken. When the estimate from PET is statistically significant, the PEESE estimate is used.
- PET-PEESE as an R function:

```
# PET-PEESE: report the PET estimate when its intercept is non-significant,
# otherwise report the PEESE estimate. Expects a data frame with columns
# `ate` (effect estimates), `se` (standard errors), and `var` (= se^2).
petpeese <- function(dataset) {
  pet   <- lm(ate ~ se,  weights = 1 / var, data = dataset)  # PET: effect on SE
  peese <- lm(ate ~ var, weights = 1 / var, data = dataset)  # PEESE: effect on variance
  int_pet   <- coef(pet)[1]
  se_pet    <- summary(pet)$coefficients[1, 2]
  int_peese <- coef(peese)[1]
  se_peese  <- summary(peese)$coefficients[1, 2]
  p_pet     <- summary(pet)$coefficients[1, 4]  # p-value of the PET intercept
  petpeese_int <- ifelse(p_pet > .05, int_pet, int_peese)
  petpeese_se  <- ifelse(p_pet > .05, se_pet, se_peese)
  c(estimate = petpeese_int, se = petpeese_se)
}
```
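
To make the corrections concrete, here is a sketch assuming `res_re` and `dat` from the earlier metafor example are still in the workspace; `trimfill()` is a metafor function, while the reshaped data frame for `petpeese()` is my own illustrative construction:

```
# trim-and-fill on the random-effects model fitted earlier
tf <- trimfill(res_re)
summary(tf)  # pooled estimate after adding imputed studies
funnel(tf)   # imputed studies appear as unfilled points

# PET-PEESE on the same data, reshaped into the columns petpeese() expects
dat_pp <- data.frame(ate = dat$yi, se = sqrt(dat$vi), var = dat$vi)
petpeese(dat_pp)
```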

Part of the heterogeneity across studies may be due to the influence of “moderators.” For example, in a medical trial, results might depend on important differences in subjects (e.g., gender or a pre-existing condition). If these “covariates” are known, we can account for them in our meta-analysis.

In practice, we will then get a coefficient on this variable reflecting how strongly it is associated with the variation in results across studies. We can also recover an estimate of how much of the “total heterogeneity” across studies we have “accounted for” by including this covariate.

Let’s look at an example in R.
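
Here is a sketch of a meta-regression in *metafor*, again using the BCG data from above, with the absolute latitude of the trial site (`ablat`, a column in `dat.bcg`) as a moderator:

```
# meta-regression: add a moderator through the `mods` argument
res_mod <- rma(yi, vi, mods = ~ ablat, data = dat, method = "REML")

# output includes the moderator coefficient and R^2, metafor's estimate of
# the share of heterogeneity accounted for by the moderator
summary(res_mod)
```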

Upside: Higher likelihood of a publication or an impressive research synthesis in your dissertation.

Downside: Much more work, and there may not have been enough studies conducted on a question you are interested in to perform a meta-analysis.