Research and Reporting Methods

Random-Effects Meta-analysis of Inconsistent Effects: A Time for Change

John E. Cornell, PhD; Cynthia D. Mulrow, MD, MSc; Russell Localio, PhD; Catharine B. Stack, PhD, MS; Anne R. Meibohm, PhD; Eliseo Guallar, MD, DrPH; and Steven N. Goodman, MD, PhD

From University of Texas Health Science Center at San Antonio, San Antonio, Texas; University of Pennsylvania and American College of Physicians, Philadelphia, Pennsylvania; Johns Hopkins School of Public Health, Baltimore, Maryland; and Stanford University School of Medicine, Stanford, California.

Potential Conflicts of Interest: Disclosures can be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M13-2886.

Requests for Single Reprints: John E. Cornell, PhD, Department of Epidemiology and Biostatistics, University of Texas Health Science Center at San Antonio, 7703 Merton Minter Boulevard, San Antonio, TX 78229; e-mail, cornell@uthscsa.edu.

Current Author Addresses: Dr. Cornell: Department of Epidemiology and Biostatistics, University of Texas Health Science Center at San Antonio, 7703 Merton Minter Boulevard, San Antonio, TX 78229.

Dr. Mulrow: University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229.

Dr. Localio: Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania, 635 Blockley Hall, 423 Guardian Drive, Philadelphia, PA 19104-6021.

Drs. Stack and Meibohm: American College of Physicians, 190 N. Independence Mall West, Philadelphia, PA 19106.

Dr. Guallar: Welch Center for Prevention, Epidemiology and Clinical Research, Johns Hopkins Medical Institutions, 2024 East Monument Street, Room 2-645, Baltimore, MD 21287.

Dr. Goodman: Stanford University School of Medicine, 259 Campus Drive, T265 Redwood Building/HRP, Stanford, CA 94305.

Author Contributions: Conception and design: J.E. Cornell, C.D. Mulrow, R. Localio, S.N. Goodman.

Analysis and interpretation of the data: J.E. Cornell, R. Localio.

Drafting of the article: J.E. Cornell, C.D. Mulrow, R. Localio, C.B. Stack, A.R. Meibohm, S.N. Goodman.

Critical revision of the article for important intellectual content: J.E. Cornell, C.D. Mulrow, R. Localio, C.B. Stack, A.R. Meibohm, E. Guallar, S.N. Goodman.

Final approval of the article: J.E. Cornell, C.D. Mulrow, R. Localio, C.B. Stack, A.R. Meibohm, E. Guallar, S.N. Goodman.

Statistical expertise: J.E. Cornell, R. Localio, A.R. Meibohm, S.N. Goodman.

Administrative, technical, or logistic support: C.D. Mulrow.

Collection and assembly of data: J.E. Cornell.

Ann Intern Med. 2014;160(4):267-270. doi:10.7326/M13-2886

This article has an additional interactive example appended as a Supplement. Please visit the Supplement tab on this page to access the presentation.

A primary goal of meta-analysis is to improve the estimation of treatment effects by pooling results of similar studies. This article explains how the most widely used method for pooling heterogeneous studies—the DerSimonian–Laird (DL) estimator—can produce biased estimates with falsely high precision. A classic example is presented to show that use of the DL estimator can lead to erroneous conclusions. Particular problems with the DL estimator are discussed, and several alternative methods for summarizing heterogeneous evidence are presented. The authors support replacing universal use of the DL estimator with analyses based on a critical synthesis that recognizes the uncertainty in the evidence, focuses on describing and explaining the probable sources of variation in the evidence, and uses random-effects estimates that provide more accurate confidence limits than the DL estimator.





Heterogeneous evidence from Collins and colleagues’ meta-analysis of the effects of diuretics on preeclampsia (11).

* The metafor package in R was used to compute the fixed-effects estimate and the DerSimonian–Laird random-effects estimate.

† The metafor package in R was used to compute the Knapp–Hartung small-sample adjustments, based on the DerSimonian–Laird estimate.

‡ The small-sample (Skovgaard) estimate from the metaLik package in R was used to compute the profile likelihood estimate. The large-sample profile likelihood estimate produced a narrower CI that indicates a statistically significant effect (95% CI, 0.37 to 0.95).

§ The hierarchical Bayesian estimate was computed using WinBUGS and assumed a vague uniform (10, 10) prior distribution for τ. A sensitivity analysis assuming a vague gamma (0.001, 0.001) prior on precision (1/τ²) produced a slightly smaller but statistically significant 95% CI (0.36 to 0.98).

More Cause for Skepticism?
Posted on February 27, 2014
David Lander
Virginia College of Osteopathic Medicine
Conflict of Interest: None Declared
Congratulations and thanks to Cornell et al on their particularly cogent discussion of the technical problems in calculating the “final answer” in meta-analysis. I have struggled to understand and teach these concepts for years, and their explanations are extremely helpful.

My goal in teaching young doctors about meta-analysis is modest: I want them to be savvy readers, and skeptical ones at that. So, at the risk of oversimplification, I seek useful generalizations. In critically reading a meta-analysis, I am a proponent of the use of common sense, and suggest that the reader can apply their own judgment, to some extent, particularly on the issue of qualitative heterogeneity. When it comes to quantitative heterogeneity and the calculation of the summary numbers, I am frankly happy if the student has understood the basic concepts, and merely appreciates that the particular choice of fixed-effects or random-effects modeling technique in a given meta-analysis is a technical issue, and beyond the understanding of most readers.

In this vein, I would like to ask one question of the authors. It would appear from their examples that the various random-effects models produce quite similar point estimates of the combined effect, and mostly differ in the size of the confidence interval. Would it be fair to consider this as a general principle? Namely, that it is likely that the point estimate of the summary number will not vary greatly depending upon the method chosen but that the CI will? In other words, should the (non-technical) reader of a meta-analysis generally have more skepticism about the CI and less about the estimated effect value? One implication of such a principle would be that when a meta-analysis concludes that an intervention is valuable, but the CI limit is fairly close to the “no-effect” value, we should be quite skeptical if the older calculation methods, such as the DL, have been used.
Heterogeneity: a pearl in the mud
Posted on March 5, 2014
NaNa Keum, Chung-Cheng Hsieh, Nancy Cook
Harvard School of Public Health
Conflict of Interest: None Declared
Meta-analysis, a statistical synthesis of independent but comparable studies, has been widely used to derive a quantitative summary of the available evidence. Currently, the random-effects model based on the DerSimonian-Laird (DL) estimator (1) is generally the standard approach unless there is a strong a priori scientific belief about the homogeneous nature of the study effect. Since results from meta-analyses have great implications for guiding clinical practice and making public health recommendations, there are important considerations for revisiting meta-analysis methodology.

In a recent issue of Annals of Internal Medicine, Cornell et al (2) highlight one problem with this method: the DL random-effects model does not account for uncertainty in estimating tau (i.e., between-study heterogeneity), resulting in summary 95% confidence intervals (CIs) that are too narrow. Thus, Cornell et al recommend alternative methods to construct more accurate 95% CIs. Another criticism of the DL method is that the point estimate from the random-effects model gives disproportionately large weights to small, less reliable studies that are likely to be of lower quality (3). With both the point and interval estimates of this current “gold-standard” approach being called into question, how should we redirect the standard practice in this field?
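
To make the interval problem concrete, the following Python sketch computes a DL pooled estimate and compares its normal-theory standard error with a Knapp-Hartung adjusted one. It is an illustration only: the five log odds ratios and variances are hypothetical (they are not the preeclampsia data), the function name is made up for this sketch, and the t critical value for 4 degrees of freedom (2.776) is entered by hand from a standard table.

```python
import math
from statistics import NormalDist


def dl_with_knapp_hartung(y, v):
    """Return (pooled estimate, tau^2, DL standard error, Knapp-Hartung standard error).

    y: study effects on an additive scale (e.g. log odds ratios)
    v: within-study variances of those effects
    """
    k = len(y)
    w = [1.0 / vi for vi in v]                      # inverse-variance (fixed-effect) weights
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Cochran's Q around the fixed-effect estimate
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)              # DL moment estimate, truncated at 0
    w_re = [1.0 / (vi + tau2) for vi in v]          # random-effects weights
    sw_re = sum(w_re)
    mu_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sw_re
    se_dl = math.sqrt(1.0 / sw_re)                  # treats tau^2 as if known
    # Knapp-Hartung: rescale the variance by the weighted residual spread
    resid = sum(wi * (yi - mu_re) ** 2 for wi, yi in zip(w_re, y))
    se_kh = math.sqrt(resid / ((k - 1) * sw_re))
    return mu_re, tau2, se_dl, se_kh


# Hypothetical log odds ratios and variances for five studies (illustration only).
y = [-1.0, -0.6, -0.1, 0.1, -0.9]
v = [0.20, 0.10, 0.02, 0.03, 0.25]

mu_re, tau2, se_dl, se_kh = dl_with_knapp_hartung(y, v)

z = NormalDist().inv_cdf(0.975)   # normal quantile used by the usual DL interval
t_crit = 2.776                    # t quantile for k - 1 = 4 df, from a table (assumption)

dl_ci = (mu_re - z * se_dl, mu_re + z * se_dl)
kh_ci = (mu_re - t_crit * se_kh, mu_re + t_crit * se_kh)

print("pooled log OR:", round(mu_re, 3), "tau^2:", round(tau2, 3))
print("DL 95% CI (OR scale):", tuple(round(math.exp(x), 2) for x in dl_ci))
print("KH 95% CI (OR scale):", tuple(round(math.exp(x), 2) for x in kh_ci))
```

With these made-up inputs the point estimate is the same under both methods and only the interval changes, which is the pattern the letter describes. The Knapp-Hartung interval is not guaranteed to be wider in every data set.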

The method presented by Shore et al (4) provides some direction. The approach adopts a point estimate directly from the fixed-effects model, weighting studies according to precision, but adjusts its 95% CI by inflating the variance by the ratio of the χ² statistic to its degrees of freedom from the heterogeneity test, taking into account the between-study variation. An additional alternative is a hybrid of fixed- and random-effects models that presents a point estimate from the fixed-effects model and a 95% CI accounting for the uncertainty in estimating tau as in the Knapp-Hartung or other approaches (2). Simulation studies comparing the Shore et al method with such new hybrid methods are warranted to advance the field of meta-analysis.
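
The variance inflation described above can be sketched in a few lines of Python. This is a minimal reading of the description in this letter, not a reproduction of the published procedure: the data are hypothetical, the function name is invented for the sketch, the inflation factor is floored at 1 so that the interval never becomes narrower than the plain fixed-effect interval, and a normal quantile is used for the interval. The last two choices are assumptions.

```python
import math
from statistics import NormalDist


def fixed_effect_with_inflated_variance(y, v):
    """Return (fixed-effect estimate, unadjusted SE, heterogeneity-adjusted SE, inflation factor)."""
    w = [1.0 / vi for vi in v]                      # inverse-variance weights
    sw = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sw  # fixed-effect point estimate
    q = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y))  # heterogeneity chi-square
    df = len(y) - 1
    inflation = max(1.0, q / df)                    # floor at 1 is an assumption of this sketch
    se_fixed = math.sqrt(1.0 / sw)
    se_adjusted = math.sqrt(inflation / sw)         # variance multiplied by Q / df
    return mu, se_fixed, se_adjusted, inflation


# Hypothetical log odds ratios and variances for five studies (illustration only).
y = [-1.0, -0.6, -0.1, 0.1, -0.9]
v = [0.20, 0.10, 0.02, 0.03, 0.25]

mu, se_fixed, se_adjusted, inflation = fixed_effect_with_inflated_variance(y, v)
z = NormalDist().inv_cdf(0.975)

plain_ci = (mu - z * se_fixed, mu + z * se_fixed)
adjusted_ci = (mu - z * se_adjusted, mu + z * se_adjusted)

print("fixed-effect log OR:", round(mu, 3), "inflation factor:", round(inflation, 2))
print("unadjusted 95% CI (OR scale):", tuple(round(math.exp(x), 2) for x in plain_ci))
print("adjusted 95% CI (OR scale):", tuple(round(math.exp(x), 2) for x in adjusted_ci))
```

The point estimate keeps the precision-based weights of the fixed-effect model, so small studies gain no extra influence, while the interval widens in proportion to the observed heterogeneity.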

Above all, it is important to recognize that the ultimate purpose of meta-analysis is not to derive a single quantitative summary, but to identify sources of heterogeneity if they exist. In the presence of true between-study variation beyond random sampling variation, heterogeneity is not something to be muffled statistically but rather to be discovered. Thus, heterogeneity should not be considered as an obstacle in obtaining a single quantitative summary, but rather as an opportunity to find a subgroup of people who might benefit most from the findings.

1 DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986; 7: 177-88.
2 Cornell JE, Mulrow CD, Localio R, Stack CB, Meibohm AR, Guallar E, Goodman SN. Random-effects meta-analysis of inconsistent effects: a time for change. Ann Intern Med. 2014;160:267-70.
3 Peto R, Awasthi S, Read S, Clark S, Bundy D. Vitamin A supplementation in Indian children - Authors' reply. Lancet. 2013; 382: 594-6.
4 Shore RE, Gardner MJ, Pannett B. Ethylene oxide: an assessment of the epidemiological evidence on carcinogenicity. Br J Ind Med. 1993; 50: 971-97.
Author's Response
Posted on June 20, 2014
John Cornell, PhD, Cynthia D. Mulrow, MD, MSc
University of Texas Health Science Center
Conflict of Interest: None Declared
Random effect estimates: A tale of point estimates and confidence intervals
Statistical heterogeneity increases our uncertainty about the magnitude, expected clinical variation, and clinical informativeness of the available evidence for the relative benefits and harms of a medical intervention. Random-effect models estimate the additional uncertainty we have with respect to the expected clinical variation in the relative effectiveness of a treatment. Quantitatively, the added uncertainty increases the variance and produces a wider confidence interval. The comments and questions by Lander and by Keum, Hsieh, and Cook focus on a less discussed issue in meta-analysis: differences in the point estimate for the pooled treatment effect between the fixed- and random-effects models.
Random effect models reweight the individual study estimates, so that the results from smaller studies have a greater influence on the overall estimate. In general, the various random effect models produce fairly similar estimates for the overall odds ratio that are closer to the estimates from the smaller, less precise studies.
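The reweighting can be seen directly by comparing each study's share of the total weight under the two models. The Python sketch below is illustrative only: the within-study variances and the between-study variance (tau^2 = 0.09) are hypothetical values chosen for the example, and the function name is invented here.

```python
def relative_weights(v, tau2=0.0):
    """Share of total weight given to each study when weights are 1 / (v_i + tau^2)."""
    w = [1.0 / (vi + tau2) for vi in v]
    total = sum(w)
    return [wi / total for wi in w]


# Hypothetical within-study variances; larger variance means a smaller, less precise study.
v = [0.20, 0.10, 0.02, 0.03, 0.25]

fixed_shares = relative_weights(v)              # tau^2 = 0 gives fixed-effect weights
random_shares = relative_weights(v, tau2=0.09)  # assumed between-study variance

for vi, fe, re in zip(v, fixed_shares, random_shares):
    print(f"variance {vi:.2f}: fixed {fe:.1%}, random {re:.1%}")
```

Adding the same tau^2 to every study's variance flattens the weights: the least precise study (variance 0.25) gains share and the most precise one (variance 0.02) loses share, which is why the pooled estimate moves toward the smaller studies.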
As pointed out by Keum, Hsieh, and Cook, random-effect models are susceptible to small-study effects. So, the less precise and perhaps lower quality trials exert undue influence on the pooled treatment effect. They suggest we consider a hybrid two-step approach proposed by Armitage (1). A fixed-effects model is used to estimate the overall odds ratio. A heterogeneity-adjusted variance is used to estimate the 95% confidence interval. It is an interesting, ad hoc method that shares some features of the Knapp-Hartung approach. Applying this method to the pre-eclampsia data, the point and interval estimates for the overall odds ratio are OR = 0.67 [95% CI 0.49 to 0.92]. The confidence interval estimates clearly fall somewhere between the DerSimonian-Laird and other random effect confidence intervals.
Small-study effects on the point estimate are amplified when the smaller, less precise studies yield odds ratios that are further away from the null. The greater the asymmetry in the funnel plot, the greater the difference between the fixed and random effect estimates. There is some evidence of asymmetry in the funnel plot for the pre-eclampsia example, though there is little statistical evidence for “publication bias” (Harbord’s regression test, p = 0.564). Nonetheless, it is clear that giving proportionally more weight to the smaller studies can produce odds ratios that suggest a greater benefit for diuretics than does the fixed effect estimate. In this sense the random effect estimate appears less conservative than the fixed effects estimate, though, in this case, the 95% confidence interval is considerably wider and includes 1.0.
The simplicity of Armitage’s hybrid approach is appealing, but clearly we need additional simulation studies to assess whether it provides an adequate accounting for the uncertainty. It is also important to consider the conditions that produce discrepant estimates for the odds ratio. Risk of bias, degree of statistical heterogeneity, and small-study effects all need to be carefully weighed when selecting the most appropriate statistical model to combine the evidence. Keep in mind, however, that the best decision may be to refrain from pooling, provide a critical evaluation of the limitations in the available evidence, and focus on the best available evidence produced by the few larger clinically and methodologically robust trials.
1 Armitage P. Statistical considerations: conclusion. In: Wald NJ, Doll R, eds. Interpretation of negative epidemiological evidence for carcinogenicity. Lyon: International Agency for Research on Cancer; 1985: 190. (IARC Publ No 65.)