Susan L. Norris, MD, MPH, MSc; Lisa Bero, PhD
This article was published at www.annals.org on 20 September 2016.
Disclaimer: Dr. Norris is a staff member of the World Health Organization. Dr. Bero is an employee of the Cochrane Collaboration. The authors alone are responsible for the views expressed in this article, and they do not necessarily represent the views, decisions, or policies of the World Health Organization or the Cochrane Collaboration.
Disclosures: Disclosures can be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M16-1254.
Requests for Single Reprints: Susan L. Norris, MD, MPH, MSc, World Health Organization, Avenue Appia 20, CH-1211, Geneva 27, Switzerland; e-mail, norriss@WHO.int.
Current Author Addresses: Dr. Norris: World Health Organization, Avenue Appia 20, CH-1211, Geneva 27, Switzerland.
Dr. Bero: Charles Perkins Centre Research and Education Hub, University of Sydney, Building D17, 6th Floor, Johns Hopkins Drive, New South Wales 2006, Australia.
Norris S, Bero L. GRADE Methods for Guideline Development: Time to Evolve? Ann Intern Med. 2016. doi:10.7326/M16-1254
Published: Ann Intern Med. 2016.
The GRADE (Grading of Recommendations Assessment, Development and Evaluation) working group was established in 2000 to develop a “common, sensible and transparent approach to grading quality (or certainty) of evidence and strength of recommendations” (1). The GRADE approach consists of 2 basic determinations. Systematic reviewers or guideline developers make an assessment of the certainty in effect estimates for each outcome that is important or critical for decision making. Guideline developers then translate evidence into recommendations using a framework that encompasses the relative values of the outcomes according to persons affected by the recommendations, preferences about interventions, acceptability, feasibility, effect on equity across subpopulations, resource considerations, and other relevant explicit factors.
The GRADE approach has been adopted (sometimes with modifications) by more than 80 entities, and more than 100 publications about GRADE methods are indexed in PubMed. These accomplishments have been achieved through strong leadership and thoughtful contributions from many clinical and public health scientists. There are, however, several challenges and limitations with GRADE and its implementation.
Although 1 of the key strengths of GRADE is that it makes judgments transparent, implementation of the approach can be challenging. Various groups disagree on how to assess a body of evidence and the meaning and suitability of GRADE terminology (2). This assessment comes down to a single determination: high-, moderate-, low-, or very-low-quality evidence. Guideline panels can view this assessment in an oversimplified or overly precise manner, or they can ignore it altogether and base recommendations on other considerations.
The challenges of implementing GRADE affect its reliability. Although persons with extensive experience using GRADE have “substantial” interrater reliability when assessing the quality of a body of evidence (3), experienced evaluators (using a modified GRADE approach) have low interrater reliability when assessing complex bodies of evidence consisting of different study designs (4). An expert panel achieved only fair interrater agreement on the strength of recommendations based on a body of evidence (5).
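To make the notion of interrater reliability concrete, the following is a minimal sketch of Cohen's kappa for two raters who each assign a GRADE level to the same bodies of evidence. It rests on assumptions: the cited studies may have used different agreement statistics, and the function name and example ratings are hypothetical, not taken from those studies. Labels such as "substantial" and "fair" are commonly attached to bands of kappa-type statistics, which is why kappa is used here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: share of items given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement by chance, from each rater's marginal grade frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical grade throughout; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Illustrative ratings only (not data from any cited study).
ratings_a = ["high", "high", "low", "low"]
ratings_b = ["high", "low", "low", "low"]
kappa = cohens_kappa(ratings_a, ratings_b)
```

In this illustration the raters agree on 3 of 4 items, yet half of that agreement would be expected by chance, so kappa is lower than the raw agreement suggests.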
The predictive validity of GRADE assessments of the certainty in effect estimates (the degree to which the assessment of high-, moderate-, low- or very-low-quality evidence predicts whether the effect estimate will change with further research) is reported as limited (6). When the quality of evidence was low, the data predicted that future studies had a smaller effect on the estimate of effect than anticipated by GRADE experts, whereas high-quality evidence was influenced more by new data than anticipated.
Although GRADE methods are evolving (7), they are not currently applicable to many questions that guideline developers face, including those about assessing risk and causality, establishing risk thresholds, or assessing animal studies. Further, GRADE does not provide explicit guidance for complex interventions or when the evidence is linked across a causal pathway, and conceptual frameworks are generally absent. There is only limited GRADE guidance on how to assess the quality of a body of evidence addressing resource use.
The GRADE approach is also challenging to apply to different types of data because it was developed for quantitative data with a pooled estimate and CI for each outcome. When data are qualitative or the outcomes cannot be pooled due to heterogeneity, GRADE must be adapted, although the same framework and elements can still be applied. No GRADE guidance is available on how to assess the quality of data from mathematical models or how to incorporate the results of modeling into the development of recommendations.
Several steps can be taken to strengthen the GRADE framework and approach to advance the science supporting guideline development. Further evaluation by the GRADE working group and independent teams that focuses on reliability and validity of the GRADE criteria for assessing the quality of a body of evidence is a priority. Because GRADE is widely applied, there is now a large cohort of systematic reviews that could be used to compare the quality of a body of evidence and its evolution over time using cumulative meta-analysis in the manner performed by Gartlehner and colleagues (6) or through other novel approaches. The reasons for the reported low predictive validity need to be explored.
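As an illustration of the cumulative meta-analysis mentioned above, the sketch below re-pools the effect estimate each time a study is added in chronological order, so that the stability of the estimate can be followed over time. It is a simplified example under stated assumptions: it uses fixed-effect inverse-variance pooling and a normal-approximation 95% CI, which is not necessarily the method used by Gartlehner and colleagues, and the function name and input format are hypothetical.

```python
import math


def cumulative_meta_analysis(studies):
    """Pool studies cumulatively by year (hypothetical helper).

    studies: iterable of (year, effect_estimate, standard_error) tuples.
    Returns a list of (year, pooled_effect, ci_low, ci_high), one entry per
    study added, using fixed-effect inverse-variance weights (an assumption).
    """
    results = []
    sum_weights = 0.0
    sum_weighted_effects = 0.0

    for year, effect, se in sorted(studies, key=lambda s: s[0]):
        if se <= 0:
            raise ValueError("Standard errors must be positive")
        weight = 1.0 / (se ** 2)  # inverse-variance weight
        sum_weights += weight
        sum_weighted_effects += weight * effect

        pooled = sum_weighted_effects / sum_weights
        pooled_se = math.sqrt(1.0 / sum_weights)
        results.append(
            (year, pooled, pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
        )
    return results


# Illustrative inputs only (not data from any cited study).
example = cumulative_meta_analysis([(2001, 0.5, 0.2), (2000, 0.3, 0.2)])
```

Comparing the pooled estimate at the time a GRADE rating was made with the estimate after later studies are added is one way to examine whether the rating predicted how much the estimate would change.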
The GRADE working group should foster greater diversity of thought by encouraging independent development; testing and publishing new, empirically based methods; and welcoming comparisons of GRADE with approaches that operate with different assumptions or frameworks. A survey of guideline developers' questions and needs might be informative. Given the widespread use of GRADE across various disciplines and the need for language translations, collaboration with editors and communications experts would help to improve the precision of terminology and clarity of messaging. Finally, working group members must disclose and be cognizant of their intellectual competing interests with GRADE.
Over the past 15 years, the GRADE approach has changed the way guidelines are developed and presented by providing a standardized framework and methods with broad uptake and implementation. However, despite the many persons involved, publications produced, and organizations that apply this approach, a considerable amount of work remains. Meeting the needs of future guideline developers and the patients and populations that they will ultimately serve requires critical and independent evaluation of GRADE and other approaches, management of intellectual interests, encouragement of critiques of existing approaches and testing of new ideas, and a willingness to recognize deficiencies in methods and address them. Without these changes, GRADE is not sustainable as a leading approach for developing guidelines.
Jan L. Brozek, Nancy Santesso
September 22, 2016
Conflict of Interest:
JLB and NS declare that we have been members of the GRADE Working Group for several years. We have no competing financial interests related to the subject matter but we have potential personal competing interests as we feel that it has been a privilege to take part in stimulating discussions and being able to learn from so many colleagues. We received no funding in support of this work.
GRADE for Guideline Development: Back to the Future
Drs. Norris and Bero declared that the methods of developing guidelines are deficient and should evolve, particularly one specific approach – GRADE (1). In our observation, some clinical experts share similar concerns. Thus, current and prospective guideline panel members may consider the following clarifications informative.

It is worth considering that said limitations may not necessarily be specific to any particular method of producing evidence-based guidelines, including the GRADE approach. Indeed, they may reflect the current sad state of human ignorance about reasoning based on limited but complex evidence, being doomed to make decisions under uncertainty, etc. This irritating need of always having to consider probabilities, never being entirely certain, is likely to bother us forever or at least until we learn to know the future.

Humanity’s fanfaronade about its achievements makes it easy to forget that most problems have not been worked out yet, solutions are primitive, not applicable or downright non-existent. It seems obvious that there always will be stuff to improve and any current methods of thinking, including methods of developing guidelines, will continue to evolve. Even the famous theory of relativity from the popular icon of genius – Albert Einstein – still has to be further worked on, not to mention the earlier pathetic explanation of gravity offered by Isaac Newton. Despite that, we neither boot them out nor give them a bad press for not being comprehensive or always applicable.

The condition of guideline methods may not be so hopeless – most of the steps forward suggested by Drs. Norris and Bero have already been taken by dedicated individuals, research groups and organizations.
The GRADE group, whose principles have been inclusiveness, independence and open discussion, alone has attracted over 500 individuals from almost every organization developing guidelines who are currently wrestling with issues of environmental, occupational and public health, animal studies, complex interventions, risk factors, prognosis, diagnosis, qualitative information, values and preferences, health equity, new statistical methods and modeling, care pathways, implementation, performance measurement, and rapid guideline development among others (2-6). The GRADE group is just one – countless other colleagues are cracking similar and different problems. We encourage anyone concerned with developing practice guidelines to embrace inescapable limitations and join forces, so that tomorrow we are a little more enlightened and the guidelines themselves are even more useful.

– – –

(1) Norris SL and Bero L. GRADE Methods for Guideline Development: Time to Evolve? Ann Intern Med. Published online 20 September 2016. doi:10.7326/M16-1254
(2) www.gradeworkinggroup.org
(3) Thayer KA, Schünemann HJ. Using GRADE to respond to health questions with different levels of urgency. Environ Int. 2016;92-93:585-9. [PMID: 27126781]
(4) Iorio A, Spencer FA, Falavigna M, Alba C, Lang E et al. Use of GRADE for assessment of evidence about prognosis: rating confidence in estimates of event rates in broad categories of patients. BMJ. 2015;350:h870. [PMID: 25775931]
(5) Lewin S, Glenton C, Munthe-Kaas H, Carlsen B, Colvin CJ et al. Using qualitative evidence in decision making for health and social interventions: an approach to assess confidence in findings from qualitative evidence syntheses (GRADE-CERQual). PLoS Med. 2015;12:e1001895. [PMID: 26506244]
Philipp Dahm, Rebecca L. Morgan, Shahnaz Sultan, M. Hassan Murad, Reem A. Mustafa
US GRADE Network
October 1, 2016
Conflict of Interest:
All authors are longstanding members of the GRADE Working Group and founding members of the US GRADE Network. We have received no support for this work and have no financial conflicts of interests.
The Future of GRADE is Bright
We commend Norris and Bero for raising awareness of the major contributions of the GRADE approach to the methodological advancement and dissemination of evidence-based clinical practice guidelines.(1) While we agree that much work and many challenges lie ahead, we disagree on several important issues and arrive at a much more optimistic outlook for the future. First, we question the applicability of the study by Gartlehner et al. on the reproducibility of the GRADE framework.(2) This study focused on the unique approach used by Evidence-Based Practice centers, was based on systematic review authors with unclear training in GRADE methodology and did not employ GRADEpro software (https://gradepro.org). Other studies have shown that GRADE is reproducible when users, even those without expertise, receive appropriate training.(3) Second, their article paints a picture of intellectual stagnation and complacency; the truth is that the GRADE approach continues to evolve, as witnessed by many recent publications, including those on the Evidence to Decision (EtD) framework as a major milestone.(4, 5) Within the GRADE Working Group, there are more than 19 active project groups, led by different people with matching expertise, which address such diverse topics as the application of GRADE to qualitative research, rapid guidelines, evidence stemming from animal and modeling studies, and complex interventions. The GRADE Working Group continues to maintain an open, informal structure that allows every individual to join the Group, attend its meetings (held at least twice per year) and engage in the discussion about ongoing work and future directions. Interested parties who wish to become members should contact the GRADE Working Group by email (firstname.lastname@example.org) or through the website (http://www.gradeworkinggroup.org). Lastly, conflict of interest disclosures represent an integral part of the beginning of every GRADE meeting, presentation, and publication.
In 2013, we founded the US GRADE Network (http://us.gradeworkinggroup.org) to advance methodological research specifically as it relates to the healthcare environment in this country, to serve as a point of contact for US-based guideline developers, and to provide biannual workshops. We formally acknowledge the contributions of numerous individuals and organizations that have enhanced the GRADE approach and welcome an ongoing intellectual exchange. As a result of these efforts, we are hopeful that the work of the GRADE Working Group will continue to serve as a beacon to the many who are invested in the development and dissemination of transparent and methodologically rigorous evidence-based practice guidelines.

References
1. Norris SL, Bero L. GRADE Methods for Guideline Development: Time to Evolve? Ann Intern Med. 2016.
2. Gartlehner G, Dobrescu A, Evans TS, Bann C, Robinson KA, Reston J, et al. The predictive validity of quality of evidence grades for the stability of effect estimates was low: a meta-epidemiological study. J Clin Epidemiol. 2016;70:52-60.
3. Kumar A, Miladinovic B, Guyatt GH, Schünemann HJ, Djulbegovic B. GRADE guidelines system is reproducible when instructions are clearly operationalized even among the guidelines panel members with limited experience with GRADE. J Clin Epidemiol. 2016;75:115-8.
4. Alonso-Coello P, Schünemann HJ, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, et al. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction. BMJ. 2016;353:i2016.
5. Schünemann HJ, Mustafa R, Brozek J, Santesso N, Alonso-Coello P, Guyatt G. GRADE Guidelines: 16. Development of the GRADE Evidence to Decision (EtD) frameworks for tests in clinical practice and public health. J Clin Epidemiol. 2016:in press.
Copyright © 2016 American College of Physicians. All Rights Reserved.
Print ISSN: 0003-4819 | Online ISSN: 1539-3704