Paul G. Shekelle, MD, PhD; Catherine H. MacLean, MD, PhD; Sally C. Morton, PhD; Neil S. Wenger, MD
Acknowledgments: The authors thank the many RAND investigators whose prior work contributed to the development of the quality assessment methods, in particular, Elizabeth McGlynn, PhD, and the members of the Quality Tools Project. They also thank Rena Hasenfeld for developing and implementing the searches for guidelines and quality indicators.
Grant Support: By a contract from Pfizer Inc. to RAND for this project. Dr. Shekelle is a Senior Research Associate of the Veterans Affairs Health Services Research and Development Service.
Requests for Single Reprints: Paul G. Shekelle, MD, PhD, RAND, 1700 Main Street, Box 2138, Santa Monica, CA 90407-2138; e-mail, firstname.lastname@example.org.
Current Author Addresses: Drs. Shekelle, MacLean, and Morton: 1700 Main Street, Box 2138, Santa Monica, CA 90407-2138.
Dr. Wenger: Department of Medicine, Division of General Internal Medicine and Health Services Research, University of California, Los Angeles, 911 Broxton Plaza, Los Angeles, CA 90095-1736.
Shekelle P., MacLean C., Morton S., Wenger N.; Assessing Care of Vulnerable Elders: Methods for Developing Quality Indicators. Ann Intern Med. 2001;135:647-652. doi: 10.7326/0003-4819-135-8_Part_2-200110161-00003
Published: Ann Intern Med. 2001;135(8_Part_2):647-652.
Quality of care can be measured by using either processes or outcomes. Each method has its strengths and limitations (1). With the concurrence of the Assessing Care of Vulnerable Elders (ACOVE) Policy Advisory Committee, we chose to assess the care of vulnerable elders by using processes rather than outcomes. We did so because 1) processes are a more efficient measure of quality; 2) for most conditions, the medical record contains insufficient information, and too few validated models exist, to adequately adjust outcomes for differences in case mix between providers; and 3) ultimately, processes of care are amenable to direct action by providers.
To be a valid measure of quality, a health care process must be strongly linked to an outcome that is important to patients. Ideally, high-quality published studies would link performance of all such processes to outcomes; however, few health care processes are supported by high-quality evidence (3). Even when a process is supported by strong evidence from randomized clinical trials, the inclusion and exclusion criteria of the clinical trials leave the evidence directly applicable to only a narrow group of patients (4, 5). This is particularly true for vulnerable elders, who are typically excluded from clinical trials (6). Therefore, as we developed the ACOVE quality indicators, we used expert opinion to interpret the available evidence for applicability to vulnerable elders. Our methods entailed a literature review and several levels of expert opinion (Figure), which we explain in detail.
For each ACOVE condition that we selected for quality improvement in vulnerable elders (2), we identified a content expert, who worked as a team with another project member knowledgeable about systematic reviews and quality indicator development. Together, the content expert and project member developed potential quality indicators from existing guidelines, review criteria, and expert opinion.
Because practice guidelines and existing quality indicators are seldom referenced in the traditional scientific databases, we used various search strategies to locate these materials. In addition to searching MEDLINE, we searched the following sources: CONQUEST 1.1 (A Computerized Needs-Oriented Quality Measurement Evaluation System) (7); DEMPAQ: A Project to Develop and Evaluate Methods to Promote Ambulatory Care Quality (8); Directory of Clinical Practice Guidelines (9); Guide to Clinical Preventive Services (10); HEDIS 3.0: Health Plan Employer Data and Information Set (11); National Guideline Clearinghouse (12); National Library of Health Care Indicators (13); and The Medical Outcomes & Guidelines Sourcebook (14).
We also hand searched the tables of contents of all issues of The Journal of the American Medical Association and Medical Care published April through November 1998 for relevant practice guidelines and quality indicators. Furthermore, we requested practice guidelines and quality indicators from the following agencies and organizations: Administration on Aging (AOA); Agency for Health Care Policy and Research (AHCPR)—now known as Agency for Healthcare Research and Quality (AHRQ); Centers for Disease Control and Prevention (CDC); Department of Health and Human Services (DHHS); Department of Veterans Affairs; Foundation for Accountability (FACCT); Health Care Financing Administration (HCFA); HCFA/Connecticut Peer Review Organization; Joint Commission on Accreditation of Healthcare Organizations (JCAHO); National Committee for Quality Assurance (NCQA); and National Institutes of Health (NIH).
By using the quality indicators identified through this process, as well as using expert opinion and existing guidelines, the content expert developed 20 to 30 preliminary quality indicators for further review. Potential indicators were constructed in an IF—THEN—BECAUSE format: IF refers to the clinical characteristics that describe persons eligible for the quality indicator; THEN indicates the actual process that should or should not be performed; and BECAUSE refers to the expected health impact if the indicator is performed. For example, IF a vulnerable elder has heart failure with an ejection fraction of 40% or less, THEN an angiotensin-converting enzyme (ACE) inhibitor should be offered BECAUSE treatment with ACE inhibitors improves longevity.
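The IF-THEN-BECAUSE format described above maps naturally onto a small record type. The following is a minimal sketch, not part of the original study materials: the class name `QualityIndicator` and its field names are hypothetical, chosen only to mirror the three clauses, and the example values are taken from the heart failure indicator quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityIndicator:
    """One potential indicator in IF-THEN-BECAUSE form (hypothetical structure)."""

    condition: str       # ACOVE condition the indicator belongs to
    if_clause: str       # clinical characteristics of eligible persons
    then_clause: str     # process that should or should not be performed
    because_clause: str  # expected health impact if the process is performed

    def as_sentence(self) -> str:
        """Render the indicator in the sentence form used in the text."""
        return (
            f"IF {self.if_clause}, THEN {self.then_clause} "
            f"BECAUSE {self.because_clause}."
        )


# The example indicator given in the text.
ace_inhibitor = QualityIndicator(
    condition="heart failure",
    if_clause=(
        "a vulnerable elder has heart failure with an ejection fraction "
        "of 40% or less"
    ),
    then_clause=(
        "an angiotensin-converting enzyme (ACE) inhibitor should be offered"
    ),
    because_clause="treatment with ACE inhibitors improves longevity",
)

print(ace_inhibitor.as_sentence())
```

Keeping the three clauses as separate fields makes it straightforward to review eligibility (IF) independently of the recommended process (THEN) and its rationale (BECAUSE).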
We circulated this initial set of potential quality indicators to other clinical experts for their review. On the basis of reviewers' comments, we narrowed down the initial set to the 10 to 25 most promising indicators for future development.
Next, we assessed the published evidence supporting a link between the process specified in each quality indicator and patient outcomes. To do so, we performed a systematic review on each quality indicator by using the essence of the Cochrane Collaboration's methods (15)—except that we used a single reviewer to screen and assess studies.
With the assistance of a reference librarian, we electronically searched MEDLINE, EMBASE, The Cochrane Database of Systematic Reviews, HealthSTAR, Ageline, and other specialized databases on a condition-specific basis by using keywords and free-text terms to identify potentially relevant studies. For most conditions, the searches were not restricted by language.
The content expert reviewed the retrieved citations by using a three-step process: First, the titles were reviewed for possibly relevant studies; the abstracts associated with the titles that passed the first round of screening were then reviewed; and, finally, the full articles of abstracts that passed the second round of screening were reviewed. Abstracts and articles were not masked for review. To be accepted, titles and abstracts had to contain information indicating that the full article probably reported evidence on the potential relationship between the process in question and better outcomes in humans. We excluded animal studies, letters, review articles, and other articles that did not report original data.
In reviewing full articles, we gave priority to evidence from studies with the strongest designs that were relevant to the potential quality indicator being examined. In general, this meant that we chose randomized clinical trials for questions about the efficacy or effectiveness of interventions and prospective cohort studies to answer questions about risk or prognosis. We considered such evidence to be direct evidence, and we judged direct evidence in elderly persons to be the strongest level of evidence available.
In the absence of direct evidence in elderly persons, we performed searches for direct evidence in other groups. Indirect evidence—that from less rigorous designs—in elderly or nonelderly persons was reviewed if the available direct evidence was insufficient. When both direct evidence and indirect evidence were lacking, we included the statements of authoritative bodies (for example, specialty societies, National Institutes of Health consensus development conferences, or Agency for Health Care Policy and Research practice guidelines).
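The order of preference among evidence sources described above can be sketched as a simple ranked fallback. This is an illustration only: the tier names and the function `best_available_tier` are hypothetical, and the sketch treats any evidence found in a tier as sufficient, whereas in the study the sufficiency of direct evidence was a matter of reviewer judgment.

```python
from enum import IntEnum
from typing import Iterable, Optional


class EvidenceTier(IntEnum):
    """Evidence sources in order of preference (lower value is preferred).

    Tier names are illustrative labels, not terms defined in the article.
    """

    DIRECT_ELDERLY = 1           # strongest designs, in elderly persons
    DIRECT_OTHER_GROUPS = 2      # strongest designs, in other groups
    INDIRECT = 3                 # less rigorous designs, any age group
    AUTHORITATIVE_STATEMENT = 4  # statements of authoritative bodies


def best_available_tier(found: Iterable[EvidenceTier]) -> Optional[EvidenceTier]:
    """Return the most preferred tier among those found, or None if none."""
    tiers = list(found)
    return min(tiers) if tiers else None


# A search that found only indirect studies and a specialty society statement.
result = best_available_tier(
    [EvidenceTier.AUTHORITATIVE_STATEMENT, EvidenceTier.INDIRECT]
)
print(result.name)
```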
After reviewing all of the relevant articles, the content expert prepared a monograph detailing each of the quality indicators and summaries of the evidence supporting them. This monograph was sent to one or more peer reviewers, who were asked to assess the quality of the monograph according to the following guidelines: 1) Is the review complete in terms of both the proposed quality criteria and the evidence? 2) Is the review fair (that is, is the presentation of the evidence unbiased)? The authors subsequently revised the monographs on the basis of reviewers' comments in a manner analogous to the response to a critique of a journal article: Each comment was addressed in turn, and the author provided an appropriate revision or stated the reason why he or she believed that no revision was indicated.
We convened two multidisciplinary groups, each composed of 12 clinical experts, to interpret the supporting evidence detailed in the monographs and to select quality indicators for further consideration. Each panel of experts considered a separate set of ACOVE conditions. Table 1 lists the members of the panels and the conditions they considered.
To assess the expert opinions of the panelists, we used a modified version of the RAND/UCLA Appropriateness Method (16, 17). In brief, the method entails two rounds of anonymous ratings on a risk–benefit scale ranging from 1 to 9 and a face-to-face group discussion between rounds. Each panelist has equal weight in determining the final ratings. The reproducibility of the RAND/UCLA Appropriateness Method is consistent with that of well-accepted diagnostic tests, such as the interpretation of coronary angiography and screening mammography (18). It also has content, construct, and predictive validity (5, 17, 19, 20).
In this application of the method, we sent each panelist the proposed quality indicators and the relevant condition-specific monographs. We asked the panelists to assess the validity of each proposed indicator on a scale of 1 to 9, in which 1 was “definitely not valid” and 9 was “definitely valid.” We considered an indicator to be valid if 1) adequate scientific evidence or professional consensus supported a link between the process specified by the indicator and a health benefit to the patient; 2) a physician or health plan with high rates of adherence to the indicator would be considered a higher-quality provider; and 3) the physician or health plan influenced a majority of factors that determine adherence to the indicator (such as smoking cessation).
Each panelist was instructed to rate each potential quality indicator for validity and return the ratings to us before the face-to-face meeting. We prepared summaries of these ratings for distribution at the panel meeting. At the face-to-face panel meeting, each quality indicator was discussed in turn; the discussion focused on the evidence (or lack thereof) supporting or refuting the indicator, and the prior rankings of the panelists. Panelists had in front of them the summary of the panel's first-round ratings and a confidential reminder of their own previous rating. Panelists were encouraged to bring any relevant published information that the monographs had omitted, and in the few cases in which panelists did so, the information was discussed in turn. In several cases, the indicators were reworded or otherwise clarified to better fit clinical judgment.
After the discussion, each indicator was rerated for validity. Analysis of the final-round ratings was similar to that used in past applications of the RAND/UCLA Appropriateness Method (17, 21). We used the median panel rating and a measure of dispersion to categorize the validity of the indicators. We considered disagreement among a panel to occur when at least three members of a panel judged the indicator to be in the highest tertile of validity (the indicator received a rating of 7, 8, or 9) and at least three members rated the indicator to be in the lowest tertile of validity (rating of 1, 2, or 3) (17). All indicators with panel disagreement were rejected.
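The disagreement rule can be expressed in a few lines. The sketch below is an illustration under stated assumptions, not the project's analysis code: the function names are hypothetical, "three members" in the lowest tertile is read as "at least three," and the median cutoff of 7 is taken from the description of the indicators that were forwarded to the Clinical Committee.

```python
from statistics import median
from typing import Sequence


def has_disagreement(ratings: Sequence[int], minimum: int = 3) -> bool:
    """True if at least `minimum` ratings fall in each extreme tertile."""
    if any(r < 1 or r > 9 for r in ratings):
        raise ValueError("ratings must be on the 1-to-9 validity scale")
    high = sum(1 for r in ratings if r >= 7)  # highest tertile: 7, 8, 9
    low = sum(1 for r in ratings if r <= 3)   # lowest tertile: 1, 2, 3
    return high >= minimum and low >= minimum


def classify(ratings: Sequence[int], cutoff: float = 7) -> str:
    """Categorize one indicator from its final-round panel ratings."""
    if has_disagreement(ratings):
        return "rejected: disagreement"
    if median(ratings) >= cutoff:
        return "valid: forwarded for committee review"
    return "not rated valid"


# Two hypothetical 12-member panels rating the same indicator.
split_panel = [8, 8, 7, 9, 7, 8, 2, 3, 1, 7, 8, 9]
agreeing_panel = [7, 8, 8, 9, 7, 6, 5, 8, 7, 9, 8, 7]

print(classify(split_panel))
print(classify(agreeing_panel))
```

Note that the split panel in this example has a high median rating but is still rejected, because the disagreement check is applied before the median is considered.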
Because each expert panel considered only a subset of the ACOVE conditions and necessarily focused on individual indicators within those conditions, we sent the entire set of indicators and conditions for yet another round of review, this time focusing on the set of indicators as a whole. We asked our Clinical Committee (2) to review all of the indicators with a median validity rating of 7 or greater (without disagreement) and to assess whether these indicators should be modified or deleted. We also asked the Clinical Committee to briefly reconsider all of the rejected indicators.
The Clinical Committee discussed each indicator in turn at a face-to-face meeting and took explicit yes-or-no votes on any proposed changes to the indicator set. The Clinical Committee made changes to a total of 24 indicators. During the same session, the committee also assessed and ranked the validity of indicators assessing continuity of care; the process for assessing the continuity of care indicators was similar to that used earlier by the two expert panels. (The continuity of care indicators were created at the behest of the ACOVE Policy Advisory Committee after the two expert panel meetings had already taken place.)
The final set of indicators therefore resulted from an initial set that was selected by content experts in the field and subjected to a systematic literature review relating the processes specified in the indicators to patient outcomes. One multidisciplinary group of clinical experts subsequently interpreted the evidence from the literature review for relevance to vulnerable elders, and another such group reviewed the final set of indicators for coherence and content validity.
This process generated 420 potential quality indicators that were presented to two expert panels for validity assessment. The panels and the Clinical Committee accepted 236 of the indicators as valid. Within either panel, members rarely disagreed on the validity of indicators: Panel 1 disagreed on 8 indicators (4.7%), and panel 2 disagreed on 9 indicators (3.8%). Table 2 shows the number of indicators proposed and accepted, according to condition. The number of accepted indicators varied from 6 (for hearing impairment and for falls and mobility disorders) to 17 (for depression).
Table 3 shows the number of accepted indicators by the domain of care covered. Domains were defined as screening, prevention, diagnosis, treatment, follow-up, and continuity. Although the majority of accepted indicators were in the diagnosis and treatment domains, each domain had a minimum of 20 accepted indicators.
Table 4 shows the accepted indicators according to medical intervention. The accepted quality indicators cover 14 categories of intervention, ranging from exercise and the use of assistive devices to more commonly assessed categories, such as physical examinations, medication, and surgery.
Although more than 40% of the proposed indicators were not accepted by the expert panels, this does not imply that these processes do not represent good care. Indicators could be rejected for several reasons, including insufficient evidence linking the proposed indicator with improvements in outcomes; the existence of so many “exceptions to the rule” among vulnerable elders that specifying precisely which patients should or should not be eligible was too difficult; the perception that the data to score the indicator were too difficult to collect; and a desire to keep the number of indicators within manageable limits, meaning some indicators were not included because their expected health benefits were small relative to other accepted indicators.
We present here an explicit method for developing process quality indicators for vulnerable elders based on systematic literature reviews and several levels of expert opinion. Indicators developed with this method covered a range of domains and interventions in medical care. These indicators, which are an explicit product of evidence and opinion, should prove useful for evaluating the care delivered to vulnerable elders.