Original Research

Communicating Data About the Benefits and Harms of Treatment: A Randomized Trial

Steven Woloshin, MD, MS; and Lisa M. Schwartz, MD, MS

From the Veterans Affairs Outcomes Group, White River Junction, Vermont, and The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, New Hampshire.


Note: Drs. Woloshin and Schwartz contributed equally to this study; the order of authorship is arbitrary.

Disclaimer: The views expressed herein do not necessarily represent the views of the Department of Veterans Affairs or the U.S. government.

Acknowledgment: The authors thank Alice Andrews, PhD, The Dartmouth Institute for Health Policy and Clinical Practice, for technical support, and Allison Hirst, MS, and Sally Hopewell, PhD, Center for Statistics in Medicine, University of Oxford, for helpful comments on an earlier draft of the manuscript.

Grant Support: By the Attorney General Consumer and Prescriber Education grant program, the Robert Wood Johnson Pioneer Program, and the National Cancer Institute.

Potential Conflicts of Interest: Disclosures can be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M10-2880.

Reproducible Research Statement: Study protocol: Available at http://clinicaltrials.gov/ct2/show/NCT00950014. Statistical code and data set: Not available.

Requests for Single Reprints: Lisa M. Schwartz, MD, MS, Veterans Affairs Outcomes Group (111B), Department of Veterans Affairs Medical Center, White River Junction, VT 05009.

Current Author Addresses: Drs. Woloshin and Schwartz: Veterans Affairs Outcomes Group (111B), Department of Veterans Affairs Medical Center, White River Junction, VT 05009.

Author Contributions: Conception and design: S. Woloshin, L.M. Schwartz.

Analysis and interpretation of the data: S. Woloshin, L.M. Schwartz.

Drafting of the article: S. Woloshin, L.M. Schwartz.

Critical revision of the article for important intellectual content: S. Woloshin, L.M. Schwartz.

Final approval of the article: S. Woloshin, L.M. Schwartz.

Statistical expertise: S. Woloshin, L.M. Schwartz.

Obtaining of funding: S. Woloshin, L.M. Schwartz.


Ann Intern Med. 2011;155(2):87-96. doi:10.7326/0003-4819-155-2-201107190-00004

Background: Despite limited evidence, it is often asserted that natural frequencies (for example, 2 in 1000) are the best way to communicate absolute risks.

Objective: To compare comprehension of treatment benefit and harm when absolute risks are presented as natural frequencies, percents, or both.

Design: Parallel-group randomized trial with central allocation and masking of investigators to group assignment, conducted through an Internet survey in September 2009. (ClinicalTrials.gov registration number: NCT00950014)

Setting: National sample of U.S. adults randomly selected from a professional survey firm's research panel of about 30 000 households.

Participants: 2944 adults aged 18 years or older (all with complete follow-up).

Intervention: Tables presenting absolute risks in 1 of 5 numeric formats: natural frequency (x in 1000), variable frequency (x in 100, x in 1000, or x in 10 000, as needed to keep the numerator >1), percent, percent plus natural frequency, or percent plus variable frequency.

Measurements: Comprehension as assessed by 18 questions (primary outcome) and judgment of treatment benefit and harm.

Results: The average number of comprehension questions answered correctly was lowest in the variable frequency group and highest in the percent group (13.1 vs. 13.8; difference, 0.7 [95% CI, 0.3 to 1.1]). The proportion of participants who “passed” the comprehension test (≥13 correct answers) was lowest in the natural and variable frequency groups and highest in the percent group (68% vs. 73%; difference, 5 percentage points [CI, 0 to 10 percentage points]). The largest format effect was seen for the 2 questions about absolute differences: the proportion correct in the natural frequency versus percent groups was 43% versus 72% (P < 0.001) and 73% versus 87% (P < 0.001).

Limitation: Even when data were presented in the percent format, one third of participants failed the comprehension test.

Conclusion: Natural frequencies are not the best format for communicating the absolute benefits and harms of treatment. The more succinct percent format resulted in better comprehension: Comprehension was slightly better overall and notably better for absolute differences.

Primary Funding Source: Attorney General Consumer and Prescriber Education grant program, the Robert Wood Johnson Pioneer Program, and the National Cancer Institute.

Editors' Notes
Context

  • Optimal ways of communicating the potential benefits and harms of treatments are not clear.

Contribution

  • This randomized trial involving 2944 adults compared 5 numeric formats for presenting outcomes that could occur with drug treatment. People seemed to best understand information that was presented as a simple percentage. Even with the percent format, however, about one third of participants had difficulty understanding data about benefit and harm risks.

Caution

  • The researchers studied adults who agreed to participate in a survey about 2 hypothetical drug treatments rather than patients facing actual medical decisions.

Implication

  • Presenting probable treatment outcomes in a simple percentage format might improve comprehension.

—The Editors

To make informed decisions, patients need to understand what is likely to happen with and without treatment. It is widely accepted that natural frequencies (for example, 2 in 1000 persons) are the best way to communicate these absolute risks. Major organizations, such as the Cochrane Collaboration (1, 2), the International Patient Decision Aid Standards Collaboration (3), and the Medicines and Healthcare products Regulatory Agency (4) (the United Kingdom's equivalent of the U.S. Food and Drug Administration), all recommend using natural frequencies to present absolute risks.

The evidence behind these recommendations, however, is limited and is extrapolated from studies in a very specialized context: Bayesian probability revisions in diagnostic testing. On the basis of the only randomized trial identified (which included 60 students) (5), a 2004 systematic review recommended natural frequencies over percents (6). A 2009 systematic review (7) also considering only probability revisions did not recommend use of natural frequencies because it identified an additional series of randomized trials refuting the superiority of natural frequencies (8). Most important, the only 2 direct tests of absolute risk formats for communicating treatment effects found small differences favoring percents (9, 10). Because these trials tested only simple, artificial scenarios in highly educated, self-selected convenience samples (people actively seeking information on Harvard University's “Your Cancer Risk” Web site [www.diseaseriskindex.harvard.edu/update/]), the findings might not be true in more typical settings.

We conducted a randomized trial comparing comprehension of the benefits and harms of drugs when absolute risks are presented as natural frequencies, percents, or both. To test the formats among typical people facing typical decisions, we used familiar conditions, presented multiple absolute risks (because treatments have multiple benefits and harms), and recruited a nationally representative sample of U.S. adults.

Study Design

This parallel-group randomized trial, conducted and completed in September 2009, compared numerical formats for presenting absolute risks (Figure 1). Participants were members of a nationally representative research panel who had previously signed privacy statements agreeing to receive e-mail invitations to participate in surveys. Invitations included a link that let them see the study's purpose (our link read, “ways to provide information about prescription drugs”), the time required to complete the survey, and a reminder that participation was voluntary; for the text of the survey, see the Supplement. Participants were allocated in a 1:1 ratio to receive absolute risks in 1 of 5 numeric formats; they were not told about the testing of alternative formats. The Committee for the Protection of Human Subjects at Dartmouth Medical School approved the study. The protocol was registered with ClinicalTrials.gov before recruitment began (NCT00950014).

Setting and Participants

Participants were recruited from a research panel created by Knowledge Networks (Menlo Park, California), a professional survey firm. The panel consists of about 30 000 households recruited by probability methods (random-digit dialing of U.S. residential landlines, supplemented by address-based sampling to capture cell phone–only households). In return for free Internet access and cash incentives (or necessary computer equipment), panelists agreed to receive e-mail invitations to participate in surveys. The ability of the panel to produce nationally representative samples has been demonstrated in head-to-head studies in comparison with random-digit dialing techniques (11–13). The National Science Foundation uses the Knowledge Networks panel for its grant program involving general population experiments (14).

For our study, the survey firm invited a simple random sample to participate (Figure 1). Eligibility criteria (established before recruitment began) were age 18 years or older and ability to complete the survey on a computer (<9% of panel members were excluded because they only had Web TV, which has insufficient resolution to display the images in a readable format). Over the next 2 weeks, nonrespondents received e-mail reminders or an automated telephone call. Of the 4316 persons invited, 2944 (68%) agreed to participate.

Randomization and Intervention

Participants were randomly assigned to 1 of the 5 numeric presentation groups (without stratification or blocking) by using a central computerized random-number generator to ensure allocation concealment. The survey firm programmed the random-number generation and allocation to happen immediately after participants clicked on the link to participate in the survey.
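As an illustration only, simple randomization without stratification or blocking can be sketched as follows. The seed, the function name, and the use of Python's `random` module are assumptions made for this sketch; the article does not describe the survey firm's actual generator.

```python
import random
from collections import Counter

# The 5 numeric format groups named in the article.
FORMATS = [
    "natural frequency",
    "variable frequency",
    "percent",
    "percent plus natural frequency",
    "percent plus variable frequency",
]


def assign_format(rng: random.Random) -> str:
    """Simple randomization: every group is equally likely for every
    participant, with no strata and no blocks."""
    return rng.choice(FORMATS)


# Hypothetical seed, used only so that the sketch is reproducible.
rng = random.Random(2009)

# One independent draw per participant, made when the participant starts.
assignments = [assign_format(rng) for _ in range(2944)]
group_sizes = Counter(assignments)
```

Because each draw is independent, group sizes under this scheme end up close to equal but are not forced to be identical, which is the practical difference from blocked randomization.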

All data were presented in a standard layout (drug facts boxes)—a single-page summary including a narrative description of what the drug is for, along with a data table with absolute risks and differences for the beneficial and harmful outcomes in the drug and placebo groups (15). Participants were shown boxes for 2 hypothetical drugs that are used to treat familiar conditions—a heartburn drug that dramatically reduces a common symptom and a cholesterol drug that reduces uncommon events (the chance of dying of a heart attack) and has a very uncommon side effect (muscle breakdown)—which allowed us to present absolute risks across a broad range of magnitude. To make the boxes realistic, we adapted data from trials of actual drugs (lansoprazole and simvastatin) and masked the drug identities with false names (Paxcid and Questor).

The drug boxes for each study group were identical except for the numerical formats used to express the absolute risks (Figure 2):

Figure 2.
Data presentations in the 3 numeric format groups.

The percent plus natural frequency and variable frequency–alone format groups are not shown but can be constructed on the basis of the numbers in the figure.


  1. Natural frequency: Absolute risks and differences were expressed as whole numbers per 1000 (for example, 20 in 1000). Because natural frequencies are the most commonly recommended format, we used them as the reference group.

  2. Variable frequency: Absolute risks and differences were expressed as frequencies in which the denominator was adjusted so that it is the smallest multiple of 10 necessary to keep the numerator greater than 1. For example, 2% is expressed as 2 in 100, 0.2% is 2 in 1000, and 0.02% is 2 in 10 000. To minimize confusion, denominators varied only between table rows.

  3. Percent: Absolute risks and differences were expressed as percents rounded to whole numbers, unless decimals were needed to see the absolute difference (for example, 3.3% [placebo group] vs. 2.5% [drug group] = 0.8%).

  4. Percent plus natural frequency: Absolute risks were expressed as both percent and a natural frequency (x in 1000). To avoid data overload, absolute differences were expressed only as percents.

  5. Percent plus variable frequency: Absolute risks were expressed as both percent and a variable frequency (as explained above). To avoid data overload, absolute differences were expressed as percents.

Outcomes and Follow-up
Survey

The online survey, which took about 20 minutes to complete, measured comprehension and judgments of drug benefits and harms and the helpfulness of data (screen shots are shown in the Supplement). Because our goal was to test understanding rather than recall, a drug box remained on the screen with each question.

We conducted 2 pilot tests to ensure that the online process worked. The first pilot (42 participants) tested the recruitment and randomization procedures, debugged the survey, and identified questions with high item nonresponse. The second pilot (106 participants) assessed whether questions were understandable by including open-ended responses after the questions (for example, “Please explain why you answered false”). No pretest data were included in the final analyses.

Primary Outcome Measure

The primary outcome was comprehension of data presented in the drug boxes, assessed with 18 questions (9 per drug box); these were a mixture of true-or-false and fill-in-the-blank questions. Some questions involved judging the direction of an effect (for example, true or false: “Paxcid and placebo work equally well at completely relieving heartburn”), and others involved quantifying effects (for example, true or false: “People given Questor were twice as likely to have bothersome muscle aches as people given placebo”). Twelve questions were identical for all 5 format groups. The other 6 questions varied only in that the numeric format in the question matched that of the presentation (for example, “2% more people” in the percent group compared with “20 in 1000 more” in the natural frequency group).

We scored comprehension in 3 ways: mean number of correct answers, the proportion of persons who “passed” the test (the score closest to 70%, or ≥13 correct responses out of 18 questions), and the proportion of persons who got an “A” grade (the score closest to 90%, or ≥16 correct responses out of 18 questions).
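The scoring rule can be sketched as follows. The function and variable names are hypothetical; treating a missing answer as incorrect follows the handling described in the Statistical Analysis section.

```python
N_QUESTIONS = 18


def threshold(target: float, n: int = N_QUESTIONS) -> int:
    """Number of correct answers whose percentage score is closest to target."""
    return min(range(n + 1), key=lambda k: abs(k / n - target))


PASS_MIN = threshold(0.70)  # 13 of 18, about 72%
A_MIN = threshold(0.90)     # 16 of 18, about 89%


def score(answers):
    """Score one participant.

    answers: list of 18 values, each True (correct), False (incorrect),
    or None (missing). Missing answers count as incorrect.
    """
    if len(answers) != N_QUESTIONS:
        raise ValueError("expected 18 answers")
    correct = sum(1 for a in answers if a is True)
    return {
        "correct": correct,
        "passed": correct >= PASS_MIN,
        "a_grade": correct >= A_MIN,
    }


# Example: 13 correct answers and 5 skipped questions is a pass, not an "A".
example = score([True] * 13 + [None] * 5)
```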

Statistical Analysis

We used the results from an earlier study involving similar questions assessing comprehension of drug box data (15) as the basis for our sample size calculations; that study demonstrated that about 80% of persons who were shown drug boxes in the percent plus natural frequency format achieved a “passing” grade. We asserted that a 10–percentage point change in the proportion of persons who passed would be clinically important. To be conservative (that is, to maximize the required sample size), we used pass rates of 80% versus 70% in our calculation. To account for multiple comparisons, we set the 2-sided α value to 0.0125 because comparing each data format against natural frequency required 4 pairwise comparisons (0.05 ÷ 4 = 0.0125). We needed about 550 people in each group to have 90% power to detect a difference of 10 percentage points or greater.
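For readers who want to retrace this calculation, the sketch below uses a standard normal-approximation formula for comparing 2 proportions, with an optional continuity correction. The article does not state which formula or software options were used, so both the formula and the correction are assumptions; the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha, power, continuity=True):
    """Per-group sample size for a 2-sided comparison of 2 proportions.

    Normal approximation with a pooled variance under the null hypothesis;
    the optional continuity correction is an assumption of this sketch.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    d = abs(p1 - p2)
    n = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / d ** 2
    if continuity:
        n = n / 4 * (1 + sqrt(1 + 4 / (n * d))) ** 2
    return ceil(n)


# Bonferroni adjustment for 4 pairwise comparisons against natural frequency.
alpha = 0.05 / 4

n_plain = n_per_group(0.80, 0.70, alpha, 0.90, continuity=False)
n_corrected = n_per_group(0.80, 0.70, alpha, 0.90)
```

Under these assumptions both results fall between roughly 530 and 555 per group, in line with the "about 550" stated above; the corrected figure is the larger of the two.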

For the primary outcome measure (comprehension), missing answers were considered incorrect; for judgments and helpfulness ratings, missing answers were considered to represent “no opinion.” The proportion of missing answers ranged from less than 1% to 5%, and sensitivity analyses (complete case analysis) yielded nearly identical results. The natural frequency group was the reference category for all comparisons. We used t tests for differences in means, 2-sample tests of proportions to test for differences in proportions, and chi-square tests for differences in ordinal variables. All comparisons were 2-sided and were considered statistically significant at P values less than 0.01. All analyses were done in Stata, version 11 (StataCorp, College Station, Texas).

Role of the Funding Source

This study was supported by the Attorney General Consumer and Prescriber Education grant program, the Robert Wood Johnson Pioneer Program, and the National Cancer Institute. The funding sources had no role in study design, data collection and analysis, manuscript preparation, or the decision to submit the manuscript for publication.

Demographic characteristics were similar across the 5 numeric format groups (Table 1). The mean participant age was 47 years (range, 18 to 93 years), and 53% were women. Six percent of participants had less than a high school education; 38% had a college degree or higher. There were no crossovers during the trial.

Table 1. Participant Characteristics

Comprehension of Benefits and Harms

Figure 3 shows that the mean number of comprehension questions answered correctly was lowest in the variable frequency group and highest in the percent group (13.1 vs. 13.8; difference, 0.7 [95% CI, 0.3 to 1.1]; P < 0.001). Comprehension in the natural frequency group was nominally lower than that in the percent group: 13.4 vs. 13.8 correct (difference, 0.4 [CI, 0.1 to 0.8]; P = 0.03). The natural frequency and variable frequency groups had the lowest proportion of persons who passed the comprehension test; the percent group had the highest. The same pattern held for the proportion of participants who got an “A” grade.

Figure 3.
Comprehension in the 5 numeric format groups overall and among participants with low numeracy.

There were 2944 participants overall and 1037 with low numeracy. “Low numeracy” is defined as answering 0 or 1 of the 3 numeracy questions correctly. The error bars represent the upper bound of the 95% CI.


In 4 of the 18 questions, the absolute difference across the formats exceeded 10 percentage points (the threshold that we asserted was important). In the first question—whether serious muscle breakdown is much less common than minor liver inflammation with the cholesterol drug—the variable frequency group did worse than the natural frequency group (44% vs. 58% correct; P < 0.001).

For the other 3 questions in which format mattered the most, comprehension was substantially higher in the percent and both percent plus frequency groups than in both frequency-only groups (Appendix Table). Two questions were about absolute differences (for example, “How many fewer people had a heart attack with the cholesterol drug than with placebo?”). For these 2 questions, the proportion answering correctly for the natural frequency versus the percent group was 43% versus 72% (difference, 29 percentage points [CI, 23 to 34 percentage points]; P < 0.001) and 73% versus 87% (difference, 14 percentage points [CI, 9 to 18 percentage points]; P < 0.001). The third question involved identifying a particular data item.
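The differences and confidence intervals reported for the 2 absolute-difference questions can be approximated with a simple Wald interval. The group sizes are an assumption: the exact numbers per group are not given in the text, so the sketch uses 589 per group (2944 participants divided by 5 groups, rounded). The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(p1, n1, p2, n2, level=0.95):
    """Wald confidence interval for the difference p2 - p1."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p2 - p1
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


N_ASSUMED = 589  # assumption: equal group sizes, 2944 / 5 rounded

# Natural frequency vs. percent group, first absolute-difference question.
d1, lo1, hi1 = diff_ci(0.43, N_ASSUMED, 0.72, N_ASSUMED)

# Natural frequency vs. percent group, second absolute-difference question.
d2, lo2, hi2 = diff_ci(0.73, N_ASSUMED, 0.87, N_ASSUMED)
```

With these inputs the intervals come out close to the published ones; small discrepancies are expected because the inputs are rounded percentages and assumed group sizes.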

Appendix Table. Proportion of Participants Who Correctly Answered the 18 Comprehension Questions

Among the 7 questions about low probability events (those with <1% chance of occurring), comprehension did not differ between the natural frequency and percent groups.

Comprehension, by Numeracy and Education

The finding that comprehension of natural frequency was somewhat lower than comprehension of percents was consistent across levels of numeracy and education. For example, Figure 3 (bottom) shows that the pattern of findings for the 1037 respondents with low numeracy (those who answered ≤1 of 3 numeracy questions correctly [16]) was similar to that in the sample as a whole; as expected, all comprehension measures were lower for this subgroup than in the overall sample. A similar pattern was evident among the 909 people with a high school education or less: The proportion of persons who passed was lower in the natural frequency group than the percent group (53% vs. 63%; difference, 10 percentage points [CI, 0 to 20 percentage points]), but the proportion of persons who got an “A” grade did not differ (18% vs. 17%; difference, −1 percentage point [CI, −9 to 7 percentage points]). The same pattern was also evident among those with the highest level of numeracy and education (those with a postgraduate degree).

Judgments of Benefit and Harm

When the absolute differences were numerically small, the natural frequency group judged benefits and harms as being larger than the percent group did (Table 2). For example, 43% of the natural frequency group said that the side effects of the heartburn drug (expressed as “40 in 1000 more” had diarrhea) were moderate or larger, whereas only 26% of the percent group (in which this information was expressed as “4% more” had diarrhea) came to the same conclusion (P < 0.001). Consistently, fewer persons in the natural frequency group than the percent group thought that the benefits of the heartburn drug were definitely worth the side effects (35% vs. 47%; P = 0.002).

Table 2. Perception of Drug Benefit and Harm in the 5 Numeric Format Groups

Helpfulness Ratings

Each group rated the data similarly in how they helped them understand drug benefit: The proportion that responded “helped me a lot” ranged from 56% in the percent group to 61% in both of the percent plus frequency groups. Ratings of harm data were similar: The proportion that chose “helped me a lot” ranged from 58% in the percent group to 64% in the percent plus variable frequency group. None of the differences was statistically significant.

We found no evidence to support the assertion that natural frequency is the best format for communicating the benefits and harms of treatment. In fact, the percent format had slightly higher comprehension overall and at each level of numeracy and education. The combined percent plus natural frequency format was no better than the percent format alone. Comprehension of the variable frequency format was consistently lowest.

The use of natural frequencies instead of percents to communicate absolute risks has been promoted largely on intuitive and evolutionary grounds (the human mind developed the ability to learn over thousands of years by observing and counting things; in contrast, the science of probability is only a few hundred years old) (5). However, the evidence supporting natural frequencies over percents for communicating to patients (based on 2 systematic reviews [6, 7] and our English-language MEDLINE searches to April 2011) is limited to trials testing a specific skill: the ability to use conditional probabilities when interpreting diagnostic test results. In fact, the 2 trials testing absolute risk formats for communicating treatment effects found small differences favoring percents over natural frequencies (9, 10). Nevertheless, even iconic “evidence-based” organizations have issued guidance promoting the use of natural frequencies over percents for communicating treatment effects.

Our findings challenge such guidance. They also refute the common assumption that percents should be avoided for expressing small probabilities (for example, <1%). We previously made this assumption after repeatedly finding that study participants had the most difficulty converting “1 in 1000” to “0.1%” in our 3-item numeracy test (17)—a finding also observed in the current trial. The fact that comprehension of the 7 questions about low-probability events was the same in the percent group and the 2 natural frequency groups argues that the difficulty in converting between formats reflects trouble with manipulating decimal points rather than a comprehension problem.

Our study also highlights known problems with frequency formats. People get confused when the denominator changes, for example when deciding whether 1 in 130 or 1 in 236 is the larger number (9, 10, 18, 19). We wondered whether limiting denominator changes to orders of magnitude (for example, 100, 1000, and 10 000) and keeping denominators constant within rows of tables would minimize confusion and enhance the ability to discriminate between varying probabilities. Unfortunately, this format was still confusing.

Variable frequency formats may be confusing because the larger number in the denominator means a smaller probability. Another reason for confusion is “denominator neglect” (20): People tend to focus on the numerator of a frequency and ignore the denominator. This problem is best illustrated with the variable frequency format in the cholesterol drug table, where the chance of serious muscle breakdown was 4 in 10 000 and the chance of liver inflammation was 1 in 100. Only 44% of participants in that group correctly identified serious muscle breakdown as the less common event. These incorrect responses probably reflect comparison of numerators (4 vs. 1) without considering the denominators (10 000 vs. 100).
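The arithmetic behind this example can be made explicit. The sketch below contrasts a correct comparison of the 2 risks with a numerator-only comparison, which is one way to model denominator neglect; the function names and the tuple layout are hypothetical.

```python
from fractions import Fraction


def less_common(a, b):
    """Return the label of the outcome with the smaller probability.

    Each risk is a tuple: (label, numerator, denominator).
    """
    return min(a, b, key=lambda r: Fraction(r[1], r[2]))[0]


def numerator_only(a, b):
    """Model of denominator neglect: compare numerators, ignore denominators."""
    return min(a, b, key=lambda r: r[1])[0]


muscle = ("serious muscle breakdown", 4, 10_000)
liver = ("liver inflammation", 1, 100)

correct_answer = less_common(muscle, liver)     # uses 4/10 000 vs. 1/100
neglect_answer = numerator_only(muscle, liver)  # uses 4 vs. 1 only

# The same two risks expressed as percents: 0.04% and 1%.
as_percent = [float(Fraction(n, d) * 100) for _, n, d in (muscle, liver)]
```

Expressed as percents (0.04% vs. 1%), the two risks share a scale, so a direct comparison of the printed numbers gives the right ordering.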

Denominator neglect may cause problems even when the denominator is held constant, as in our natural frequency format (always x in 1000) because it magnifies numerically small effects. For example, the increase in diarrhea with the heartburn drug looked bigger to the natural frequency group (presented as “40 in 1000”) than the percent group (presented as “4%”). Heightened perception of adverse effects may explain why the natural frequency group had less enthusiasm for the heartburn drug than the percent group did. Denominator neglect matters when it leads people away from a good intervention because of a format distortion rather than a balanced weighing of benefits and harms.

In theory, combined formats should be best because they give people options and reinforce understanding by presenting the same data in different ways. We found that combined formats generally worked better than frequency formats alone: Comprehension was higher, and there was no evidence of denominator neglect. However, they worked no better than percents alone, and they are therefore probably not worth the additional visual clutter (they triple the number of values presented).

Our trial has several important strengths. We used a rich set of comprehension questions to assess understanding of both relative and absolute differences (including small and large magnitudes) within and across rows of a complex table. There were no study dropouts (randomization occurred after potential participants agreed to complete the online survey), and item nonresponse was low (≤5%). In contrast to much of the prior research using convenience samples, our participants were recruited from a large national research panel that, by design, can be weighted so that results are nationally representative (that is, they account for the sampling strategy and panel recruitment). Weighting accounts for study nonparticipation and adjusts the demographic characteristics of the panel members to match the U.S. population on age, sex, race, education, region, and metropolitan residence. Weighted and unweighted results were nearly identical. We chose to present unweighted results to preserve the simplicity of the randomized trial. The negligible effect of weighting does suggest that the unweighted results are nationally generalizable.

Our findings should be interpreted in light of several limitations. First, comprehension was tested in a survey rather than in the setting of actual medical decisions. In addition, there is no clear standard for the level of comprehension needed to make an informed decision. That is why we judged comprehension in 3 ways: mean score, proportion of persons who “passed” the test, and the proportion that received an “A” grade. Because setting thresholds is inherently arbitrary, we adapted a familiar external benchmark—school grades (“passing” is >70% correct, and an “A” grade is >90%), a strategy we used previously (21).

Finally, there is room for improvement. Even with the percent format, about one third of participants failed the comprehension test. This may in part reflect that participants were facing hypothetical decisions. Patients facing real decisions might have been more engaged and done better. However, part of the problem undoubtedly reflects a poor understanding of numbers. While it is tempting to conclude that none of the formats is adequate, it is important to consider the complexity of the tasks involved. Participants had to navigate complex tables to find and compare numbers. The National Assessment of Adult Literacy considers such tasks to be among the most difficult that they assess (22). In the 2003 survey, only 53% of respondents could use a simple table to find and compare bank interest rates (requiring either subtracting or dividing 2 numbers). The fact that pass rates in our trial increased directly with education (ranging from 62% for persons with high school education or less to 85% for those with a postgraduate degree) and with numeracy (ranging from 56% for persons with low numeracy to 92% for those with the highest numeracy) highlights that although data formats matter, the main underlying need is for better education. Fortunately, evidence indicates that even a simple educational intervention can help (21). It is also possible that comprehension would improve over time through regular exposure to absolute risks in standardized formats. Leaders in risk communication believe that it is incumbent on policymakers to move in this direction to help the public make decisions in their own interest (23).

People who are trying to communicate data about benefit and harm to the public, patients, physicians, and policymakers must choose a format for absolute risks. Our trial shows that they should avoid variable frequencies and that they should no longer accept the assertion that natural frequencies are the best format. On the basis of our findings, we believe that the percent format is probably best. It is more succinct (requiring one half as many numbers) and slightly better than the natural frequency format, particularly for communicating the most basic data needed to compare treatment effects: absolute differences.


Figures

Figure 2. Data presentations in the 3 numeric format groups.

The percent plus natural frequency and variable frequency–alone format groups are not shown but can be constructed on the basis of the numbers in the figure.

Figure 3. Comprehension in the 5 numeric format groups overall and among participants with low numeracy.

There were 2944 participants overall and 1037 with low numeracy. “Low numeracy” is defined as answering 0 or 1 of the 3 numeracy questions correctly. The error bars represent the upper bound of the 95% CI.

Tables

Table 1. Participant Characteristics
Appendix Table. Proportion of Participants Who Correctly Answered the 18 Comprehension Questions
Table 2. Perception of Drug Benefit and Harm in the 5 Numeric Format Groups

References

Rosenbaum SE, Glenton C, Oxman AD. Summary-of-findings tables in Cochrane reviews improved understanding and rapid retrieval of key information. J Clin Epidemiol. 2010; 63:620-6.

Schünemann HJ, Oxman AD, Higgins JP, Vist GE, Glasziou P, Guyatt GH. Presenting results and “Summary of findings” tables. In: The Cochrane Handbook. 2009. Accessed at www.mrc-bsu.cam.ac.uk/cochrane/handbook/ on 24 November 2010.

Elwyn G, O'Connor A, Stacey D, Volk R, Edwards A, Coulter A, et al. International Patient Decision Aids Standards (IPDAS) Collaboration. Developing a quality criteria framework for patient decision aids: online international Delphi consensus process. BMJ. 2006; 333:417.

Medicines and Healthcare products Regulatory Agency. Guidance on communication of risks and benefits in patient information leaflets. 2005. Accessed at www.mhra.gov.uk/Howweregulate/Medicines/Medicinesregulatorynews/CON049410 on 31 May 2011.

Gigerenzer G, Hoffrage U. How to improve Bayesian reasoning without instruction: frequency formats. Psychol Rev. 1995; 102:684-704.

Trevena LJ, Davey HM, Barratt A, Butow P, Caldwell P. A systematic review on communicating with patients about evidence. J Eval Clin Pract. 2006; 12:13-23.

Visschers VH, Meertens RM, Passchier WW, de Vries NN. Probability information in risk communication: a review of the research literature. Risk Anal. 2009; 29:267-87.

Girotto V, Gonzalez M. Solving probabilistic and statistical problems: a matter of information structure and question form. Cognition. 2001; 78:247-76.

Cuite CL, Weinstein ND, Emmons K, Colditz G. A test of numeric formats for communicating risk probabilities. Med Decis Making. 2008; 28:377-84.

Waters EA, Weinstein ND, Colditz GA, Emmons K. Formats for improving risk communication in medical tradeoff decisions. J Health Commun. 2006; 11:167-82.

Baker LB, Singer S, Wagner T. Validity of the Survey of Health and Internet and Knowledge Network's Panel and Sampling. Stanford: Health Economics Resource Center; 2003. Accessed at www.knowledgenetworks.com/ganp/reviewer-info.html on 31 May 2011.

Chang L, Krosnick J. National surveys via RDD telephone interviewing versus the internet: comparing sample representativeness and response quality. Public Opin Q. 2009; 73:641-78.

Krotki K, Dennis JM. Probability-based survey research on the internet. Presented at the 53rd Conference of the International Statistical Institute, 22–29 August 2001, Seoul, Korea. Accessed at www.knowledgenetworks.com/ganp/docs/ISI-2001-confernce-paper.pdf on 16 May 2011.

Time-sharing Experiments for the Social Sciences (TESS). Accessed at http://tess.experimentcentral.org/ on 13 December 2010.

Schwartz LM, Woloshin S, Welch HG. Using a drug facts box to communicate drug benefits and harms: two randomized trials. Ann Intern Med. 2009; 150:516-27.

Schwartz LM, Woloshin S, Black WC, Welch HG. The role of numeracy in understanding the benefit of screening mammography. Ann Intern Med. 1997; 127:966-72.

Gigerenzer G, Gaissmaier W, Kurz-Milcke E, Schwartz L, Woloshin S. Helping doctors and patients make sense of health statistics. Psychological Science in the Public Interest. 2008; 8:53-96.

Grimes DA, Snively GR. Patients' understanding of medical risks: implications for genetic counseling. Obstet Gynecol. 1999; 93:910-4.

Woloshin S, Schwartz LM, Byram S, Fischhoff B, Welch HG. A new scale for assessing perceptions of chance: a validation study. Med Decis Making. 2000; 20:298-307.

Yamagishi K. When a 12.86% mortality is more dangerous than a 24.14%: implication for risk communication. Appl Cogn Psychol. 1997; 11:495-506.

Woloshin S, Schwartz LM, Welch HG. The effectiveness of a primer to help people understand risk: two randomized trials in distinct populations. Ann Intern Med. 2007; 146:256-65.

National Center for Education Statistics, U.S. Department of Education. National Assessment of Adult Literacy (NAAL). 2003. Accessed at http://nces.ed.gov/naal/index.asp on 9 March 2011.

Fischhoff B. Questions of competence: the duty to inform and limits to choices. In: The Behavioral Foundations of Policy. Princeton: Princeton Univ Pr; 2010.

Comments

Choice of Methods in Presenting Scenarios to Patients
Posted on July 27, 2011
David A. Nardone
Oregon Health & Sciences University
Conflict of Interest: None Declared

After reading the article by Woloshin and Schwartz (1) on communicating risk, I reviewed a similar publication by the same authors (2). I do not disagree with the findings in either study, but I believe the problem is the choice of methods in presenting scenarios to patients. The key is to use concrete and personalized images that give meaning to abstract concepts of benefit and harm (3). Begin by using an appropriate frame of reference. If the denominator is 100, use the number of US Senators; if 1000, cite the number of soldiers in a battalion; if 10,000, use the capacity of the local AAA baseball team's field; if 100,000, refer to the attendance at the annual OSU/Michigan football game. To review comparative data, ask the patient to imagine two separate barrels of ping-pong balls. One barrel is marked with the intervention (Questor) and the other with placebo. In the Questor barrel, 25 ping-pong balls are red (heart attack) and 975 are white (no heart attack). In the placebo barrel, 33 ping-pong balls are red and 967 are white. Repeat the scenarios for side effects. Finally, ask the patient to imagine being blindfolded and reaching into each barrel, and then to assess how much risk they are willing to assume to achieve the stated benefit. For primers in this area, the comparative scenarios can be illustrated in drawings.

David A. Nardone, MD (Retired) Clinical Director Primary Care Staff Physician VHA Medical Center Professor Emeritus Oregon Health & Sciences University 6714 NE Copper Beech Drive Hillsboro, OR 97124-5094

References

1. Woloshin S, Schwartz LM. Communicating Data about the Benefits and Harms of Treatment: A Randomized Trial. Ann Intern Med 2011; 155: 87-96.

2. Woloshin S, Schwartz LM, Welch HG. The Effectiveness of a Primer to Help People Understand Risk. Ann Intern Med 2007; 146: 256-265.

3. Covello VT, Sandman PM, Slovic P. Risk Communication, Risk Statistics, and Risk Comparisons: A Manual for Plant Managers. Chemical Manufacturers Association, 1988, Washington, DC, page 15.

Choice of Methods in Presenting Scenarios to Patients
Posted on October 4, 2011
Steven Woloshin
VA Outcomes Group, White River Jct., VT and the Dartmouth Institute for Health Policy and Clinical P
Conflict of Interest: None Declared

Dr. Nardone's suggestion is intended to help people develop a sense of numbers using concrete images.

For this technique to work, though, it would be important to use images familiar to the target audience. We doubt that most Americans know how many soldiers are in a battalion (we don't), the capacity of the field for the local AAA baseball team (we don't even know if we have a AAA team), or even the number of US Senators (we knew this one).

Even with familiar images, though, there may still be problems. Changing the denominators to accommodate chances of different magnitude may undermine communication. In our trial, people had the most trouble understanding the variable frequency format, in which denominators changed by orders of magnitude (e.g., 100, 1,000, 10,000).
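The difficulty with changing denominators can be illustrated with a short Python sketch. The three risks below are invented for illustration (they are not trial data), and the `rescale` helper is hypothetical. Ranking the outcomes by the numerator a reader sees gives the opposite order from ranking them by actual risk once everything is put on a common denominator.

```python
from fractions import Fraction

# Invented risks in a variable frequency style: each uses a different
# power-of-ten denominator (illustrative numbers, not trial data).
variable_format = {
    "outcome A": (2, 100),
    "outcome B": (15, 1_000),
    "outcome C": (30, 10_000),
}


def rescale(numerator: int, denominator: int, common: int = 10_000) -> Fraction:
    """Express numerator/denominator as a count out of `common`."""
    return Fraction(numerator, denominator) * common


common_counts = {name: rescale(n, d) for name, (n, d) in variable_format.items()}

# Order suggested by the raw numerators alone (largest first).
by_numerator = sorted(variable_format, key=lambda k: variable_format[k][0], reverse=True)
# Order by actual risk, using the common denominator (largest first).
by_risk = sorted(common_counts, key=common_counts.get, reverse=True)

for name in by_risk:
    n, d = variable_format[name]
    print(f"{name}: {n} in {d:,} = {int(common_counts[name])} in 10,000")
print("ranked by numerator:", by_numerator)
print("ranked by risk:     ", by_risk)
```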

While Dr. Nardone's approach may be useful for teaching concepts, it would not be feasible for the kinds of applications we envision: efficiently summarizing the multiple benefits and harms of medical interventions.

Steven Woloshin, MD, MS and Lisa M. Schwartz, MD, MS

VA Outcomes Group, White River Jct., VT and the Dartmouth Institute for Health Policy and Clinical Practice
