Joy Melnikow, MD, MPH; Joshua J. Fenton, MD, MPH; Evelyn P. Whitlock, MD, MPH; Diana L. Miglioretti, PhD; Meghan S. Weyrich, MPH; Jamie H. Thompson, MPH; Kunal Shah
This article was published at www.annals.org on 12 January 2016.
Disclaimer: This review was conducted by the Kaiser Permanente Research Affiliates Evidence-based Practice Center with the University of California Davis Center for Healthcare Policy and Research under contract to AHRQ. AHRQ staff provided oversight for the project and assisted in the external review of the companion draft evidence synthesis. The analytic framework, review questions, and methods for locating and qualifying evidence were posted on the USPSTF Web site for public comment before the review began; final versions reflect public input. The authors of this report are responsible for its content, including any clinical treatment recommendations. No statement in this article should be construed as an official position of AHRQ or the U.S. Department of Health and Human Services.
Acknowledgment: The authors thank the following for their contributions to this project: AHRQ staff; the USPSTF; Joann Elmore, MD, MPH, Elizabeth Rafferty, MD, Jeffrey Tice, MD, Edward Sickles, MD, Barnett Kramer, MD, MH, Gretchen Gierach, PhD, and Gwendolyn Bryant-Smith, MD, who provided expert and federal partner review of the report; Wendie Berg, MD, PhD, and Christiane Kuhl, MD, for providing unpublished subgroup data; and Bruce Abbott, MLS, and Guibo Xing, PhD, at the University of California, Davis.
Financial Support: By AHRQ (contract HHSA-290-2012-00015-I), Rockville, Maryland.
Disclosures: Dr. Melnikow reports a contract with the Agency for Healthcare Research and Quality during the conduct of the study. Dr. Miglioretti reports grants from the Agency for Healthcare Research and Quality and the National Cancer Institute during the conduct of the study. Ms. Weyrich reports grants from the Agency for Healthcare Research and Quality during the conduct of the study. Ms. Thompson reports grants from the Agency for Healthcare Research and Quality during the conduct of the study. Authors not named here have disclosed no conflicts of interest. Disclosures can also be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M15-1789.
Editors' Disclosures: Christine Laine, MD, MPH, Editor in Chief, reports that she has no financial relationships or interests to disclose. Darren B. Taichman, MD, PhD, Executive Deputy Editor, reports that he has no financial relationships or interests to disclose. Cynthia D. Mulrow, MD, MSc, Senior Deputy Editor, reports that she has no relationships or interests to disclose. Deborah Cotton, MD, MPH, Deputy Editor, reports that she has no financial relationships or interests to disclose. Jaya K. Rao, MD, MHS, Deputy Editor, reports that she has stock holdings/options in Eli Lilly and Pfizer. Sankey V. Williams, MD, Deputy Editor, reports that he has no financial relationships or interests to disclose. Catharine B. Stack, PhD, MS, Deputy Editor for Statistics, reports that she has stock holdings in Pfizer.
Requests for Single Reprints: Reprints are available from the AHRQ Web site (www.ahrq.gov).
Current Author Addresses: Drs. Melnikow and Fenton and Ms. Weyrich: Center for Healthcare Policy and Research, University of California, Davis, 2103 Stockton Boulevard, Sacramento, CA 95817.
Dr. Whitlock and Ms. Thompson: Kaiser Permanente Center for Health Research, 3800 North Interstate Avenue, Portland, OR 97227.
Dr. Miglioretti: Department of Public Health Sciences, University of California Davis School of Medicine, One Shields Avenue, Med Sci 1C, Room 145, Davis, CA 95616.
Mr. Shah: Columbia University, 6380 Lerner Hall, 2920 Broadway, New York, NY 10027.
Author Contributions: Conception and design: J. Melnikow, J.J. Fenton, E.P. Whitlock, D.L. Miglioretti, K. Shah.
Analysis and interpretation of the data: J. Melnikow, J.J. Fenton, E.P. Whitlock, D.L. Miglioretti, M.S. Weyrich, J.H. Thompson, K. Shah.
Drafting of the article: J. Melnikow, J.J. Fenton, M.S. Weyrich, J.H. Thompson, K. Shah.
Critical revision of the article for important intellectual content: J. Melnikow, J.J. Fenton, E.P. Whitlock, D.L. Miglioretti, M.S. Weyrich, J.H. Thompson, K. Shah.
Final approval of the article: J. Melnikow, J.J. Fenton, E.P. Whitlock, D.L. Miglioretti, M.S. Weyrich, J.H. Thompson, K. Shah.
Statistical expertise: D.L. Miglioretti.
Obtaining of funding: J. Melnikow, E.P. Whitlock.
Administrative, technical, or logistic support: J. Melnikow, D.L. Miglioretti, M.S. Weyrich, J.H. Thompson, K. Shah.
Collection and assembly of data: J. Melnikow, J.J. Fenton, E.P. Whitlock, D.L. Miglioretti, M.S. Weyrich, J.H. Thompson, K. Shah.
Melnikow J., Fenton J., Whitlock E., Miglioretti D., Weyrich M., Thompson J., Shah K.; Supplemental Screening for Breast Cancer in Women With Dense Breasts: A Systematic Review for the U.S. Preventive Services Task Force. Ann Intern Med. 2016;164:268-278. doi: 10.7326/M15-1789
Published: Ann Intern Med. 2016;164(4):268-278.
Screening mammography has lower sensitivity and specificity in women with dense breasts, who experience higher breast cancer risk.
To perform a systematic review of reproducibility of Breast Imaging Reporting and Data System (BI-RADS) density categorization and test performance and clinical outcomes of supplemental screening with breast ultrasonography, magnetic resonance imaging (MRI), and digital breast tomosynthesis (DBT) in women with dense breasts and negative mammography results.
MEDLINE, PubMed, EMBASE, and Cochrane database from January 2000 to July 2015.
Studies reporting BI-RADS density reproducibility or supplemental screening results for women with dense breasts.
Quality assessment and abstraction of 24 studies from 7 countries; 6 studies were good-quality.
Three good-quality studies reported reproducibility of BI-RADS density; 13% to 19% of women were recategorized between “dense” and “nondense” at subsequent screening. Two good-quality studies reported that sensitivity of ultrasonography for women with negative mammography results ranged from 80% to 83%; specificity, from 86% to 94%; and positive predictive value (PPV), from 3% to 8%. The sensitivity of MRI ranged from 75% to 100%; specificity, from 78% to 94%; and PPV, from 3% to 33% (3 studies). Rates of additional cancer detection with ultrasonography were 4.4 per 1000 examinations (89% to 93% invasive); recall rates were 14%. Use of MRI detected 3.5 to 28.6 additional cancer cases per 1000 examinations (34% to 86% invasive); recall rates were 12% to 24%. Rates of cancer detection with DBT increased by 1.4 to 2.5 per 1000 examinations compared with mammography alone (3 studies). Recall rates ranged from 7% to 11%, compared with 7% to 17% with mammography alone. No studies examined breast cancer outcomes.
Good-quality evidence was sparse. Studies were small and CIs were wide. Definitions of recall were absent or inconsistent.
Density ratings may be recategorized on serial screening mammography. Supplemental screening of women with dense breasts finds additional breast cancer but increases false-positive results. Use of DBT may reduce recall rates. Effects of supplemental screening on breast cancer outcomes remain unclear.
Agency for Healthcare Research and Quality.
Dense breasts are defined by mammographic appearance. The American College of Radiology's (ACR's) Breast Imaging Reporting and Data System (BI-RADS) classifies breasts as almost entirely fatty (BI-RADS category a), scattered areas of fibroglandular density (category b), heterogeneously dense (category c), or extremely dense (category d).
About 27.6 million (43%) women aged 40 to 74 years in the United States have dense breasts; most of these are classified as category c (1). Higher breast density is associated with decreased mammographic sensitivity and specificity and also with increased breast cancer risk. The relative hazard of breast cancer for women with dense breasts ranged from 1.50 (women aged 65 to 74 years) to 1.83 (women aged 40 to 49 years) in an analysis of 1 169 248 women enrolled in the Breast Cancer Surveillance Consortium (unpublished data). Increased breast density has been associated with hormone replacement therapy use, younger age, and lower body mass index (2). Data on breast density and race or ethnicity are limited. In the United States, Asian women have higher breast density (3) but lower than average incidence of breast cancer (4). Increased breast density is not associated with higher breast cancer mortality among women with dense breasts diagnosed with breast cancer, after adjustment for stage and mode of detection (5).
Supplemental breast cancer screening with additional screening modalities has been proposed to improve the early detection of breast cancers. No clinical guidelines explicitly recommend use of supplemental breast cancer screening on women with dense breasts (6–9), but as of September 2015, 24 states had enacted legislation requiring that women be notified of breast density with their mammography results; 9 more states are considering mandatory notification (10) (Appendix Table 1). Most states require specific language distinguishing dense (BI-RADS c and d) from nondense breasts, and 4 states require that insurers cover subsequent examinations and tests for women with dense breasts (11–14). Federal legislation requiring breast density notification is pending (15).
Appendix Table 1. Breast Density Legislation in the United States
This report summarizes a systematic review of current evidence on the reproducibility of BI-RADS breast density determinations and on test performance characteristics and outcomes of supplemental screening of women with dense breasts by using hand-held ultrasonography (HHUS), automated whole-breast ultrasonography (ABUS), breast magnetic resonance imaging (MRI), and digital breast tomosynthesis (DBT). Mandatory reporting laws frame notification of women as dense/nondense, so this review focused on this categorization.
The review protocol included an analytic framework with 4 key questions (KQs) (Appendix Figure 1). Detailed methods, including search strategies, detailed inclusion criteria, and excluded studies, are available in the full evidence report (16).
BI-RADS = Breast Imaging Reporting and Data System; DCIS = ductal carcinoma in situ; KQ = key question; MRI = magnetic resonance imaging.
MEDLINE, PubMed, EMBASE, and the Cochrane Library were searched for relevant English-language studies published between January 2000 and July 2015. We reviewed reference lists from retrieved articles and references suggested by experts.
Two investigators independently reviewed abstracts and full-text articles for inclusion according to predetermined criteria (E.P.W. and J.H.T. for KQ 1, J.M. and J.J.F. for KQs 2 to 4). Included studies examining the reproducibility of BI-RADS breast density categorization focused on asymptomatic women aged 40 years or older undergoing digital or film mammography. Included studies on supplemental screening with HHUS, ABUS, MRI, or DBT reported outcomes for asymptomatic women with dense breasts aged 40 years and older. In studies that focused primarily on women at high risk for breast cancer (including those with preexisting breast cancer or high-risk breast lesions [such as ductal carcinoma in situ, atypical hyperplasia, and lobular carcinoma in situ], BRCA mutations, familial breast cancer syndromes, or previous chest-wall radiation) and studies that included women with nondense breasts, we analyzed the relevant subset when available in the publication or provided by the authors.
A priori inclusion criteria limited studies on BI-RADS reproducibility to fair- or good-quality randomized, controlled trials; cohort studies; or test sets involving multiple blind readings by at least 3 readers. Studies on test performance characteristics and outcomes of supplemental screening modalities were limited to fair- or good-quality randomized, controlled trials; cohort studies; or diagnostic accuracy studies with reference standards applied to all participants. We examined sensitivity, specificity, positive predictive values (PPVs), negative predictive values (NPVs), and available clinical outcomes (including cancer detection rates, recall rates, and biopsy rates). We defined recall as the need for any additional diagnostic testing after supplemental screening, including imaging and biopsy.
Two investigators (E.P.W. and J.H.T. for KQ 1, J.M. and J.J.F. for KQs 2 to 4) critically appraised all included studies independently using the U.S. Preventive Services Task Force's (USPSTF's) design-specific criteria (17), supplemented with the National Institute for Health and Clinical Excellence methodology checklists (18) and the Quality Appraisal Tool for Studies of Diagnostic Reliability (19). According to USPSTF criteria, a good-quality study generally met all prespecified criteria; fair-quality studies did not meet all criteria but had no important limitations. Poor-quality studies had important limitations that could invalidate results (inadequate or biased application of reference standard; population limited to very high-risk patients).
When available or provided by the authors, results of supplemental screening for subgroups of women with dense breasts were extracted; we excluded those with other risk factors for breast cancer. We calculated the sensitivity and specificity of the supplemental breast screening tests for women with negative mammography results. Only cancers detected by the supplemental test after negative mammography results and cancers found at interval follow-up were included. Hence, the values reported represent the sensitivity and specificity for detection of additional cancer in women with negative mammography findings. Similarly, we defined cancer detection rates, recall rates, and biopsy rates to include only those cancer cases, recalls, and biopsies related to supplemental screening after negative results on mammography. Meta-analysis was not performed because there were few good-quality studies.
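The definitions above can be sketched as a small calculation. This is a minimal illustration, assuming a 2x2 table restricted to women with negative mammography results, where a missed cancer is one found at interval follow-up; the class name, function name, and all counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class SupplementalCounts:
    """Hypothetical 2x2 counts among women with negative mammography results."""
    true_pos: int   # supplemental test positive, cancer confirmed
    false_pos: int  # supplemental test positive, no cancer at follow-up
    false_neg: int  # supplemental test negative, interval cancer at follow-up
    true_neg: int   # supplemental test negative, no cancer at follow-up


def performance(c: SupplementalCounts) -> dict:
    """Test performance for detecting additional cancer after negative mammography."""
    total = c.true_pos + c.false_pos + c.false_neg + c.true_neg
    return {
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        "npv": c.true_neg / (c.true_neg + c.false_neg),
        # Additional cancers found by the supplemental test per 1000 examinations
        "detection_per_1000": 1000 * c.true_pos / total,
    }


# Illustrative counts only (2000 examinations in total)
example = SupplementalCounts(true_pos=8, false_pos=192, false_neg=2, true_neg=1798)
result = performance(example)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With a low prevalence of cancer after negative mammography, even a fairly specific test yields a low PPV, which is the pattern the included studies report.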
This research was funded by the Agency for Healthcare Research and Quality (AHRQ) under a contract to support the work of the USPSTF. The investigators worked with USPSTF members to develop and refine the scope, analytic frameworks, and KQs. AHRQ had no role in study selection, quality assessment, synthesis, or development of conclusions. AHRQ provided project oversight; reviewed the draft report; and distributed the draft for peer review, including to representatives of professional societies and federal agencies. AHRQ performed a final review of the manuscript to ensure that the analysis met methodological standards. The investigators are solely responsible for the content and the decision to submit the manuscript for publication.
The literature search yielded 2067 unique citations; 128 full-text articles considered potentially relevant were reviewed to identify 24 unique studies meeting inclusion criteria (Appendix Figure 2). Table 1 (20–43) provides the characteristics of included studies. No studies addressed the effect of supplemental screening (compared with women without supplemental screening) on breast cancer morbidity or mortality.
Summary of evidence search and selection.
KQ = key question.
Table 1. Characteristics of Included Studies
Absent a gold standard for breast density, studies could not evaluate the accuracy of BI-RADS density determinations. Five studies reported repeated assignment of categorical BI-RADS breast density classification by the same or different radiologists, altogether including more than 440 000 women, almost all with data from 2 sequential screening mammograms. To reflect current U.S. practice, we included only studies based on the BI-RADS density categories. The 3 largest studies were set in the United States. Two used data from the Breast Cancer Surveillance Consortium (20, 22), and the third presented findings from community radiologists conducting repeated readings of a large screening test set (24). Two other small studies (not discussed here) were based on mammographic screening programs in Spain (21) and Italy (23). All United States–based studies reflected community practice by use of clinical readings from community screening programs or test set readings by practicing community radiologists without additional training.
Overall, group prevalence of BI-RADS density ratings was similar across initial and subsequent examinations among community radiologists (Appendix Table 2), but there was greater disagreement at the individual level. On subsequent screening examinations, approximately 1 in 5 women (23%) was placed in a different BI-RADS density category (a, b, c, d) by the same radiologist, while approximately 1 in 3 was categorized differently when a different radiologist read the subsequent examination result (Table 2). Considering clinical interpretations that combine categories (“dense” representing those with BI-RADS c or d and “nondense” representing BI-RADS a or b), 13% to 19% of women were reclassified into a different breast density category on their subsequent screening mammogram (Table 2).
Appendix Table 2. Consistency of Breast Imaging Reporting and Data System Density Categories and Population Categorization
Table 2. Potential Misclassification of Breast Imaging Reporting and Data System Density Categorization by Density Categories
These average estimates do not reflect greater extremes seen among outlier radiologists. Among 34 community radiologists reading sequential examination results in the same women (22), readers assigned the same BI-RADS density assessment on both mammograms 77% of the time, on average; however, individual readers' agreement between repeated ratings ranged from 62% to 87% (data not shown). In a study assessing repeat as well as cross-reader assignment of BI-RADS density categories by 19 radiologists in a test set of 341 examinations, radiologists assigned the same BI-RADS density assignment 82% of the time, on average, although individual readers varied from 66% to 95% (24).
In community settings, 19% to 22% of examinations initially classified as dense were subsequently reclassified as nondense, whereas 10% to 16% of initially nondense examinations were reclassified as dense (Table 2). In contrast, initial clinical readings for a test set showed a higher percentage reclassified from nondense to dense than vice versa. Across studies, the most commonly assigned breast density categories (b or c) were also those most likely to be reclassified on subsequent examination (Table 2), representing a clinical reclassification between nondense and dense. Radiologists tended to agree with their own previous assessments of density better than with those made by other readers, although there was substantial variability among pairs of readers due to outliers (more details are in the full report [16]). These results apply most to postmenopausal women or those aged 50 years and older because these women made up 71% to 100% of the study samples.
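The agreement and reclassification measures described in this section can be computed from a cross-tabulation of first and second BI-RADS density ratings. The sketch below is illustrative only: the function name is hypothetical and the counts are invented for the example, not drawn from the included studies.

```python
DENSE = {"c", "d"}  # BI-RADS c and d are "dense"; a and b are "nondense"


def reclassification_summary(counts: dict) -> dict:
    """Summarize paired ratings given {(first, second): number of women}."""
    total = sum(counts.values())
    same_category = sum(n for (first, second), n in counts.items() if first == second)
    dense_first = sum(n for (first, _), n in counts.items() if first in DENSE)
    dense_to_nondense = sum(
        n for (first, second), n in counts.items()
        if first in DENSE and second not in DENSE
    )
    nondense_to_dense = sum(
        n for (first, second), n in counts.items()
        if first not in DENSE and second in DENSE
    )
    return {
        # Same four-level category (a, b, c, d) on both examinations
        "exact_agreement": same_category / total,
        # Crossed the dense/nondense boundary in either direction
        "binary_reclassified": (dense_to_nondense + nondense_to_dense) / total,
        "dense_to_nondense": dense_to_nondense / dense_first,
        "nondense_to_dense": nondense_to_dense / (total - dense_first),
    }


# Invented counts for 100 women with two sequential examinations
paired = {
    ("a", "a"): 8, ("a", "b"): 2,
    ("b", "a"): 2, ("b", "b"): 35, ("b", "c"): 8,
    ("c", "b"): 8, ("c", "c"): 30, ("c", "d"): 2,
    ("d", "c"): 1, ("d", "d"): 4,
}
summary = reclassification_summary(paired)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Changes between a and b or between c and d lower exact agreement without altering the dense/nondense label, which is why the four-category disagreement is larger than the binary reclassification.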
Nine studies reported test performance characteristics for supplemental screening with HHUS, ABUS, and MRI among women with negative mammography results (Table 3 and Appendix Figures 3 and 4). No studies reported test performance characteristics of DBT for women with dense breasts.
Sensitivity of supplemental screening with HHUS, ABUS, and MRI in detecting breast cancer.
These estimates include ductal carcinoma in situ and invasive cancers. ABUS = automated whole-breast ultrasonography; HHUS = hand-held ultrasonography; MRI = magnetic resonance imaging.
* Good-quality study.
Specificity of supplemental screening with HHUS, ABUS, and MRI in detecting breast cancer.
These estimates include ductal carcinoma in situ and invasive cancer. ABUS = automated whole-breast ultrasonography; HHUS = hand-held ultrasonography; MRI = magnetic resonance imaging.
Two good-quality studies (from the United States [25] and Italy [26]) and 3 fair-quality studies (29, 30, 34) reported on HHUS, and 1 fair-quality study from the United States (37) reported on ABUS (Table 3). We found no studies reporting variation in performance of these modalities by patient age and other breast cancer risk factors among women with dense breasts. Both good-quality studies applied consistent reference standards to identify interval cancer and included more than 1000 women. The Italian study included women who self-referred to a charity-funded breast clinic and reported findings separately by breast density category. The U.S. study included only women with dense breasts, but many women also had additional major risk factors. The authors provided data for the subset of women without major risk factors. Additional details on all included studies are found in the full report (16).
Table 3. Test Performance Characteristics for Supplemental HHUS, ABUS, and MRI
Among women with dense breasts after recent negative results on screening mammography, the sensitivity of HHUS in the 2 good-quality studies for detecting all breast cancer (including ductal carcinoma in situ and invasive cancer) ranged from 0.80 (95% CI, 0.65 to 0.91) to 0.83 (CI, 0.59 to 0.96) (25, 26), and specificity ranged from 0.86 (CI, 0.85 to 0.88) to 0.95 (CI, 0.94 to 0.95) (25, 26). Sensitivity and specificity for invasive cancers were similar (25, 26). PPV in the good-quality studies ranged from 0.03 to 0.08; NPV was 0.99 (25, 26). A single fair-quality study found that ABUS had performance characteristics similar to those of HHUS among women with dense breasts and negative mammography results (37).
Three good-quality studies (25, 38, 39) reported test characteristics of supplemental MRI screening (Table 3). These studies included many women with elevated risk for breast cancer. In 2 studies, authors provided us with unpublished data for the subgroup of women with dense breasts, excluding women at very high risk because of BRCA1/2 mutations, chest radiation, or personal histories of breast cancer (25, 39). In both, women had also recently had negative findings on screening with HHUS. The third study included stratified results based on risk factors (38).
Among these subgroups of lower-risk women with dense breasts, the sensitivity of MRI screening (after negative mammography results) for all breast cancer ranged across studies from 0.75 (CI, 0.35 to 0.97) to 1.00 (CI, 0.59 to 1.00) (25, 38, 39). Specificity also varied, ranging from 0.78 (CI, 0.73 to 0.83) (25) to 0.93 (CI, 0.87 to 0.97) (39). PPV ranged from 0.03 to 0.33 and NPVs were 0.99 to 1.00.
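The wide CIs above reflect the small number of cancer cases in each study. This section does not state how the intervals were computed; as an assumption for illustration, the sketch below uses the exact (Clopper-Pearson) binomial interval, a common choice for small samples. The function names and counts are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_decreasing(f, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Bisection for the root of a function that decreases on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson interval for x events in n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2
    lower = 0.0 if x == 0 else _root_of_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p))
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2
    upper = 1.0 if x == n else _root_of_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2
    )
    return lower, upper


# Illustrative only: every one of 7 cancers detected (sensitivity 1.00)
low, high = exact_ci(7, 7)
print(f"7 of 7: {low:.2f} to {high:.2f}")  # lower bound is 0.025 ** (1 / 7), about 0.59

# Illustrative only: 6 of 8 cancers detected (sensitivity 0.75)
low8, high8 = exact_ci(6, 8)
print(f"6 of 8: {low8:.2f} to {high8:.2f}")
```

Even when a test detects every cancer in a study, a handful of cases leaves the lower bound far below 1.00, so such an estimate is compatible with considerably lower true sensitivity.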
In general, supplemental screening after negative results on screening mammography consistently detected additional cases of breast cancer, most of which were invasive. Eighteen studies reported rates of additional cancer detected, and most also reported recall and biopsy rates associated with supplemental screening (Table 4 and Appendix Figure 5). With the possible exception of DBT, supplemental testing led to many additional recalls and biopsies.
Table 4. Breast Cancer Detection Outcomes for Supplemental HHUS, ABUS, MRI, and DBT
Breast cancer detection rates of supplemental screening with HHUS, ABUS, MRI and DBT.
These estimates include ductal carcinoma in situ and invasive cancer. ABUS = automated whole-breast ultrasonography; DBT = digital breast tomosynthesis; HHUS = hand-held ultrasonography; MRI = magnetic resonance imaging.
Seven studies reported HHUS cancer detection rates (27, 28, 31–33), and 3 studies reported on ABUS (35–37). The two good-quality studies of HHUS consistently estimated an all-cancer detection rate after negative mammography findings of 4.4 per 1000 examinations (CI, 2.5 to 7.2) (25, 26), with invasive cancer making up 93% (25) and 88% (26) of detected cancers. In the same women, mammography cancer detection rates were 4.7 per 1000 examinations in the U.S. study (25) and 2.8 per 1000 examinations in the Italian study (26). Only the U.S. study reported the recall rate for supplemental HHUS: 14% (CI, 12.7% to 15.1%) (25).
Three fair-quality studies reported cancer detection rates for ABUS. Cancer detection rates after negative results on mammography ranged from 1.9 to 15.2 per 1000 examinations (36, 37). In comparison, the cancer detection rate from mammography alone in 1 of these studies was 4.3 per 1000 examinations (37). Recall rates varied between the studies from 2% (CI, 1.1% to 2.0%) to 14% (CI, 12.9% to 14.0%) (35, 36).
In 3 good-quality studies of MRI after negative mammography results, breast cancer detection rates varied from 3.5 (CI, 1.3 to 7.6) to 28.6 (CI, 5.9 to 81.2) per 1000 examinations (25, 38, 39), with small numbers of cancer cases detected (range, 2 to 7). In comparison, rates of mammography cancer detection in 2 of these studies for women with dense breasts were 4.1 and 7.0 per 1000 examinations (25, 38). Invasive breast cancer made up 67% and 86% of detected cancer, as reported by 2 studies (25, 39). Notably, women in these studies probably had higher breast cancer risk than the general population of women with dense breasts. A good-quality U.S. study evaluated supplemental HHUS and MRI among 334 women without BRCA mutations or previous breast cancer; after 3 screening rounds with negative mammography and HHUS results over 24 months, screening breast MRI identified 6 additional cases of invasive cancer (25).
Recall rates ranged from 9% (CI, 4.0% to 15.7%) to 23% (CI, 18.9% to 28.3%); the rate was highest in the study with 3 rounds of screening (25, 39). Biopsy rates were not reported separately for subgroups of women without increased risk. Because 2 of the studies reported on only 1 round of screening, the cumulative effect of recall for additional imaging and biopsy would likely increase with additional screening rounds.
Four fair-quality studies of DBT (3 in the United States [41–43] and 1 in Italy [40]) reported on screening populations of women with dense breasts. All U.S. studies were single-site, retrospective studies, and generally focused on outcomes before and after DBT introduction. In 1 study, breast cancer risk among women was described as above average (41); other studies did not report on risk factors (40, 42, 43). Three studies reported cancer detection rates with digital mammography alone ranging from 4.0 to 5.2 per 1000 examinations (40, 42, 43). With DBT, combined detection ranged from 5.4 (CI, 3.5 to 7.9) to 6.9 (CI, 4.8 to 9.6) per 1000 examinations (42, 43). A single study reported that 67% of cancer cases detected with combined DBT and mammography were invasive, the same proportion as with mammography alone (42). Recall rates with DBT in 3 retrospective U.S. studies ranged from 7% (CI, 6.2% to 7.7%) to 11% (CI, 10.0% to 11.7%), compared with 9% (CI, 8.4% to 11.0%) to 17% (CI, 15.0% to 18.2%) with digital mammography alone (41–43).
Only 1 study, a good-quality Canadian randomized, controlled trial, examined the effects of notifying women with normal screening results that their mammograms showed dense breasts (44). Women randomly assigned to the intervention group (n = 285) received a report of their breast density with letters summarizing their mammography results and a pamphlet on breast cancer risk factors, including density. No supplemental screening was recommended. Women randomly assigned to the control group (n = 333) were notified of mammography results without information on breast density. At 4 weeks, more women in the intervention group had statistically significantly increased knowledge of breast density (25% in the intervention group vs. 8% in the control group) and were more likely to perceive themselves as having elevated breast cancer risk. These differences did not persist at 6 months. Psychological distress, breast cancer worry, and preoccupation with breast cancer did not differ between groups.
In studies of supplemental screening with HHUS and ABUS, more than 90% of positive test results were false-positive, and in MRI studies 66% to 97% of all positive test results were false-positives. Although no studies specifically addressed harms of supplemental screening in women with dense breasts, harms stemming from false-positive results are likely to be at least equivalent to those from mammography (45). We found no studies of whether focus on breast density distracts from assessment of other risk factors for breast cancer. Use of gadolinium contrast required for breast MRI has been associated with nephrogenic systemic fibrosis in patients with acute kidney injury or chronic kidney disease, but we found no reports of this adverse effect specifically related to breast MRI. The ACR recommends screening with serum creatinine before administration of gadolinium for those aged 60 years and older with hypertension, diabetes, or history of renal disease (46). Harms from DBT could come from additional breast radiation exposure (40–43, 47).
We examined the consistency of categorical BI-RADS breast density determinations in U.S. community practices because this is the system recommended by the ACR and written into most of the legislative mandates. According to large, community practice–based studies, BI-RADS density assessments at a population level were generally consistent across sequential examinations by the same or different readers, but there was important variability among readings for individual women. Approximately 80% of examinations received a b or c BI-RADS density assessment; these categories were also most likely to be reassessed differently, whether on a separate reading of the same examination or on a subsequent examination, and whether read by the same or a different reader. As a result, across studies a sizeable 13% to 19% of women were reclassified from “nondense” to “dense” or vice versa. In these instances, mandated communications about elevated breast cancer risk or the need for additional clinical screenings could provide inconsistent information for the same woman in the span of 2 to 3 years.
Breast density findings can change because of multiple factors related to the woman being examined, the qualitative nature of the technique, and radiologist variability in interpretation of the examinations. The studies we examined tried to control for within-woman biological factors, suggesting that most of the variation in breast density assessment reflects within- and between-radiologist variability in density interpretation and the limitations of the current BI-RADS approach. Concerns about BI-RADS breast density determinations are a major impetus for research examining other methods for assigning breast density, including automated volumetric estimates, ultrasonographic assessments, and other computer-assisted methods. Although variability is reduced by use of double reading, which is widely practiced in Europe (40), this approach is impractical in the United States because of workforce requirements. The introduction of standards and quality measures related to breast density categorization could help to minimize potential harms associated with variable breast density categorizations.
When combined with mandated direct-to-consumer communications, variability in breast density assignments may lead to unintended consequences. Reclassification from one overall category to another (for example, “dense” to “nondense”) may undermine a woman's confidence in the screening process and leave her uncertain about her risk for breast cancer, whereas the opposite reclassification may alarm women unnecessarily or prompt supplemental screening tests of uncertain value. The ACR has publicly expressed similar cautions about benefits, possible harms, and unintended consequences for the communication of breast density assessments to women (48).
Few studies evaluated test performance of supplemental screening tests for women with dense breasts. In the studies identified, the sensitivity of supplemental MRI screening after negative screening mammography results appeared generally higher than that seen with HHUS screening. However, although we examined subsets of women without specific risk factors, we suspect that, in general, these women were at higher risk. Studies of MRI were small and variable in their sensitivity estimates. No study directly compared sensitivity of supplemental screening modalities among women with dense breasts. Specificity of supplemental screening modalities was similar, and PPV was low. We identified only one study of ABUS and no studies of DBT test performance in women with dense breasts. No studies examined the effects of age or other breast cancer risk factors on supplemental test performance characteristics in women with dense breasts. No studies reported on breast cancer morbidity and mortality outcomes.
Evidence on harms of supplemental screening was also sparse. Added to digital mammography, DBT more than doubles the radiation exposure from each screening examination (49–51). New estimates of cancer induced by radiation from breast imaging have recently been reported (47). Technology that allows reconstruction of the 2-dimensional breast images can reduce radiation exposure but is not widely disseminated (49). We found no reports of adverse effects from use of gadolinium contrast for breast MRI, but a tracking mechanism for this potentially severe, albeit rare, adverse effect should be considered. Potential harms resulting from overdiagnosis of breast cancer through supplemental screening can be identified only through rigorous prospective studies with long-term follow-up.
Our review was limited to studies published in English; studies published in other languages may have met inclusion criteria, although applicability to U.S. practice could be limited. Because of applicability and feasibility concerns, we focused only on BI-RADS breast density assessment. Studies did not examine the underlying reasons for variability in BI-RADS assessment within or between radiologists, nor did they evaluate any interventions to reduce the variability. The number, quality, and rigor of studies of diagnostic test characteristics and clinical outcomes were limited. Most studies lacked a complete reference standard, sufficient follow-up, or a clear description of follow-up, so diagnostic test performance characteristics could not be evaluated. Recall was often not clearly defined. No studies compared interval breast cancer rates, stage at diagnosis, or breast cancer mortality between two groups of women with dense breasts undergoing screening mammography with or without supplemental testing. No studies addressed the important potential risks of overdiagnosis and the associated harms of unnecessary treatment. Many studies included mixtures of women at increased breast cancer risk due to risk factors other than breast density, limiting the generalizability to the general screening population of women with dense breasts. Literature on ABUS and DBT for women with dense breasts was limited, as was literature on the harms of breast density notification. Only 1 comparative study of cohorts with and without supplemental screening adjusted for differences between cohorts (42).
In conclusion, good-quality studies with U.S. radiologists show important reclassification between dense and nondense breasts in women undergoing sequential screening examinations. Reclassification of breast density may introduce confusion or reduce confidence among women. Moving from a “dense” to a “nondense” breast categorization may result in different mandated communications in states with breast density notification, as well as fluctuation in clinical recommendations for supplemental screening.
Limited evidence suggests that more breast cancer cases will be detected by supplemental HHUS and MRI screening of women with dense breasts, and most detected breast cancer cases will be invasive. Studies have not evaluated whether diagnosis of additional breast cancer by supplemental screening leads to improved clinical outcomes or what proportion of the cancer diagnosed represents overdiagnosis. Supplemental testing of women with dense breasts with HHUS or MRI is associated with increased recall rates for diagnostic investigation among women without breast cancer. Use of DBT may be associated with lower recall rates, but studies are few and retrospective. To define meaningful clinical outcomes of supplemental screening of women with dense breasts, well-designed, long-term, prospective, comparative studies of supplemental screening are needed.
Stéphanie V. de Lange MD, Marije F. Bakker PhD, Ruud M. Pijnappel MD PhD, Wouter B. Veldhuis MD PhD, Carla H. van Gils PhD
The Julius Center for Health Sciences and Primary Care (S.V.d.L., M.F.B., C.H.v.G.) and Department of Radiology (R.M.P., W.B.V.), University Medical Center Utrecht, Utrecht, the Netherlands
February 10, 2016
Conflict of Interest:
Stéphanie V. de Lange: disclosed no relevant relationships
Marije F. Bakker: disclosed no relevant relationships
Ruud M. Pijnappel: Activities related to the present article: none to disclose. Activities not related to the present article: none to disclose. Other relationships: is a non-compensated member of the scientific board of Hologic.
Wouter B. Veldhuis: disclosed no relevant relationships
Carla H. van Gils: C. van Gils reports a grant from the European Union’s Seventh Framework Programme (FP7), a grant from Bayer Healthcare and a personal grant from the Dutch Cancer Society during the conduct of the study. She also reports non-financial support from Volpara Solutions.
The DENSE trial is supported by the University Medical Center Utrecht (project number UMCU DENSE), the Netherlands Organization for Health Research and Development (project number ZONMW-200320002-UMCU), the Dutch Cancer Society (project numbers DCS-UU-2009-4348 and UU-2014-6859), the Dutch Pink Ribbon/A Sister's Hope (project number Pink Ribbon-10074), Bayer HealthCare Medical Care (project number BSP-DENSE), and Stichting Kankerpreventie Midden-West. For research purposes, Matakina (Wellington, New Zealand) provided Volpara Imaging Software, version 1.5 for installation on servers in the screening units of the Dutch screening program.
Supplemental breast cancer screening in women with dense breasts
TO THE EDITOR: Melnikow et al. conclude that although supplemental screening of women with dense breasts finds additional breast cancer, the effects on breast cancer outcomes remain unclear (1). The authors mention a lack of comparative studies with interval breast cancer rates, stage at diagnosis, or breast cancer mortality as the outcome. We know of two randomized controlled trials (RCTs) on supplemental ultrasound screening. As mentioned in the accompanying editorial (2), the Japanese multicenter J-START study was probably too recent to have been included. In this RCT, in which approximately 60% of participants had dense breasts, cancer detection rates were higher and interval cancer rates statistically significantly lower in the ultrasound plus mammography group than in the mammography-only group (3). A Chinese RCT (4) included women who were not specifically selected on breast density either, but more than 65% were expected to have dense breasts. In the group with ultrasound and mammography, significantly more breast cancers were detected than in the mammography-only group. No interval tumors were observed in either group, which may be partly explained by the combination of a relatively low overall breast cancer incidence rate and loss to follow-up.
Together with clinicians and researchers from 8 hospitals, we are currently conducting a third RCT, DENSE. This trial investigates the value of additional MRI compared with usual screening practice in women with extremely dense breasts and a negative digital mammography result. A description of its design was published last year (5). It has many of the characteristics called for by Melnikow et al. (1) and Berg (2). Women are included solely on the basis of their breast density. A fully automatic and validated method was used to estimate mammographic density (Volpara Imaging Software; Matakina).
The primary outcome is the difference in interval cancer rates between the two arms, as the best proxy for a difference in breast cancer mortality. Information on molecular phenotypes, cancer stage, and other outcomes is being collected, as well as information on potential harms, including the impact of MRI screening on quality of life. The influence of age and other breast cancer risk factors on the performance of supplemental screening MRI will be examined.
All participants have now been recruited. More than 4,700 MRI examinations have been carried out, and results on interval cancers are expected in 2018. These results will provide comparative evidence on the benefits and harms of supplemental MRI breast cancer screening in women with extremely dense breasts.
References
1. Melnikow J, Fenton JJ, Whitlock EP, Miglioretti DL, Weyrich MS, Thompson JH, et al. Supplemental Screening for Breast Cancer in Women With Dense Breasts: A Systematic Review for the U.S. Preventive Services Task Force. Ann Intern Med [Internet]. 2016;N/A(N/A):N/A – N/A. Available from: http://dx.doi.org/10.7326/M15-1789
2. Berg WA. Supplemental Breast Cancer Screening in Women With Dense Breasts Should Be Offered With Simultaneous Collection of Outcomes Data. Ann Intern Med [Internet]. 2016;(9):9–10. Available from: http://annals.org/article.aspx?doi=10.7326/M15-2977
3. Ohuchi N, Suzuki A, Sobue T, Kawai M, Yamamoto S, Zheng Y-F, et al. Sensitivity and specificity of mammography and adjunctive ultrasonography to screen for breast cancer in the Japan Strategic Anti-cancer Randomized Trial (J-START): a randomised controlled trial. Lancet [Internet]. Elsevier Ltd; 2015;6736(15):1–8. Available from: http://linkinghub.elsevier.com/retrieve/pii/S0140673615007746
4. Shen S, Zhou Y, Xu Y, Zhang B, Duan X, Huang R, et al. A multi-centre randomised trial comparing ultrasound vs mammography for screening breast cancer in high-risk Chinese women. Br J Cancer [Internet]. 2015 Feb 10 [cited 2015 Feb 11];(February):998–1004. Available from: http://www.ncbi.nlm.nih.gov/pubmed/25668012
5. Emaus MJ, Bakker MF, Peeters PH, Loo CE, Lobbes MBI, Pijnappel RM, et al. MR imaging as an additional screening modality for the detection of breast cancer in women aged 50-75 years with extremely dense breasts: the DENSE trial study design. Radiology. 2015;277(2):527–37.
Copyright © 2016 American College of Physicians. All Rights Reserved.
Print ISSN: 0003-4819 | Online ISSN: 1539-3704