Wendy Bruening, PhD; Joann Fontanarosa, PhD; Kelley Tipton, MPH; Jonathan R. Treadwell, PhD; Jason Launders, MSc; Karen Schoelles, MD, SM
Disclaimer: The authors of this report are responsible for its content. Statements in the report should not be construed as endorsements by the Agency for Healthcare Research and Quality or the U.S. Department of Health and Human Services.
Acknowledgment: The ECRI Institute Evidence-based Practice Center thanks Eileen Erinoff, MSLIS, and Helen Dunn for providing literature retrieval and documentation management support and Lydia Dharia and Katherine Donahue for their assistance with the preparation of the manuscript.
Grant Support: This project was supported by the ECRI Institute Evidence-based Practice Center, Plymouth Meeting, Pennsylvania, with funding from the Agency for Healthcare Research and Quality under contract no. 290-02-0019, U.S. Department of Health and Human Services.
Potential Conflicts of Interest: Drs. Bruening, Fontanarosa, and Schoelles and Mr. Launders have disclosed the following: Grants received/pending (money to institution): Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services. Disclosures can also be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M09-1480.
Requests for Single Reprints: Karen Schoelles, MD, SM, ECRI Institute, 5200 Butler Pike, Plymouth Meeting, PA 19462-1298; e-mail, firstname.lastname@example.org.
Current Author Addresses: Drs. Bruening, Fontanarosa, Treadwell, and Schoelles; Ms. Tipton; and Mr. Launders: ECRI Institute, 5200 Butler Pike, Plymouth Meeting, PA 19462-1298.
Author Contributions: Conception and design: W. Bruening, J.R. Treadwell, K. Schoelles.
Analysis and interpretation of the data: W. Bruening, J. Fontanarosa, J.R. Treadwell, K. Schoelles.
Drafting of the article: W. Bruening, J. Fontanarosa.
Critical revision of the article for important intellectual content: J.R. Treadwell, J. Launders, K. Schoelles.
Final approval of the article: W. Bruening, J. Fontanarosa, J.R. Treadwell, K. Schoelles.
Statistical expertise: W. Bruening, J.R. Treadwell, K. Schoelles.
Administrative, technical, or logistic support: W. Bruening.
Collection and assembly of data: J. Fontanarosa, K. Tipton.
Bruening W, Fontanarosa J, Tipton K, Treadwell JR, Launders J, Schoelles K. Systematic Review: Comparative Effectiveness of Core-Needle and Open Surgical Biopsy to Diagnose Breast Lesions. Ann Intern Med. 2010;152:238-246. doi: 10.7326/0003-4819-152-1-201001050-00190
Published: Ann Intern Med. 2010;152(4):238-246.
Most women undergoing breast biopsy are found not to have cancer.
To compare the accuracy and harms of different breast biopsy methods in average-risk women suspected of having breast cancer.
Databases, including MEDLINE and EMBASE, searched from 1990 to September 2009.
Studies that compared core-needle biopsy diagnoses with open surgical diagnoses or clinical follow-up.
Data were abstracted by 1 of 3 researchers and verified by the primary investigator.
33 studies of stereotactic automated gun biopsy; 22 studies of stereotactic-guided, vacuum-assisted biopsy; 16 studies of ultrasonography-guided, automated gun biopsy; 7 studies of ultrasonography-guided, vacuum-assisted biopsy; and 5 studies of freehand automated gun biopsy met the inclusion criteria. Low-strength evidence showed that core-needle biopsies conducted under stereotactic guidance with vacuum assistance distinguished between malignant and benign lesions with an accuracy similar to that of open surgical biopsy. Ultrasonography-guided biopsies were also very accurate. The risk for severe complications is lower with core-needle biopsy than with open surgical procedures (<1% vs. 2% to 10%). Moderate-strength evidence showed that women in whom breast cancer was initially diagnosed by core-needle biopsy were more likely than women with cancer initially diagnosed by open surgical biopsy to be treated with a single surgical procedure (random-effects odds ratio, 13.7 [95% CI, 5.5 to 34.6]).
The strength of evidence was rated low for accuracy outcomes because the studies did not report important details required to assess the risk for bias.
Stereotactic- and ultrasonography-guided core-needle biopsy procedures seem to be almost as accurate as open surgical biopsy, with lower complication rates.
Agency for Healthcare Research and Quality.
There are several different methods of performing breast biopsies.
This systematic review compared open surgical biopsy and core-needle biopsy (CNB) techniques for diagnosing cancer in women with a palpable or nonpalpable breast abnormality. Multiple studies suggested that stereotactic and ultrasonography-guided CNB were almost as accurate as open biopsy and that CNB had a lower risk for complications. Also, women with cancer diagnosed by CNB were more often treated with a single surgical procedure than were women with disease that was initially diagnosed by open biopsy.
Details of the accuracy studies were poorly reported, which made it difficult to evaluate the validity of findings.
Breast cancer is the second most common type of cancer in women, with more than 180 000 new cases diagnosed each year in the United States (1). Women suspected of having breast cancer are usually referred for breast biopsy to determine whether the lesion is benign or malignant and whether further treatment is needed. Most women who are referred for breast biopsy do not have malignant lesions and do not require follow-up treatment (2). Biopsies may be performed by open surgery (excisional or incisional biopsy) or by minimally invasive core-needle methods. Core-needle biopsy (CNB) involves removing small samples of breast tissue through a hollow large-core needle inserted through the skin. Minimally invasive CNB has fewer complications and a shorter recovery time than open surgical biopsy. However, women and clinicians may have concerns about the consequences of misdiagnoses based on inaccurate CNB results that could be avoided by using open surgical biopsy.
There are many different ways to perform CNB in current clinical practice. The suspicious lesion may be located by palpation or by imaging (stereotactic mammography, ultrasonography, or magnetic resonance imaging [MRI]). A device that uses vacuum suction to assist in removing tissue samples is sometimes used. Different sizes of needles may be used, and different numbers of samples may be taken. The relative accuracy of the various CNB methods, and their accuracy compared with open surgical biopsy in diagnosing suspicious breast lesions, is unclear. A sufficiently accurate method of performing minimally invasive CNB may allow many women to avoid surgery and reduce the number of surgical procedures that women with cancer undergo during treatment.
We sought to evaluate the accuracy and safety of breast biopsy methods, determine what factors may affect the accuracy of these methods, and explore their possible harms.
The Agency for Healthcare Research and Quality commissioned this review; a technical report that describes all methods and results, including additional analyses, is published elsewhere (3). We developed and followed a standard protocol for the review. We focused on the use of CNB to evaluate average-risk women with screening-detected suspected primary cancer confined to the breast. Fine-needle aspiration is not considered or discussed.
The topic was nominated in a public process. A technical expert panel provided input on key steps, including selection of the questions to be examined and the protocol for data analysis and interpretation. The draft key questions were finalized after being posted for public comment. This systematic review was commissioned to address the following key questions:
1. In women with a palpable or nonpalpable breast abnormality, what is the accuracy of different types of core-needle breast biopsy compared with open biopsy for diagnosis?
2. In women with a palpable or nonpalpable breast abnormality, what are the harms associated with different types of core-needle breast biopsy compared with open biopsy in the diagnosis of breast cancer?
We searched bibliographic databases, including MEDLINE, EMBASE, the Cochrane Library, and CINAHL, to identify clinical trials and other information published between 1990 and 10 November 2008; the searches of MEDLINE and EMBASE were updated to September 2009 (Appendix Table 1). The major search terms and concepts included (but were not limited to) the following: biopsy, breast biopsy, breast diseases, breast cancer, breast tumor, excision, incisional, large core, Mammotome, needle biopsy, percutaneous biopsy, stereotactic breast biopsy, and surgery. Appendix Table 2 provides a complete list of search terms and strategies.
Appendix Table 1.
Appendix Table 2.
For the analysis of accuracy, we included studies of diagnostic test performance that met the following a priori criteria: direct comparison of CNB with open surgical biopsy or clinical follow-up for 6 months or longer in the same group of patients; enrollment of 10 or more women referred for biopsy for the purpose of primary diagnosis of a breast abnormality; full data reported for at least 50% of the patients originally enrolled in the study (to avoid bias due to excessive or differential attrition); and published as an English-language, full-length, peer-reviewed article. Case–control studies were excluded because they have been shown to overestimate the accuracy of diagnostic tests (4). We excluded studies that used biopsy instruments that are no longer available and studies that enrolled women thought to be at very high risk for breast cancer because of family history, personal history, or BRCA mutations. The Appendix Figure shows the study selection process.
CNB = core-needle biopsy; FNA = fine-needle aspiration; KQ = key question.
Abstracts of articles identified by the literature searches were screened in duplicate for possible relevance by 3 research assistants. The principal investigator approved all exclusions at the abstract level. The full-length articles of studies that seemed relevant at the abstract level were then obtained, and 3 research assistants examined the articles in duplicate to see whether they met the inclusion criteria. The principal investigator resolved any conflicts. A detailed list of the excluded articles and primary reason for exclusion is available on request.
We created standardized data abstraction forms, and each reviewer entered the data into the SRS 4.0 database (Mobius Analytics, Ottawa, Ontario, Canada). One of 3 research assistants abstracted the data for each article, and the principal investigator verified the accuracy of the abstracted data.
One of 3 research assistants rated the internal validity of each of the studies by using a modified version of the Quality Assessment of Diagnostic Accuracy Studies instrument (5). Aspects of internal validity addressed by the instrument include patient recruitment being either consecutive or random, patient inclusion and exclusion criteria applied consistently, freedom from obvious spectrum bias, prospective design, completeness of data reporting, completeness of assessment of patients by the reference standard, accounting for interreader differences, and blinding of the readers and outcome assessors. Appendix Table 3 provides the full list of items.
Appendix Table 3.
We graded the strength of evidence supporting each major conclusion as high, moderate, low, or insufficient. The principal investigator determined the grade for each conclusion after considering various important domains as suggested in the Comparative Effectiveness Review Draft Methods Guide and in accordance with a strength and stability of evidence grading system developed by ECRI Institute (6, 7). Four domains were evaluated: quality (potential risk for bias, or “internal validity”) of the evidence base, precision of the evidence (measured by the CI around the summary estimates), consistency (agreement across studies) of the findings, and robustness of the findings (as determined by sensitivity analysis). We rated the potential for bias as low, moderate, or high by using the instrument for quality of the evidence as described. We rated the domains of precision, consistency, and robustness as either sufficient or insufficient. We addressed applicability of the evidence by excluding studies that enrolled patient populations that were not asymptomatic, normal-risk women participating in routine breast cancer screening programs. The domain of “directness” (whether the evidence demonstrates that the diagnostic test directly affects patient health outcomes) was not incorporated into the grade for diagnostic accuracy outcomes, because the direct impact of diagnostic tests on patient health outcomes is difficult to ascertain from the data reported by diagnostic accuracy studies.
We made 3 key assumptions: 1) the “reference standard,” open surgical biopsy or clinical and radiologic follow-up for at least 6 months, was 100% accurate; 2) the pathologists diagnosing the open surgical biopsy results were 100% accurate in diagnosing the material submitted to them; and 3) CNB diagnoses of cancer (invasive or in situ) that could not be confirmed by an open surgical procedure were assumed to have been correct where the CNB procedure completely removed the lesion (8). In addition, most studies reported data on a per-lesion rather than a per-patient basis; however, few of the enrolled women underwent biopsy of more than 1 lesion. Therefore, we analyzed the data on a per-lesion basis.
We performed 2 primary types of analyses: a standard diagnostic accuracy analysis and an analysis of underestimation rates. For the diagnostic accuracy analysis, true-negative results were defined as lesions diagnosed as benign on CNB that were found to be benign by the reference standard; false-negative results were defined as lesions diagnosed as benign on CNB that were found to be malignant (invasive or in situ) by the reference standard; true-positive results were defined as lesions diagnosed as malignant (invasive or in situ) on CNB and high-risk lesions that were found to be malignant (invasive or in situ) on the reference standard; and false-positive results were defined as lesions diagnosed as high risk (most commonly atypical ductal hyperplasia [ADH] lesions) on CNB that were found not to be malignant (invasive or in situ) by the reference standard.
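The four definitions above can be sketched as a small per-lesion classification routine. This is an illustrative sketch only, not the analysis code used in the review: the function names, category labels, and lesion counts are hypothetical. The sketch also assumes that every malignant CNB result counts as a true positive, which is one reading of the third assumption stated earlier.

```python
from collections import Counter

# Category labels are illustrative, not taken from the review.
BENIGN, HIGH_RISK, MALIGNANT = "benign", "high_risk", "malignant"


def classify(cnb_result, reference_malignant):
    """Map one lesion to TP, FP, TN, or FN.

    cnb_result: BENIGN, HIGH_RISK, or MALIGNANT on core-needle biopsy.
    reference_malignant: True if the reference standard found invasive
    or in situ cancer.
    """
    if cnb_result == BENIGN:
        return "FN" if reference_malignant else "TN"
    if cnb_result == HIGH_RISK:
        return "TP" if reference_malignant else "FP"
    if cnb_result == MALIGNANT:
        # Sketch assumption: CNB cancer diagnoses are treated as correct.
        return "TP"
    raise ValueError(f"unknown CNB result: {cnb_result!r}")


def sensitivity_specificity(counts):
    """Compute sensitivity and specificity from TP/FP/TN/FN counts."""
    sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])
    specificity = counts["TN"] / (counts["TN"] + counts["FP"])
    return sensitivity, specificity


# Hypothetical per-lesion data: (CNB result, reference found cancer).
lesions = (
    [(MALIGNANT, True)] * 40
    + [(HIGH_RISK, True)] * 5
    + [(HIGH_RISK, False)] * 10
    + [(BENIGN, True)] * 1
    + [(BENIGN, False)] * 144
)
counts = Counter(classify(cnb, ref) for cnb, ref in lesions)
sensitivity, specificity = sensitivity_specificity(counts)
print(dict(counts), round(sensitivity, 3), round(specificity, 3))
```

Under these definitions, a high-risk CNB result that proves benign lowers specificity, whereas a benign CNB result that proves malignant lowers sensitivity.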
We meta-analyzed the data reported by the studies by using a bivariate mixed-effects binomial regression model described by Harbord and colleagues (9). All such analyses were computed by STATA, version 10.0 (StataCorp, College Station, Texas), using the “midas” command (10). The summary likelihood ratios and Bayes' theorem were used to calculate the posttest probability of having a benign or malignant lesion. In cases where a bivariate binomial regression model could not be fit (primarily owing to heterogeneity of study results), we meta-analyzed the data by using a random-effects model (11) and Meta-DiSc software (Unit of Clinical Biostatistics, Ramón y Cajal Hospital, Madrid, Spain). We performed meta-regressions with Meta-DiSc and assessed heterogeneity with the I² measure (12, 13).
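The posttest probability calculation mentioned above can be illustrated with the odds form of Bayes' theorem (posttest odds = pretest odds × likelihood ratio). The following is a minimal sketch, not the authors' code; the function name and the input values are hypothetical and are not results from this review.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a posttest probability.

    Uses the odds form of Bayes' theorem:
    posttest odds = pretest odds * likelihood ratio.
    """
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must be between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Hypothetical inputs: a 30% prebiopsy chance of cancer and a negative
# likelihood ratio of 0.02 for a benign CNB result.
pretest = 0.30
negative_lr = 0.02
p_cancer_after_benign_cnb = posttest_probability(pretest, negative_lr)
print(f"Chance of cancer after a benign CNB result: {p_cancer_after_benign_cnb:.4f}")
```

The same function applies to a positive likelihood ratio, in which case the posttest probability rises above the pretest value.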
For the analysis of underestimation rates, lesions that CNB diagnosed as ductal carcinoma in situ (DCIS) but the reference standard found to be invasive were counted as underestimates. Similarly, “high-risk” lesions (most commonly ADH) that the reference standard found to be malignant (in situ or invasive) were counted as underestimates. We calculated the underestimation rate as the number of underestimates per number of DCIS (or “high risk”) diagnoses and expressed this as a percentage (the proportion of DCIS or ADH diagnoses that were underestimates).
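The underestimation rate defined above is a simple proportion. A minimal sketch of the arithmetic follows; the function name and the counts are hypothetical and are not data from the included studies.

```python
def underestimation_rate(underestimates, cnb_diagnoses):
    """Return the underestimation rate as a percentage.

    underestimates: CNB diagnoses (DCIS or high-risk) that the reference
    standard upgraded to a more severe diagnosis.
    cnb_diagnoses: all CNB diagnoses in that category.
    """
    if cnb_diagnoses <= 0:
        raise ValueError("need at least one CNB diagnosis")
    if not 0 <= underestimates <= cnb_diagnoses:
        raise ValueError("underestimates must lie between 0 and cnb_diagnoses")
    return 100.0 * underestimates / cnb_diagnoses


# Hypothetical counts for illustration only.
dcis_rate = underestimation_rate(underestimates=12, cnb_diagnoses=100)
adh_rate = underestimation_rate(underestimates=9, cnb_diagnoses=40)
print(f"DCIS underestimation: {dcis_rate:.1f}%; ADH underestimation: {adh_rate:.1f}%")
```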
We meta-analyzed the underestimation rates and all other outcomes with a random-effects model (11) using CMA software (Biostat, Englewood, New Jersey). We did not assess the possibility of publication bias because statistical methods developed to assess the possibility of publication bias in treatment studies have been shown to be invalid for use with studies of diagnostic accuracy (14, 15).
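To illustrate what random-effects pooling involves, the sketch below implements the DerSimonian-Laird estimator for log odds ratios. The review does not state here which estimator reference 11 describes or how the CMA software was configured, so the choice of DerSimonian-Laird is an assumption of this sketch. The function names and the 2 × 2 counts are hypothetical. The sketch also reports the I² heterogeneity measure, computed from Cochran's Q.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its variance for a 2x2 table [[a, b], [c, d]]."""
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects estimator.

    Returns (pooled effect, standard error, tau squared, I squared in %).
    """
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance tau2.
    re_weights = [1 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    se = math.sqrt(1 / sum_re)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se, tau2, i2


# Hypothetical 2x2 tables, one per study: (a, b, c, d).
tables = [(40, 10, 15, 35), (55, 20, 18, 50), (30, 5, 12, 28)]
effects, variances = zip(*(log_odds_ratio(*t) for t in tables))
pooled, se, tau2, i2 = dersimonian_laird(effects, variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"Pooled OR {math.exp(pooled):.1f} "
      f"(95% CI, {math.exp(low):.1f} to {math.exp(high):.1f}); I2 = {i2:.0f}%")
```

Pooling is done on the log scale and then exponentiated, which is why confidence intervals for odds ratios are asymmetric around the point estimate.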
This project was funded by the Agency for Healthcare Research and Quality. The funding source was involved in developing the key questions and objectives of the systematic review and provided copyright release for the manuscript but did not participate in the literature search, data selection, abstraction, analysis, or interpretation of findings.
Our literature searches identified 107 studies of 57 088 breast lesions that met the inclusion criteria (Appendix Figure). All were diagnostic cohort studies that enrolled women found to have suspicious breast abnormalities on routine screening (mammography or physical examination). The women were sent for various types of breast biopsies, and the accuracy of the breast biopsy was determined by comparing the results of breast biopsy with the results of open surgery or patient follow-up. We rated the evidence as being of overall low quality (greater potential for bias) in part because of poor reporting of study and patient details (Figure). Appendix Table 4 describes these studies (16–127).
These 5 study quality measures are, in the authors' judgment, highly important for reducing the risk for bias when addressing the key questions of this review. The measures are listed in order of importance (as judged by the authors) from top to bottom. “Reported sufficient relevant clinical information” refers to whether the study reported sufficient information about the study design, patient selection and characteristics, breast lesion characteristics, and biopsy methods to fully address the key questions and fully assess the potential for bias in the study design. “Index test results blinded” refers to whether readers of the reference standard were aware of biopsy results. “Differential verification bias avoided” refers to whether the reference standard was chosen without regard to biopsy results. “Representative spectrum enrolled” refers to whether the enrolled patient population resembles the “usual” patient population seen in clinical practice. “Avoided selection bias” refers to whether the study clearly enrolled all or consecutive patients by applying consistent inclusion and exclusion criteria.
Appendix Table 4.
Thirty-three studies of 7153 biopsies used stereotactic guidance and an automated biopsy gun (16–48); 22 studies of 7512 biopsies used stereotactic guidance and a vacuum-assisted device to perform CNB (27, 31, 49–68); 7 studies of 507 biopsies used ultrasonographic guidance and a vacuum-assisted device to perform breast biopsies (69–75); and 5 studies of 610 biopsies reported data on the accuracy of nonguided (freehand) CNB performed with automated biopsy gun devices (76–80). We fit bivariate binomial models to the accuracy data reported by the studies of each type of CNB. In addition, 16 studies of 7124 biopsies used ultrasonographic guidance and an automated biopsy gun to perform CNB (73, 81–95). We did not fit a bivariate binomial model to this data set because of heterogeneity; instead, we used random-effects models to pool the reported diagnostic accuracy. In addition, we found 1 eligible study that reported data on the accuracy of CNB performed with automated biopsy guns guided by a perforated compression grid (96) and 1 eligible study that reported data on the accuracy of MRI-guided CNB performed with automated biopsy guns (97). An additional 24 studies that met the inclusion criteria used multiple CNB methods and did not report the data separately for different biopsy methods (98–121).
We did not identify any clinical studies of open surgical biopsy published since 1990 that met our inclusion criteria. We identified an article from 1998 that reviewed the accuracy of open surgical biopsy (122). Antley and colleagues reviewed the available information (published literature and patient charts available in the authors' medical center) on the accuracy of open surgical biopsy and concluded that open surgical biopsy missed 1% to 2% of cases of breast cancer (sensitivity ≥98%). The investigators based this estimate on a review of archived open biopsy material by a second pathologist, a chart review of current cases, a published study of cases of benign results on biopsy after a very suspicious mammogram, and expert opinion (123–125).
We did not identify information on estimates of underestimation rates for open surgical biopsy. However, underestimation is generally believed to be due to failure to sample all important areas of a lesion. For example, a lesion may contain a focus of carcinoma within a cluster of atypical cells. Core-needle techniques may fail to sample areas with carcinoma cells, leading to underestimation. Because open surgical biopsy samples most or all of the lesion, in theory underestimation should not occur. Therefore, we assumed that open surgical biopsy has an underestimation rate of 0 or close to 0.
Table 1 shows results of our analyses on the accuracy of different breast biopsy methods. For our conclusions on accuracy, we graded the strength of the supporting evidence as low because of the low quality of the evidence base, although we rated the quantity, consistency, and robustness as sufficient.
We recorded the complications and harms reported by the 107 studies that met the inclusion criteria (Table 2). Severe complications after CNB are very rare, occurring in fewer than 1% of procedures. Vacuum-assisted procedures may be associated with slightly more severe bleeding events than automated gun procedures. The strength of evidence supporting the quantitative estimates of the frequency of complications is low.
A particularly important finding from data reported by 31 studies was that women with breast cancer diagnosed by CNB could usually be treated with a single surgical procedure, but women with breast cancer diagnosed by open surgical biopsy often required more than 1 surgical procedure (random-effects odds ratio, 13.7 [95% CI, 5.5 to 34.6]). Because of the consistency, robustness, and great strength of association between type of biopsy and the requirement for more than 1 surgery for treatment, we rated the strength of evidence supporting this conclusion as moderate.
We also performed several meta-regressions to identify factors that affect the accuracy and harms of CNB (Appendix Tables 5 and 6). Use of image guidance and vacuum assistance improved the accuracy of CNB; however, vacuum assistance increased the percentage of procedures complicated by severe bleeding and hematoma formation. Performing biopsies with patients seated upright increased the incidence of vasovagal reactions.
Appendix Table 5.
Appendix Table 6.
Our meta-regressions did not identify a statistically significant effect of the following factors on the results: needle size, method of verification of biopsy (open surgery, open surgery and at least 6 months' follow-up, or open surgery and at least 2 years' follow-up), whether the studies were conducted at a single center or at multiple centers, whether the studies were conducted in general hospitals or dedicated cancer clinics, or the country in which the study was conducted. The studies reported insufficient information about lesion characteristics, patient characteristics, or the training or experience of the persons performing the biopsies to explore the effect of such factors on the accuracy or harms of the biopsies.
When making decisions about what type of biopsy to use, women and their health care providers must weigh the pros and cons of each biopsy type. The location and type of lesion, as well as other medical considerations, sometimes dictate the type of biopsy, but in many cases, patient preference is the most important factor in the decision. Open surgical biopsy is highly accurate; however, CNB is associated with a much lower incidence of harms and morbidity. In addition, women with cancer diagnosed by CNB undergo fewer surgeries during treatment than do women with cancer diagnosed by open biopsy.
The crux of the decision then becomes the question of whether CNB is sufficiently accurate. The answer may vary according to the individual woman's estimated prebiopsy chance of having cancer (an estimate derived from mammography results and other prebiopsy clinical history and examination information) and her desire to avoid risk. For some women, CNB will never be accurate enough to satisfy their desire to know with perceived certainty whether they do or do not have cancer. For others, the greater safety and less invasive nature of CNB is worth the very small sacrifice in accuracy.
For stereotactic-guided, vacuum-assisted CNB, the low rates of DCIS and ADH underestimation may affect treatment planning. The surgeon performing the follow-up open surgical procedure can be reasonably confident that a malignant tumor is not present, and he or she therefore may plan to remove the lesion by using a breast-conserving approach and may decide not to sample the axillary lymph nodes. Some women and physicians may decide that the ADH underestimation rate is low enough to safely substitute surveillance for an open biopsy procedure after diagnosis of ADH by CNB.
The ratings of low strength of evidence apply to the individual estimates of accuracy for each type of CNB. The poor reporting in the included studies meant that the studies may be consistently biased toward finding that CNBs are more accurate than they actually are. Consequently, we treated the studies as possibly having low internal validity. During decision making, women and health care providers must consider the clinical implications of cancer missed on CNB. In many cases, cancer will be detected on subsequent mammography follow-up, resulting in delayed diagnosis. The clinical importance of a few months' delay in diagnosing breast cancer is unclear.
Our review has limitations. Our conclusions are rated as being supported by evidence of low strength (on a 4-item scale from insufficient to high). This rating largely stems from the fact that the evidence base, while large and consistent, includes poorly reported diagnostic cohort studies and poorly reported, unblinded retrospective chart reviews. Poor reporting often made it difficult to determine whether studies were likely to be affected by bias. Studies with aspects known to introduce bias, such as a case–control design or patient selection bias, were excluded. Publication of well-designed, well-reported diagnostic accuracy studies would permit verification that our conclusions are accurate and not influenced by biases in the studies that we included.
We found that both stereotactic-guided, vacuum-assisted and ultrasonography-guided CNB are safer than open surgical biopsy and are almost as accurate as open surgical biopsy, which justifies their routine use. However, well-reported retrospective chart reviews, retrospective database analyses, or prospective diagnostic accuracy studies are needed to address unanswered questions about which factors affect the accuracy and harms of breast CNB. These answers are important for both patients and clinicians to help them decide which type of breast biopsy is best in individual cases. Additional studies of MRI-guided biopsy are needed to evaluate the accuracy and safety of MRI guidance.
In conclusion, we found the highest sensitivity for CNB methods that use stereotactic guidance, particularly in conjunction with vacuum assistance. Ultrasonography-guided CNB also has very high accuracy. In general, women who have CNB undergo fewer surgical procedures than women who initially are diagnosed by open surgical biopsy. Therefore, on the basis of currently available evidence, it seems reasonable to substitute certain CNB procedures for open surgical biopsy, given the similar sensitivity and lower complication rates for some of these percutaneous methods.