Roger Chou, MD; John L. Gore, MD, MS; David Buckley, MD, MPH; Rongwei Fu, PhD; Katie Gustafson, MD; Jessica C. Griffin, MS; Sara Grusing, BA; Shelley Selph, MD
This article was published online first at www.annals.org on 27 October 2015.
Disclaimer: The authors of this manuscript are responsible for its content. Statements in the manuscript should not be construed as endorsement by the Agency for Healthcare Research and Quality or the U.S. Department of Health and Human Services. The Agency for Healthcare Research and Quality retains a license to display, reproduce, and distribute the data and the report from which this manuscript was derived under the terms of the agency's contract with the author.
Financial Support: By the Agency for Healthcare Research and Quality; U.S. Department of Health and Human Services (contract HHSA290201200014I).
Disclosures: Dr. Chou reports grants from the Agency for Healthcare Research and Quality during the conduct of the study. Dr. Gore reports grants from the Agency for Healthcare Research and Quality during the conduct of the study. Dr. Fu reports grants from the Agency for Healthcare Research and Quality during the conduct of the study. Authors not named here have disclosed no conflicts of interest. Disclosures can also be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M15-0997.
Editors' Disclosures: Christine Laine, MD, MPH, Editor in Chief, reports that she has no financial relationships or interests to disclose. Darren B. Taichman, MD, PhD, Executive Deputy Editor, reports that he has no financial relationships or interests to disclose. Cynthia D. Mulrow, MD, MSc, Senior Deputy Editor, reports that she has no financial relationships or interests to disclose. Deborah Cotton, MD, MPH, Deputy Editor, reports that she has no financial relationships or interests to disclose. Jaya K. Rao, MD, MHS, Deputy Editor, reports that she has stock holdings/options in Eli Lilly and Pfizer. Sankey V. Williams, MD, Deputy Editor, reports that he has no financial relationships or interests to disclose. Catharine B. Stack, PhD, MS, Deputy Editor for Statistics, reports that she has stock holdings in Pfizer.
Requests for Single Reprints: Roger Chou, MD, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Mail Code BICC, Portland, OR 97239; e-mail, firstname.lastname@example.org.
Current Author Addresses: Drs. Chou, Buckley, Fu, Gustafson, and Selph; Ms. Griffin; and Ms. Grusing: Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Mail Code BICC, Portland, OR 97239.
Dr. Gore: Department of Urology, University of Washington, Box 356510, Seattle, WA 98195.
Author Contributions: Conception and design: R. Chou, J.L. Gore, D. Buckley.
Analysis and interpretation of the data: R. Chou, J.L. Gore, R. Fu, J.C. Griffin, S. Selph.
Drafting of the article: R. Chou, R. Fu, J.C. Griffin.
Critical revision of the article for important intellectual content: R. Chou, J.L. Gore.
Final approval of the article: R. Chou, J.L. Gore, D. Buckley, R. Fu, K. Gustafson, J.C. Griffin, S. Grusing, S. Selph.
Statistical expertise: R. Fu, S. Selph.
Obtaining of funding: R. Chou.
Administrative, technical, or logistic support: R. Chou, J.C. Griffin, S. Grusing.
Collection and assembly of data: R. Chou, J.L. Gore, D. Buckley, K. Gustafson, J.C. Griffin, S. Grusing.
Chou R, Gore JL, Buckley D, Fu R, Gustafson K, Griffin JC, et al. Urinary Biomarkers for Diagnosis of Bladder Cancer: A Systematic Review and Meta-analysis. Ann Intern Med. 2015;163:922-931. doi: 10.7326/M15-0997
Published: Ann Intern Med. 2015;163(12):922-931.
Published at www.annals.org on 15 December 2015
Background: Urinary biomarkers may be a useful alternative or adjunct to cystoscopy for diagnosis of bladder cancer.
Purpose: To systematically review the evidence on the accuracy of urinary biomarkers for diagnosis of bladder cancer in adults who have signs or symptoms of the disease or are undergoing surveillance for recurrent disease.
Data Sources: Ovid MEDLINE (January 1990 through June 2015), Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, and reference lists.
Study Selection: 57 studies that evaluated the diagnostic accuracy of quantitative or qualitative nuclear matrix protein 22 (NMP22), qualitative or quantitative bladder tumor antigen (BTA), fluorescence in situ hybridization (FISH), fluorescent immunohistochemistry (ImmunoCyt [Scimedx]), and Cxbladder (Pacific Edge Diagnostics USA) using cystoscopy and histopathology as the reference standard met inclusion criteria. Case–control studies were excluded.
Data Extraction: Dual extraction and quality assessment of individual studies. Overall strength of evidence (SOE) was also assessed.
Data Synthesis: Across biomarkers, sensitivities ranged from 0.57 to 0.82 and specificities ranged from 0.74 to 0.88. Positive likelihood ratios ranged from 2.52 to 5.53, and negative likelihood ratios ranged from 0.21 to 0.48 (moderate SOE for quantitative NMP22, qualitative BTA, FISH, and ImmunoCyt; low SOE for others). For some biomarkers, sensitivity was higher for initial diagnosis of bladder cancer than for diagnosis of recurrence. Sensitivity increased with higher tumor stage or grade. Studies that directly compared the accuracy of quantitative NMP22 and qualitative BTA found no differences in diagnostic accuracy (moderate SOE); head-to-head studies of other biomarkers were limited. Urinary biomarkers plus cytologic evaluation were more sensitive than biomarkers alone but missed about 10% of bladder cancer cases.
Limitation: Restricted to English-language studies; no search for studies published only as abstracts; statistical heterogeneity present in most analyses; few studies for qualitative NMP22, quantitative BTA, and Cxbladder; and methodological shortcomings in almost all studies.
Conclusion: Urinary biomarkers miss a substantial proportion of patients with bladder cancer and are subject to false-positive results in others. Accuracy is poor for low-stage and low-grade tumors.
Primary Funding Source: Agency for Healthcare Research and Quality. (PROSPERO registration number: CRD42014013284)
Bladder cancer is the fourth most commonly diagnosed cancer in U.S. men and the 10th most commonly diagnosed cancer in U.S. women (1). Standard methods for diagnosis of bladder cancer involve cytologic evaluation of urine, imaging tests, and cystoscopy (2, 3). Because cystoscopy is uncomfortable and costly, alternative diagnostic methods have been sought. Urine-based biomarkers have been developed as potential alternatives or adjuncts to standard tests for the initial diagnosis of bladder cancer or identification of recurrent disease (4).
Six urinary biomarkers have been approved by the U.S. Food and Drug Administration (FDA) for diagnosis or surveillance of bladder cancer: quantitative nuclear matrix protein 22 (NMP22) (Alere NMP22 [Alere]), qualitative NMP22 (BladderChek [Alere]), qualitative bladder tumor antigen (BTA) (BTA stat [Polymedco]), quantitative BTA (BTA TRAK [Polymedco]), fluorescence in situ hybridization (FISH) (UroVysion [Abbott Molecular]), and fluorescent immunohistochemistry (ImmunoCyt [Scimedx]). The qualitative NMP22 and BTA tests can be performed as point-of-care tests, and the others are performed in a laboratory. One additional test, Cxbladder (Pacific Edge Diagnostics USA), is a “laboratory-developed test” that does not require FDA approval. Other biomarkers have been developed but are not FDA-approved.
The purpose of this study was to systematically review the evidence on the comparative accuracy of urinary biomarkers for diagnosis of bladder cancer. It was done as part of a larger review (5) on the evaluation and treatment of non–muscle-invasive bladder cancer that was nominated to the Agency for Healthcare Research and Quality (AHRQ) by the American Urological Association for use in updating its guidelines.
Detailed methods and data for this review, including the analytic framework, key questions, search strategies, inclusion criteria, study data extraction, and quality ratings, are available in the full report (5). The protocol was developed using a standardized process (6) with input from experts and the public and is registered in the PROSPERO database (7). This article focuses on the accuracy of urinary biomarkers for initial diagnosis of bladder cancer or for diagnosis of recurrent disease, including any variance in diagnostic accuracy based on tumor characteristics, patient characteristics, or the nature of presenting signs or symptoms.
A research librarian searched multiple electronic databases, including Ovid MEDLINE (January 1990 through June 2015), the Cochrane Central Register of Controlled Trials, and the Cochrane Database of Systematic Reviews (through June 2015). We also reviewed reference lists and searched ClinicalTrials.gov.
Two investigators independently reviewed abstracts and full-text articles against prespecified eligibility criteria. We included cross-sectional and cohort studies on the diagnostic accuracy of urinary biomarkers in adults who had signs or symptoms of bladder cancer or were undergoing surveillance for recurrent disease after treatment. We focused on urinary biomarkers approved by the FDA for the diagnosis of bladder cancer (quantitative or qualitative NMP22, qualitative or quantitative BTA, FISH, and ImmunoCyt) or classified by the FDA as a laboratory-developed test (Cxbladder). We excluded studies that used a case–control design; studies that did not evaluate the diagnostic accuracy of biomarkers against standard diagnostic methods (cystoscopy and histopathology); and studies of biomarkers for screening, assessment of prognosis, guidance of therapy, or monitoring of response to treatment.
One investigator extracted details about the setting, tests evaluated, definition of a positive test result, study design, reference standard, inclusion criteria, population characteristics, proportion found to have bladder cancer, bladder cancer stage and grade, results, and funding sources. We constructed 2 × 2 tables with the number of true-positive, false-positive, true-negative, and false-negative results from published sample sizes, prevalence, sensitivity, and specificity. A second investigator verified extractions for accuracy.
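The 2 × 2 reconstruction described above is straightforward arithmetic; a minimal sketch follows (the function name, rounding convention, and example values are illustrative assumptions, not taken from the review):

```python
def two_by_two(n, prevalence, sensitivity, specificity):
    """Rebuild TP/FP/FN/TN counts from a study's published summary statistics.

    Counts are rounded to the nearest integer, so reconstructed cells may
    differ slightly from the true study counts when the published
    prevalence, sensitivity, and specificity were themselves rounded.
    """
    diseased = round(n * prevalence)       # patients with bladder cancer
    nondiseased = n - diseased             # patients without bladder cancer
    tp = round(diseased * sensitivity)     # true positives
    fn = diseased - tp                     # false negatives
    tn = round(nondiseased * specificity)  # true negatives
    fp = nondiseased - tn                  # false positives
    return tp, fp, fn, tn

# Hypothetical study: 200 patients, 25% prevalence, sensitivity 0.80, specificity 0.90
print(two_by_two(200, 0.25, 0.80, 0.90))  # (40, 15, 10, 135)
```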
Two investigators independently assessed the risk of bias for each study as low, medium, or high using criteria adapted from QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies 2) (8). Discrepancies were resolved through discussion and consensus.
We performed meta-analyses for sensitivity and specificity using a bivariate logistic mixed-effects model (9) with SAS, version 10.0 (SAS Institute) (10). We assumed random effects with a bivariate normal distribution and measured statistical heterogeneity with the random-effects variance (τ2). When few studies were available for an analysis, we used the moment estimates of correlation between sensitivity and specificity in the bivariate model. We calculated positive and negative likelihood ratios (LRs) using the summarized sensitivity and specificity (11, 12). Because studies of a particular biomarker generally used the same definition for a positive test result, we did not plot summary receiver-operating characteristic curves (13). For head-to-head comparisons, we used the same bivariate logistic mixed-effects model, with an added indicator variable for the tests.
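The likelihood-ratio step is deterministic once sensitivity and specificity are pooled: LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. A minimal sketch (the function name is ours; small differences from the published LRs reflect rounding of the pooled estimates):

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from pooled accuracy estimates."""
    lr_pos = sensitivity / (1 - specificity)  # how much a positive result raises the odds
    lr_neg = (1 - sensitivity) / specificity  # how much a negative result lowers the odds
    return lr_pos, lr_neg

# Check against the pooled quantitative NMP22 estimates (sensitivity 0.69, specificity 0.77)
lr_pos, lr_neg = likelihood_ratios(0.69, 0.77)
print(round(lr_pos, 2), round(lr_neg, 2))  # 3.0 0.4
```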
We conducted analyses for each biomarker by using data from all patients and data stratified according to whether testing was performed for initial diagnosis (evaluation of symptoms) or diagnosis of recurrence (surveillance). We also performed analyses stratified by study design features (such as retrospective or prospective or use of a prespecified threshold to define a positive test result), risk of bias (overall and whether the study performed blinding to the results of the index test), the country in which the study was conducted, and tumor grade and stage (14).
We assessed the strength of evidence (SOE) for each body of evidence as high, moderate, low, or insufficient based on aggregate study quality, precision, consistency, and directness.
This project was funded under contract HHSA290201200014I from the AHRQ, U.S. Department of Health and Human Services. AHRQ staff assisted in developing the scope and key questions. The AHRQ had no role in study selection, quality assessment, or synthesis.
The literature flow diagram (Figure 1) summarizes the search and selection of articles. Database searches resulted in 4358 potentially relevant articles. After dual review of abstracts and titles, we selected 262 articles for full-text dual review and determined that 57 studies (in 60 publications) met our inclusion criteria (Appendix Table 1) (15-74). Nineteen studies evaluated quantitative NMP22, 4 evaluated qualitative NMP22, 23 evaluated qualitative BTA, 4 evaluated quantitative BTA, 10 evaluated FISH, 13 evaluated ImmunoCyt, and 1 evaluated Cxbladder. Sample sizes ranged from 26 to 3916, mean age ranged from 54 to 77 years, the proportion of male patients ranged from 57% to 88%, and the proportion diagnosed with bladder cancer ranged from 3% to 81%. Eight studies focused on diagnostic testing for signs and symptoms suggestive of bladder cancer, 16 focused on surveillance of previously treated bladder cancer, and 19 evaluated mixed populations. Forty-three studies were conducted in the United States or Europe. We rated 2 studies as having low risk of bias (20, 21), 3 as having high risk of bias (25, 62, 68), and the remainder as having medium risk of bias. Frequent methodological shortcomings were failure to report blinded interpretation of the reference standard, failure to report enrollment of a random or consecutive sample of patients, or failure to report predefined criteria for a positive test result.
Summary of evidence search and selection.
* Cochrane Central Register of Controlled Trials and Cochrane Database of Systematic Reviews.
† Includes prior reports, reference lists of relevant articles, and systematic reviews.
Appendix Table 1. Biomarker Study Characteristics
Sensitivity of quantitative NMP22 was 0.69 (95% CI, 0.62 to 0.75), and specificity was 0.77 (CI, 0.70 to 0.83) (19 studies), for a positive LR of 3.05 (CI, 2.28 to 4.10) and a negative LR of 0.40 (CI, 0.32 to 0.50) (Appendix Figure 1). Exclusion of 2 studies that used a cutoff other than >10 U/mL for a positive test result (18, 37) resulted in similar sensitivity and specificity. Diagnostic accuracy was similar for evaluation of symptoms and for surveillance. Excluding 1 study with high risk of bias (68) and restricting the analysis to prospective studies, those conducted in the United States or Europe, or those that used a prespecified threshold for a positive test result had little effect on pooled estimates. Restricting the analysis to 3 studies with blinded reference standard interpretation resulted in higher specificity (0.89 [CI, 0.78 to 0.95]) (15, 42, 58) (5).
Sensitivity and specificity of quantitative NMP22.
NMP22 = nuclear matrix protein 22; TN = true-negative; TP = true-positive.
Sensitivity of qualitative NMP22 was 0.58 (CI, 0.39 to 0.75), and specificity was 0.88 (CI, 0.78 to 0.94) (4 studies), for a positive LR of 4.89 (CI, 3.23 to 7.40) and a negative LR of 0.48 (CI, 0.33 to 0.71) (Appendix Figure 2) (20, 21, 23, 37). Restricting the analysis to 2 studies with low risk of bias resulted in similar estimates (sensitivity, 0.53 [CI, 0.29 to 0.75]; specificity, 0.87 [CI, 0.74 to 0.94]) (20, 21). Subgroup and sensitivity analyses were limited by small numbers of studies.
Sensitivity and specificity of qualitative NMP22.
Sensitivity of qualitative BTA was 0.64 (CI, 0.58 to 0.69) (22 studies), and specificity was 0.77 (CI, 0.73 to 0.81) (21 studies), for a positive LR of 2.80 (CI, 2.31 to 3.39) and a negative LR of 0.47 (CI, 0.30 to 0.55) (Appendix Figure 3). Exclusion of 3 studies with high risk of bias (25, 62, 68) resulted in similar pooled estimates. Sensitivity was higher for evaluation of symptoms (0.76 [CI, 0.67 to 0.83]) than for surveillance (0.60 [CI, 0.55 to 0.65]), but specificity was similar (0.78 [CI, 0.66 to 0.87] and 0.76 [CI, 0.69 to 0.83], respectively). Restricting analyses to studies that used a prospective design, those that were conducted in the United States or Europe, or those that reported blinded reference standard interpretation had little effect on pooled estimates.
Sensitivity and specificity of qualitative BTA.
BTA = bladder tumor antigen; TN = true-negative; TP = true-positive.
Sensitivity of quantitative BTA was 0.65 (CI, 0.54 to 0.75), and specificity was 0.74 (CI, 0.64 to 0.82) (4 studies), for a positive LR of 2.52 (CI, 1.86 to 3.41) and a negative LR of 0.47 (CI, 0.37 to 0.61) (Appendix Figure 4) (19, 27, 54, 68). Estimates were similar in 3 studies that used a threshold greater than 14 U/mL for a positive test result (27, 54, 68) and when 1 study with high risk of bias was omitted (68).
Sensitivity and specificity of quantitative BTA.
Sensitivity of FISH was 0.63 (CI, 0.50 to 0.75), and specificity was 0.87 (CI, 0.79 to 0.93) (11 studies), for a positive LR of 5.02 (CI, 2.93 to 8.60) and a negative LR of 0.42 (CI, 0.30 to 0.59) (Appendix Figure 5). Estimates were similar when 1 study with high risk of bias (25) was excluded and when the analysis was restricted to studies that used a prospective design or reported interpretation of the reference standard blinded to FISH results. For surveillance, sensitivity was 0.55 (CI, 0.36 to 0.72) and specificity was 0.80 (CI, 0.66 to 0.89). For evaluation of symptoms, sensitivity of FISH was 0.73 (CI, 0.50 to 0.88) (40, 69). Only 1 study reported specificity for evaluation of symptoms (0.95 [CI, 0.87 to 0.98]) (69).
Sensitivity and specificity of FISH.
FISH = fluorescence in situ hybridization; TN = true-negative; TP = true-positive.
Sensitivity of ImmunoCyt was 0.78 (CI, 0.68 to 0.85), and specificity was 0.78 (CI, 0.72 to 0.82) (14 studies), for a positive LR of 3.49 (CI, 2.82 to 4.32) and a negative LR of 0.29 (CI, 0.20 to 0.41) (Appendix Figure 6). Excluding 1 study with high risk of bias (71) and restricting the analysis to prospective studies had little effect on the estimates. For evaluation of symptoms, sensitivity was 0.85 (CI, 0.78 to 0.90) and specificity was 0.83 (CI, 0.77 to 0.87) (31, 35, 59, 63, 64, 66, 67); for surveillance, sensitivity was 0.75 (CI, 0.64 to 0.83) and specificity was 0.76 (CI, 0.70 to 0.81) (31, 33, 35, 60, 63, 64, 70, 73).
Sensitivity and specificity of ImmunoCyt.
TN = true-negative; TP = true-positive.
One study of Cxbladder with medium risk of bias reported sensitivity of 0.82 (CI, 0.70 to 0.90) and specificity of 0.85 (CI, 0.81 to 0.88) for evaluation of symptoms, for a positive LR of 5.53 (CI, 4.28 to 7.15) and a negative LR of 0.21 (CI, 0.13 to 0.36) (Appendix Table 2) (37).
Appendix Table 2. Test Performance of Urinary Biomarkers for Diagnosis of Bladder Cancer
Relatively few studies directly compared the diagnostic accuracy of different urinary biomarkers against cystoscopy and biopsy in the same population (Appendix Table 3). In 7 studies, there were no differences between quantitative NMP22 (based on a cutoff of >10 U/mL) and qualitative BTA in sensitivity (0.69 [CI, 0.62 to 0.76] vs. 0.66 [CI, 0.59 to 0.73]) or specificity (0.73 [CI, 0.62 to 0.82] vs. 0.76 [CI, 0.66 to 0.84]) (17, 24, 45, 52, 68, 72, 74). Findings were similar when 1 study with high risk of bias was excluded (68), when the analysis was restricted to prospective studies (45, 52, 74), and when analyses were stratified by tumor stage or grade.
Appendix Table 3. Direct (Within-Study) Comparisons of Diagnostic Accuracy of Urinary Biomarkers for Diagnosis of Bladder Cancer
Three studies found that ImmunoCyt was associated with higher sensitivity than FISH (0.71 [CI, 0.54 to 0.84] vs. 0.61 [CI, 0.43 to 0.76]; difference, 0.11 [CI, 0.001 to 0.21]) but lower specificity (0.71 [CI, 0.62 to 0.79] vs. 0.79 [CI, 0.71 to 0.85]; difference, −0.08 [CI, −0.15 to −0.001]) (60, 70, 72). When data were stratified by tumor stage or grade, ImmunoCyt was associated with higher sensitivity than FISH for tumor stage Ta and T1 and low-grade cancer (differences in sensitivity ranged from 0.24 to 0.35).
Evidence on other head-to-head comparisons of urinary biomarkers was based on 1 or 2 studies, which precluded reliable conclusions about comparative test performance (Appendix Table 3).
Sixteen studies found that various urinary biomarkers plus cytologic evaluation were associated with higher sensitivity than the biomarker alone (0.81 [CI, 0.75 to 0.86] vs. 0.69 [CI, 0.61 to 0.76]; difference, 0.13 [CI, 0.08 to 0.17]), with no difference in specificity (Appendix Table 3). Results were similar in a subgroup of 8 studies of ImmunoCyt plus cytologic evaluation versus ImmunoCyt alone. We found no clear differences in sensitivity for tumor stage Ta or T1 or low-grade bladder cancer (16, 33, 35, 64, 70–73).
Across urinary biomarkers, sensitivity increased with higher tumor stage (Appendix Table 4). For quantitative NMP22, qualitative BTA, and FISH, differences in sensitivity ranged from 0.23 to 0.30 between T1 and Ta tumors and from 0.10 to 0.15 between T2 or higher-stage tumors and T1 tumors. Sensitivity for carcinoma in situ was similar to or slightly lower than sensitivity for T1 tumors. For ImmunoCyt, the association between higher tumor stage and increased sensitivity was less clear. Sensitivity was 0.74 for Ta tumors (CI, 0.63 to 0.83) and increased to 0.81 for T1 and T2 or higher-stage tumors.
Appendix Table 4. Sensitivity of Urinary Biomarkers for Bladder Cancer, by Tumor Stage and Grade
Sensitivity also increased with higher tumor grade (Appendix Table 4). For quantitative NMP22, qualitative BTA, and FISH, differences in sensitivity ranged from 0.14 to 0.28 between G2 and G1 tumors and from 0.16 to 0.21 between G3 and G2 tumors. For ImmunoCyt, differences based on tumor grade were less pronounced, with sensitivity of 0.74 (CI, 0.66 to 0.80) for low-grade tumors and 0.83 (CI, 0.78 to 0.88) for high-grade tumors, for a difference of 0.10 (CI, 0.03 to 0.17).
We observed similar associations with tumor stage and grade for other urinary biomarkers (Appendix Table 4). However, estimates were based on smaller numbers of studies and were less precise, and differences were not always statistically significant.
One study that stratified data by tumor size found higher sensitivity of qualitative BTA for tumors measuring 2 to 5 cm (0.96) and larger than 5 cm (1.0) than for tumors smaller than 2 cm (0.60) (P < 0.001) (41). Another study found higher sensitivity of FISH for tumors measuring 1 to 3 cm (0.93) or larger than 3 cm (0.94) than for tumors smaller than 1 cm (0.46) (P = 0.001) (40). One study that stratified data by the number of tumors found that quantitative and qualitative NMP22 and Cxbladder were each associated with higher sensitivity for multifocal versus unifocal tumors, although differences were not statistically significant (37).
Few studies evaluated effects of patient characteristics on the diagnostic accuracy of urinary biomarkers. Diagnostic accuracy did not clearly differ according to sex, age, smoking status, or receipt of prior intravesical therapy (20, 37, 43, 50). Diagnostic accuracy of ImmunoCyt was similar in studies that specifically enrolled patients with microscopic or macroscopic hematuria (59, 66, 67) and studies that enrolled patients with various signs or symptoms of bladder cancer.
Eight studies of various urinary biomarkers did not find consistent differences in specificity according to such factors as presence of other urological cancer types, renal calculi, prostatitis, benign prostatic hypertrophy, urinary tract infection, or hematuria, although specificity was higher in some studies when other urological conditions were not present (35, 37, 52, 54, 62, 66, 67, 74).
Urinary biomarkers were associated with sensitivities for bladder cancer that ranged from 0.57 to 0.82 (Figure 2) and specificities that ranged from 0.74 to 0.88 (Figures 2 and 3 and Appendix Table 2). Strength-of-evidence ratings are shown in Table 1. Positive LRs ranged from 2.52 to 5.53, and negative LRs ranged from 0.21 to 0.48 (75). Findings were robust in sensitivity and stratified analyses. Evidence was strongest for quantitative NMP22, qualitative BTA, FISH, and ImmunoCyt (moderate SOE) and relatively sparse for other biomarkers (low SOE). Across urinary biomarkers, sensitivity was greater for higher-stage and higher-grade tumors (high SOE). For qualitative BTA, FISH, and ImmunoCyt, sensitivity was higher for initial diagnosis in persons with signs or symptoms of bladder cancer than for diagnosis of recurrence. However, accuracy did not differ by testing indication for quantitative NMP22. Studies that directly compared the accuracy of quantitative NMP22 and qualitative BTA found no differences in diagnostic accuracy (moderate SOE). ImmunoCyt was associated with higher sensitivity than FISH (difference, 0.11) but lower specificity (difference, 0.08), based on 3 studies (low SOE). Head-to-head comparisons of other urinary biomarkers were too limited to reach firm conclusions about comparative accuracy. Sensitivity increased with the use of urinary biomarkers in conjunction with cytologic evaluation versus the biomarkers alone (moderate SOE). Findings were relatively robust in sensitivity and subgroup analyses based on study quality, variability in test cutoffs, country, and other factors.
Pooled sensitivities of urinary biomarkers.
BTA = bladder tumor antigen; FISH = fluorescence in situ hybridization; NMP22 = nuclear matrix protein 22; TP = true-positive.
Pooled specificities of urinary biomarkers.
BTA = bladder tumor antigen; FISH = fluorescence in situ hybridization; NMP22 = nuclear matrix protein 22; TN = true-negative.
Table 1. Strength-of-Evidence Ratings
Our findings on diagnostic accuracy were consistent with those of prior systematic reviews (76–78). Strengths of our review include the exclusion of case–control studies, which can overestimate diagnostic accuracy (79); use of bivariate binomial models to pool data that account for the correlation between sensitivity and specificity (12); incorporation of more recently published studies; separate analyses of head-to-head comparisons (14); and evaluation of subgroups based on patient and tumor characteristics.
As detailed in the full report, no studies have evaluated effects on clinical outcomes of using urinary biomarkers (5). Therefore, decisions about their use must be made on the basis of diagnostic test performance. Table 2 shows estimated probabilities for bladder cancer after use of urinary biomarkers, based on LRs calculated from pooled sensitivities and specificities. The observed LRs generally fell within the range indicating relatively small changes in the probability that a patient does or does not have bladder cancer (75). Urinary biomarkers miss 18% to 43% of patients with bladder cancer and yield false-positive results in 12% to 26% of patients without bladder cancer. The value of urinary biomarkers, and whether they are sufficiently accurate to reduce the need for cystoscopy, depends on the clinician's ability to estimate the pretest probability of disease, the importance to patients and clinicians of relatively small changes in the probability of bladder cancer, and the acceptable threshold for and clinical consequences of missed or delayed diagnoses and false-positive results. Obtaining urinary biomarkers in combination with cytologic evaluation would increase the sensitivity but would still result in about 10% of cases being missed.
Table 2. Posttest Probability of Bladder Cancer With Use of Different Biomarkers
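Posttest probabilities of this kind follow from Bayes' theorem on the odds scale: posttest odds = pretest odds × LR. A minimal sketch (the function name and the 10% pretest probability are illustrative assumptions; the LRs are the pooled Cxbladder estimates):

```python
def posttest_probability(pretest_prob, likelihood_ratio):
    """Update a pretest probability with a test's likelihood ratio (odds form of Bayes)."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

# Illustration with the pooled Cxbladder LRs (LR+ 5.53, LR- 0.21)
# at an assumed 10% pretest probability of bladder cancer:
print(round(posttest_probability(0.10, 5.53), 2))   # 0.38 after a positive result
print(round(posttest_probability(0.10, 0.21), 3))   # 0.023 after a negative result
```

A positive result at this pretest probability raises the probability only to about 38%, illustrating why the review characterizes these LRs as producing relatively small changes in probability.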
Our review had limitations. As in other meta-analyses of diagnostic accuracy, substantial statistical heterogeneity was present in most pooled analyses (80–82). To address anticipated heterogeneity, we used random-effects models to pool studies; stratified studies according to the reason that testing was done; and performed additional analyses based on study design and test, patient, and tumor characteristics. We also performed separate analyses based on head-to-head comparisons of biomarkers, which tended to be associated with less heterogeneity than pooled across-study estimates. A limitation of our analysis of within-study comparisons is that we had to treat the compared groups independently because we had only aggregated data, which might have resulted in overly wide CIs. However, point estimates indicated little difference between tests. We also excluded non–English-language articles and did not search for studies published only as abstracts; focused on FDA-approved tests or laboratory-developed tests; and did not evaluate other potential uses of biomarkers, such as evaluation of cytologic atypia, determination of prognosis, guidance for treatment, or assessment of recurrence risk (83, 84).
Limitations of the evidence base include few studies for biomarkers other than quantitative NMP22, qualitative BTA, FISH, and ImmunoCyt. In addition, almost all studies on the diagnostic accuracy of biomarkers had methodological shortcomings. Few studies evaluated effects of patient characteristics, such as age, sex, race, or presence of other urological conditions, on diagnostic test performance. More research is needed to understand optimal combinations of urinary biomarkers with or without cytologic evaluation and to evaluate sequences of diagnostic tests applied in prespecified algorithms, including the effect on use of cystoscopy and clinical outcomes.
In conclusion, urinary biomarkers miss a substantial proportion of patients with bladder cancer and yield false-positive results in others. Diagnostic accuracy may be slightly higher for initial diagnosis of bladder cancer in patients with signs and symptoms than for surveillance, and accuracy is poor for low-stage and low-grade tumors. Urinary biomarkers in combination with cytologic evaluation are more accurate than biomarkers alone; research is needed to understand how the use of these biomarkers with other diagnostic tests affects the use of cystoscopy and clinical outcomes.