Meta-analysis: New Tests for the Diagnosis of Latent Tuberculosis Infection: Areas of Uncertainty and Recommendations for Research

Dick Menzies, MD, MSc; Madhukar Pai, MD, PhD; and George Comstock, MD, DrPH

From McGill University, Montréal, Québec, Canada, and Johns Hopkins University, Baltimore, Maryland.

Acknowledgments: The authors thank Drs. Janice Pogoda, Peter Barnes, Philip Hill, Thomas Meier, Peter Andersen, and Delia Goletti for providing additional information.

Grant Support: None.

Potential Financial Conflicts of Interest: None disclosed.

Requests for Single Reprints: Dick Menzies, MD, MSc, Respiratory Epidemiology and Clinical Research Unit, Montréal Chest Institute, 3650 St-Urbain, Room K1.24, Montréal, Québec H2X 2P4, Canada; e-mail, dick.menzies@mcgill.ca.

Current Author Addresses: Dr. Menzies: Respiratory Epidemiology and Clinical Research Unit, Montréal Chest Institute, 3650 St-Urbain, Room K1.24, Montréal, Québec H2X 2P4, Canada.

Dr. Pai: Department of Epidemiology, Biostatistics & Occupational Health, McGill University, 1020 Pine Avenue West, Montréal, Québec H3A 1A2, Canada.

Dr. Comstock: Johns Hopkins Training Center for Public Health Research, PO Box 2067, 1302 Pennsylvania Avenue, Hagerstown, MD 21742-3230.

Author Contributions: Conception and design: D. Menzies, M. Pai.

Analysis and interpretation of the data: D. Menzies, M. Pai, G. Comstock.

Drafting of the article: D. Menzies.

Critical revision of the article for important intellectual content: M. Pai, G. Comstock.

Final approval of the article: M. Pai, G. Comstock.

Statistical expertise: D. Menzies, M. Pai.

Collection and assembly of data: D. Menzies, M. Pai.

Ann Intern Med. 2007;146(5):340-354. doi:10.7326/0003-4819-146-5-200703060-00006
Appendix Figure.
Summary of literature search and study selection.*

* TB = tuberculosis.

Figure 1.
Forest plot of studies estimating sensitivity of the 3 tests in patients with active tuberculosis as a surrogate for latent tuberculosis infection.

Point estimates for sensitivity and 95% CIs are shown. QFT = QuantiFERON; TST = tuberculin skin test.

Figure 2.
Forest plot of studies estimating specificity of the 3 tests in populations at very low risk for latent tuberculosis infection.

Point estimates for specificity and 95% CIs are shown. BCG = bacille Calmette–Guérin; QFT = QuantiFERON; TST = tuberculin skin test.

QuantiFERON-TB Gold or Tuberculin Skin Test
Posted on March 19, 2007
Jerome A Boscia
Centocor Research and Development Inc
Conflict of Interest: None Declared

In their interesting meta-analysis, Menzies and colleagues state in their introduction "The U.S. Centers for Disease Control and Prevention [CDC] has recommended that QFT [QuantiFERON-TB Gold] replace the tuberculin skin test [TST]..." Please note that the actual publication that Menzies and colleagues reference (1) states in its summary "...CDC recommends that QFT-G [QuantiFERON-TB Gold] may be used in all circumstances in which the TST is currently used..." and in its text "QFT-G can be used in all circumstances in which the TST is used..." and "QFT-G usually can be used in place of (and not in addition to) the TST." The CDC clearly uses the words "...may...," "...can..." and "...usually can..." but does not "...recommend that QFT replace the TST..."

1. Mazurek GH, Jereb J, Lobue P, Iademarco MF, Metchock B, Vernon A. Guidelines for using the QuantiFERON-TB Gold test for detecting Mycobacterium tuberculosis infection, United States. MMWR Recomm Rep. 2005;54:49-55.

Conflict of Interest:

Dr. Boscia is Senior Vice President for Clinical Research and Development for Centocor Research and Development Inc. He reports holding stock and stock options in Johnson & Johnson, the parent company of Centocor Research and Development Inc.

Recommendations for research should include avoidance of sensitivity and specificity estimation
Posted on March 23, 2007
Dr. Heinke Kunst
Consultant & Honorary Senior Lecturer, Birmingham Heartlands Hospital & University of Birmingham, UK
Conflict of Interest: None Declared

We read the recent meta-analysis of new tests for latent tuberculosis (TB) infection (1) with interest and generally concurred with the authors' inferences for research and practice. As the target condition in this review is without a gold standard, we were dissatisfied with the use of sensitivity and specificity, measures of test accuracy that cannot be computed without verification of disease status, in data synthesis. We feel it is important that researchers establish the evidence base for incorporation of new tests in clinical practice concerning latent TB infection using sound methods for evaluation of tests without a gold standard (2), and we take this opportunity to highlight some key issues important for such research.

Half a century ago the statistics sensitivity and specificity were introduced to express the diagnostic accuracy (3;4) in studies where findings of medical tests were compared between patients known to have the disease of interest and subjects without disease using a 2x2 table cross classifying index test and gold standard results. Nowadays these measures of test accuracy are so ubiquitous that researchers often feel compelled to estimate them even when the disease status of study subjects cannot be easily verified. In this situation, estimates of diagnostic accuracy are biased due to the poverty of standards used for verification of disease.(2) In latent TB infection there is no readily applicable gold standard so it is impossible to ascertain sensitivity and specificity directly. Menzies et al (1) used surrogate gold standards taking active TB cases to estimate sensitivity and low-risk subjects to estimate specificity. This approach has many problems, some of which are acknowledged in the review. As data for the two accuracy measures are derived from different sources, it is not possible to study the relationship between sensitivity and specificity which is a critical step in evaluating threshold effect and making decisions about pooling data in meta-analysis.(5) By definition latent TB is a condition among otherwise healthy subjects, so accuracy estimates derived from these data are likely to misrepresent the true performance of the index tests in latent TB. Mixing up different conditions into the dataset introduces spectrum bias.(6) Thus neither sensitivity in active TB nor specificity in apparently healthy low risk subjects without verification of absence of disease can be generalized for application in practice concerning latent TB infection.
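The 2x2 cross-classification described above can be illustrated with a minimal sketch. The counts are hypothetical and are not taken from any study in the review, and the function names are illustrative only. The point of the sketch is that both measures need a verified disease status for every subject, which is what is unavailable for latent TB infection.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval (about 95% for z=1.96) for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity from a 2x2 table of index test vs gold standard.

    tp, fn: diseased subjects with positive / negative index test
    tn, fp: non-diseased subjects with negative / positive index test
    """
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_ci(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_ci(tn, tn + fp),
    }


# Hypothetical counts, for illustration only.
result = accuracy_from_2x2(tp=80, fp=5, fn=20, tn=95)
print(result)
```

When the diseased and non-diseased columns come from different surrogate populations, as in the review, the two proportions are still computable, but they no longer describe one table from one clinically relevant population.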

We would like to emphasize that an approach to index test validation that goes beyond the diagnostic accuracy paradigm (2) is needed for an alternative evaluation process in the absence of a gold standard. A recently published health technology assessment report provides such an approach to evaluate tests for latent TB infection.(7) As there is no gold standard for latent TB infection against which the comparative accuracy of the tuberculin skin test (TST) and interferon gamma assays (IGRA) could be assessed, the validation process should explore meaningful relationships between index test results and other test results and clinical characteristics. For example, the risk of latent TB infection is greatest among those TB contacts who share a room with the index case for the greatest length of time. A validation study can therefore evaluate whether results of tests for latent TB infection correlate with level of exposure: an outbreak investigation ascertains exposure to TB, performs the various index tests of interest in eligible subjects, and compares test results with exposure status. Further validation of the role of new tests vs TST may be obtained through studies comparing the likelihood of false results in BCG vaccination and HIV infection. Yet other validation may come from observational studies evaluating whether new test results are predictive of development of active tuberculosis in the future. Menzies et al (1) do touch on these aspects but their meta-analysis focuses on sensitivity and specificity.
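As a rough sketch of the exposure-gradient idea, the snippet below tabulates the proportion of positive test results at each exposure level and checks whether positivity rises with exposure. The exposure coding (0 = lowest, 2 = highest), the data, and the function names are all hypothetical; a real analysis would use a formal trend test and adjust for confounders.

```python
def positivity_by_exposure(records):
    """Proportion of positive results per exposure level.

    records: iterable of (exposure_level, test_positive) pairs, where
    exposure_level is an orderable code and test_positive is a bool.
    """
    counts = {}
    for level, positive in records:
        pos, total = counts.get(level, (0, 0))
        counts[level] = (pos + int(positive), total + 1)
    return {level: pos / total for level, (pos, total) in sorted(counts.items())}


def is_monotone_increasing(rates):
    """True if positivity never falls as exposure level increases."""
    values = list(rates.values())
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical contact investigation: 20 contacts per exposure level.
records = (
    [(0, True)] * 2 + [(0, False)] * 18
    + [(1, True)] * 6 + [(1, False)] * 14
    + [(2, True)] * 12 + [(2, False)] * 8
)
rates = positivity_by_exposure(records)
print(rates, is_monotone_increasing(rates))
```

Running the same tabulation for each index test on the same contacts would show which test tracks exposure more closely, without requiring a gold standard.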

The results from validation studies can be systematically reviewed and meaningful inferences drawn about the value of tests without the need to use sensitivity and specificity. The apparent attraction of these two simple statistics makes it hard for us to give them up, but when a gold standard does not exist we have no choice but to abandon them. This is one key recommendation that diagnostic research in latent TB infection needs to consider.

Heinke Kunst, Consultant and Honorary Senior Lecturer, Department of Respiratory Medicine, Birmingham Heartlands Hospital and University of Birmingham, UK.

Khalid S. Khan, Professor of Obstetrics-Gynaecology and Clinical Epidemiology, University of Birmingham, UK


1. Menzies D, Pai M, Comstock G. Meta-analysis: New Tests for the Diagnosis of Latent Tuberculosis Infection: Areas of Uncertainty and Recommendations for Research. Ann Intern Med 2007;146:340-54.

2. Rutjes AS, Reitsma JB, Coomarasamy A, Khan KS, Bossuyt P. Evaluation of diagnostic tests when there is no gold standard - a review of methods. University of Birmingham: NHS Research Methodology Programme, 2006. (http://www.pcpoh.bham.ac.uk/publichealth/nccrm/PDFs%20and%20documents/Publications/JH21KK_final_report_July2006.pdf)

3. Ledley RS, Lusted LB. Reasoning foundations of medical diagnosis; symbolic logic, probability, and value theory aid our understanding of how physicians reason. Science 1959;130:9-21.

4. Yerushalmy J. Statistical problems in assessing methods of medical diagnosis, with special reference to X-ray techniques. Public Health Rep 1947;62:1432-49.

5. Zamora J, Abraira V, Muriel A, Khan K, Coomarasamy A. Meta-DiSc: a software for meta-analysis of test accuracy data. BMC Med Res Methodol 2006;6:31.

6. Ransohoff DF, Feinstein AR. Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. N Engl J Med 1978;299:926-30.

7. Dinnes J, Deeks J, Kunst H, Gibson A, Cummins E, Waugh N et al. A systematic review of rapid diagnostic tests for the detection of tuberculosis infection. Health Technol Assess 2007;11:1-196.

Conflict of Interest:

None declared

Interferon-gamma release assays for latent tuberculosis infection: when should they be preferred?
Posted on May 3, 2007
Sergio Carbonara
Clinic of Infectious Diseases, University of Bari, Italy
Conflict of Interest: None Declared

The excellent meta-analysis by Menzies and colleagues concerning the advantages and limits of interferon-γ release assays (IGRAs) versus standard tuberculin skin testing (TST) for detecting latent tuberculosis infection (LTBI) (1) raises several questions regarding the clinical settings in which IGRAs may or should be preferred to TST. In fact, some CDC statements concerning the use of the IGRA "Quantiferon-TB Gold" (QFT-G, Cellestis Limited, Australia), such as "QFT-G usually can be used in place of (and not in addition to) the TST" (2) or "a negative result suggests that infection is unlikely" (3), now appear rather simplistic and should be reconsidered.

While acknowledging the greater specificity of IGRAs (significant only for populations BCG-vaccinated <10 years prior to TST), the meta-analysis highlights the frequent and partly unexplained discordant IGRA/TST results, leading us to question the indifferent or advantageous use of IGRAs rather than TST. In fact, while IGRA+/TST- discordance is uncommon (5.1% of all QFT/TST tests, range 1-11%), TST+/IGRA- discordance is frequent (24%, range 2-86%) (1). The authors state that, although the latter may partly "be explained by superior specificity of IGRAs, it would be overly simplistic to assume that the results of IGRA tests were always correct and that those of TST were always incorrect". This observation is supported by the spontaneous reversion of QFT from positive to negative results (9-28% of cases) (1). Furthermore, a recent study in a large BCG-unvaccinated population (4) found IGRAs less sensitive; TST but not IGRA reactivity was associated with age, while only IGRAs were correlated with recent TB exposure, thus suggesting a lower IGRA sensitivity for older TB infections due to a decline of their reactivity over time.
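To make the discordance figures concrete, the following sketch computes both discordance rates and Cohen's kappa from a paired TST/IGRA table. The cell counts are hypothetical, chosen only so that the two discordance shares resemble the pooled figures quoted above; the denominators used in the review itself may differ, and the function name is illustrative.

```python
def paired_agreement(both_pos, tst_pos_igra_neg, tst_neg_igra_pos, both_neg):
    """Discordance rates and Cohen's kappa for two tests done on the same people."""
    n = both_pos + tst_pos_igra_neg + tst_neg_igra_pos + both_neg
    if n == 0:
        raise ValueError("table is empty")
    observed = (both_pos + both_neg) / n
    # Marginal positivity of each test, used for chance-expected agreement.
    tst_pos = (both_pos + tst_pos_igra_neg) / n
    igra_pos = (both_pos + tst_neg_igra_pos) / n
    expected = tst_pos * igra_pos + (1 - tst_pos) * (1 - igra_pos)
    # Kappa is undefined when chance agreement is already perfect.
    kappa = None if expected == 1 else (observed - expected) / (1 - expected)
    return {
        "tst_pos_igra_neg_rate": tst_pos_igra_neg / n,
        "tst_neg_igra_pos_rate": tst_neg_igra_pos / n,
        "observed_agreement": observed,
        "kappa": kappa,
    }


# Hypothetical paired results for 100 people, for illustration only.
summary = paired_agreement(
    both_pos=30, tst_pos_igra_neg=24, tst_neg_igra_pos=5, both_neg=41
)
print(summary)
```

A summary of this kind describes how often the tests disagree, but it cannot say which test is correct in a discordant pair; that requires follow-up for the development of active TB.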

Therefore, discussion of updated guidelines for a rational use of these assays in different settings is warranted. For instance, IGRAs could be considered as first-choice for contact investigations, when the major aim is to distinguish recently infected cases (at highest risk for active TB in the short term), or in populations with high BCG-vaccination coverage in the previous 10 years. Conversely, TST might still represent the preferred test for routine screening in settings with moderate to high TB risk (5), where a high sensitivity for older TB infections is also required; in these circumstances, IGRAs could be indicated for selected TST-positive subjects, such as those in whom the risk/benefit ratio of LTBI treatment is doubtful (e.g. unclear or close to cutoff TST reaction, recent BCG-vaccination, risks of toxicity, or expected low adherence).

Sergio Carbonara (a), Benedetta Longo (b), Sergio Babudieri (c), Giulio Starnini (d), Roberto Monarca (d), Massimo Andreoni (e), Giovanni Rezza (b)

(a) Clinica Malattie Infettive - Università degli Studi di Bari, Italy (b) Dipartimento Malattie Infettive - Istituto Superiore di Sanità - Roma, Italy (c) Istituto Malattie Infettive - Università degli Studi di Sassari, Italy (d) Divisione Malattie Infettive, Ospedale "Belcolle", Viterbo, Italy (e) Cattedra di Malattie Infettive - Dipartimento di Sanità Pubblica - Università degli Studi di Roma "Tor Vergata", Italy


1. Menzies D, Pai M, Comstock G. Meta-analysis: new tests for the diagnosis of latent tuberculosis infection: areas of uncertainty and recommendations for research. Ann Intern Med. 2007 Mar 6;146(5):340-54. Erratum in: Ann Intern Med. 2007 May 1;146(9):688.

2. Mazurek GH, Jereb J, Lobue P, Iademarco MF, Metchock B, Vernon A; Division of Tuberculosis Elimination, National Center for HIV, STD, and TB Prevention, Centers for Disease Control and Prevention (CDC). Guidelines for using the QuantiFERON-TB Gold test for detecting Mycobacterium tuberculosis infection, United States. MMWR Recomm Rep. 2005 Dec 16;54(RR-15):49-55.

3. Centers for Disease Control and Prevention (CDC). QuantiFERON®-TB Gold Test. 2006. Available online at the website: http://www.cdc.gov/tb/pubs/tbfactsheets/QFT.pdf

4. Arend SM, Thijsen SF, Leyten EM, Bouwman JJ, Franken WP, Koster BF, Cobelens FG, van Houte AJ, Bossink AW. Comparison of two interferon-gamma assays and tuberculin skin test for tracing tuberculosis contacts. Am J Respir Crit Care Med. 2007 Mar 15;175(6):618-27.

5. Carbonara S, Babudieri S, Longo B, Starnini G, Monarca R, Brunetti B, Andreoni M, Pastore G, De Marco V, Rezza G; GLIP (Gruppo di Lavoro Infettivologi Penitenziari). Correlates of Mycobacterium tuberculosis infection in a prison population. Eur Respir J. 2005 Jun;25(6):1070-6.

Conflict of Interest:

None declared

In Response
Posted on May 11, 2007
Dick Menzies
Montreal Chest Institute, McGill University
Conflict of Interest: None Declared

May 10, 2007

To the Editor, Annals of Internal Medicine,

In response to Drs. Kunst and Khan, we agree that the absence of a proper gold standard is a fundamental problem of all cross-sectional studies of diagnostic tests for latent TB infection. This problem is explicitly stated on several occasions in our paper (1). We recommended that longitudinal studies following cohorts of persons with positive or negative test results would be most valuable, as the later development of active TB is the only certain indicator of the presence of latent TB infection. Since treatment reduces incidence of disease, such cohorts of individuals would ideally be untreated, but this would pose serious ethical issues. However, as we have pointed out elsewhere (2), individuals with discordant test results could be left untreated, as there is equipoise regarding their management, and prognosis of discordant results is the most critical issue for understanding the predictive value of interferon-γ release assays (IGRA). We are aware of several prospective studies being conducted in different settings at the present time (3;4); we await these results with interest.

In regard to gradients of exposure, we did review all available studies (see Table 3 (1)). However, the methods of these studies in terms of measurement and categorization of exposure, and disease in the source cases, were too heterogeneous to allow their integration for a proper meta-analysis.

Systematic reviews and meta-analyses must take advantage of published literature to be informative. Hence, we assessed sensitivity and specificity, but pointed out clearly in the paper that these were surrogate measures with significant limitations. If we had only included published studies with the correct gold standard for LTBI (as above), there would have been no papers on which to base our estimates. There is a considerable amount of published literature currently available that has compared two or even all three of the currently available tests for LTBI. Both IGRAs reviewed are currently licensed in many countries and actively marketed in North America and Europe. Hence, their relative performance in different patient populations and clinical situations is of considerable interest. We feel strongly that to ignore this large body of information, because of certain limitations, would do a disservice to public health and clinical practitioners who are faced with making choices, and managing patients now.

Yours very truly,

Dr. Dick Menzies, Dr. Madhukar Pai, Montreal Chest Institute, McGill University

Reference List

(1) Menzies D, Pai M, Comstock GW. New tests for diagnosis of latent tuberculosis infection - areas of uncertainty and recommendations for research. Ann Intern Med. 2007;146:340-354.

(2) Pai M, Menzies D. The New IGRA and the Old TST. American Journal of Respiratory and Critical Care Medicine. 2007;175:529-30.

(3) Andersen P, Doherty TM, Pai M, Weldingh K. The prognosis of latent tuberculosis: can disease be predicted? Trends Microbiol. 2007;13:175-82.

(4) Pai M, Dheda K, Cunningham J, Scano F, O'Brien R. T Cell Assays for the Diagnosis of Latent Tuberculosis Infection: Moving the Research Agenda Forward. Lancet Infect Dis. Published online. April 17, 2007.

Conflict of Interest:

None declared

