Assessment of Thiopurine S-Methyltransferase Activity in Patients Prescribed Thiopurines: A Systematic Review

Background: The evidence for testing thiopurine S-methyltransferase (TPMT) enzymatic activity or genotype before starting therapy with thiopurine-based drugs is unclear. Estimates of the sensitivity of genotyping are imprecise. Evidence confirms the known associations of leukopenia or myelotoxicity with reduced TPMT activity or variant genotype.

Thiopurine S-methyltransferase (TPMT) is an enzyme that catalyzes S-methylation and inactivation of the thiopurine-based drugs azathioprine and 6-mercaptopurine, which are commonly used as steroid-sparing agents in chronic autoimmune inflammatory conditions. These drugs are effective in inducing remission in 50% to 60% of patients with inflammatory bowel disease and permit steroid reduction or withdrawal in up to 65% of patients (1). Several studies (2,3) have highlighted the importance of TPMT in thiopurine drug metabolism, because reduced or absent TPMT activity may place patients at increased risk for drug-related toxicity. The various adverse effects include myelosuppression, hepatotoxicity, pancreatitis, and flu-like symptoms. Severe myelosuppression, one of the most serious dose-dependent reactions, is believed to be caused by 6-thioguanine nucleotides, the active metabolites (Figure 1) (4).
Four percent to 11% of persons are heterozygous for a variant TPMT allele and have intermediate enzymatic activity, whereas approximately 0.3% are homozygous and have very low or absent enzymatic activity (5-7). The prevalence of variant TPMT alleles varies considerably among ethnic groups (Appendix Table 1, available at www.annals.org). The 4 most common alleles seen in white, Asian, and African persons (TPMT*2, TPMT*3A, TPMT*3B, and TPMT*3C) are found in approximately 80% to 95% of persons with decreased TPMT activity (5, 8-12). The presence of a variant allele (or decreased TPMT activity) may increase the risk for thiopurine-related toxicity, thus increasing the risk for an adverse drug reaction in patients with such alleles who are prescribed thiopurine-based drugs. However, TPMT status does not predict all thiopurine-related adverse events; up to 70% of patients with adverse events have normal TPMT activity, and such factors as viral infections, concomitant drug therapy, and other disturbances in the thiopurine metabolic pathway probably also have a role (13).
Various clinical guidelines suggest measuring TPMT enzymatic activity or screening for common variant TPMT alleles before initiating thiopurine therapy. The drug monograph for azathioprine approved by the U.S. Food and Drug Administration (FDA) also recommends pretesting but does not mandate it. The evidence base for this recommendation is unclear, particularly the crucial evidence that pretherapy TPMT testing decreases myelotoxicity-specific mortality (14,15). In addition, regular performance of complete blood count is recommended and routinely practiced for the duration of therapy (16). Measurement of TPMT activity may not be of additional benefit, because regular monitoring may be sufficient to identify adverse events.
The status of a patient's TPMT enzymatic activity can be assessed directly, by activity testing (phenotyping), or indirectly, by genotyping for variant alleles. No recommendation exists for which testing method should be used. However, enzymatic analysis should identify most patients at risk, with the exception of those with a recent blood transfusion (17). Whether genotyping is sufficiently sensitive for routine use in clinical practice is also unclear, because most laboratories identify only the most common variant alleles and would miss rare variants.
In light of these uncertainties, we systematically reviewed evidence addressing several key questions related to TPMT testing before initiating thiopurine therapy in chronic inflammatory disease populations. This research topic was nominated by the American Association for Clinical Chemistry and commissioned by the Agency for Healthcare Research and Quality. We examined the evidence on the sensitivity and specificity of TPMT genotyping as a replacement for assessment of TPMT activity; investigated whether previous assessment of TPMT status (by genotyping or phenotyping) compared with no pretesting to guide thiopurine therapy leads to changes in management and reduction in harms; and sought evidence of an association between TPMT status and thiopurine toxicity. This report summarizes our findings.

Context
Some experts recommend routine measurement of thiopurine S-methyltransferase (TPMT) enzymatic activity before initiating thiopurine therapy.

Contribution
This systematic review found insufficient evidence that TPMT pretesting guides appropriate prescribing or improves patient outcomes (such as by limiting toxicity) compared with routine blood count monitoring in patients receiving thiopurine therapy. Limited-quality evidence suggested imperfect and imprecise sensitivity but near-perfect specificity of genotyping for identifying patients with low and intermediate TPMT enzymatic activity.

Implication
We urgently need more high-quality evidence regarding the utility and costs of routine TPMT pretesting.

METHODS
We followed a prespecified and peer-reviewed study protocol. The full evidence report, including search strategies and a detailed list of a priori outcomes, risk for bias assessment, and detailed evidence tables, is available at www.ahrq.gov/clinic/epcindex.htm.

Data Sources and Searches
Using a peer-reviewed strategy, we searched MEDLINE (1950 to week 3 of December 2010), the Cochrane Library (fourth quarter of 2010), EMBASE (1980 to week 52 of 2010), Ovid HealthSTAR (1966 to December 2010), and BIOSIS and Genetics Abstracts (to May 2009), with no language restrictions. We also searched for unpublished studies.

Study Selection
To assess the effectiveness of TPMT testing before thiopurine therapy, at least 1 study group had to have had thiopurine dose adjustment or drug replacement guided by pretherapy TPMT genotyping or phenotyping. To determine the association between TPMT status and drug toxicity, thiopurine therapy should not have been guided by the results of TPMT testing. All study designs were included except for effectiveness of pretesting, for which we limited eligibility to experimental, cohort, and case-control studies. Non-English language records, editorials, reviews, commentaries, letters, and news or case reports were excluded.
One reviewer screened titles and abstracts for potential relevance, and a second reviewer verified exclusions at this level. Two independent reviewers assessed the full publication of potentially relevant studies, and discrepancies were resolved by consensus.

Data Extraction and Quality Assessment
Data were extracted by using standardized forms and subsequently verified. Patients who tested negative for any of the single-nucleotide polymorphisms were considered noncarriers (wild-type homozygous), whereas carriers were either heterozygous or homozygous for a variant allele. Homozygous carriers (those with one of the same variant alleles on each of the paired chromosomes) were not differentiated from compound heterozygous carriers (those with different variant alleles on each of the paired chromosomes); both states were considered to be homozygous. Patients with normal and high TPMT activity were also grouped together; most studies used these terms interchangeably.
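The grouping rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the extraction form used in the review; the function names, the `CarrierState` type, and the use of `*1` as the wild-type label are assumptions made for the example.

```python
from enum import Enum


class CarrierState(Enum):
    NONCARRIER = "noncarrier"      # wild-type homozygous
    HETEROZYGOUS = "heterozygous"  # exactly one variant allele
    HOMOZYGOUS = "homozygous"      # two variant alleles, identical or not


# Assumption: "*1" denotes the wild-type allele in the input data.
WILD_TYPE = "*1"


def classify_genotype(allele_a: str, allele_b: str) -> CarrierState:
    """Map a pair of TPMT alleles to the carrier state used for pooling."""
    variant_count = sum(allele != WILD_TYPE for allele in (allele_a, allele_b))
    if variant_count == 0:
        return CarrierState.NONCARRIER
    if variant_count == 1:
        return CarrierState.HETEROZYGOUS
    # Compound heterozygotes (e.g. *3B/*3C) are grouped with true homozygotes.
    return CarrierState.HOMOZYGOUS


def classify_activity(reported: str) -> str:
    """Collapse reported activity labels; normal and high are grouped together."""
    groups = {
        "high": "high or normal",
        "normal": "high or normal",
        "intermediate": "intermediate",
        "low": "low or absent",
        "absent": "low or absent",
    }
    return groups[reported.strip().lower()]
```

For example, `classify_genotype("*3B", "*3C")` returns `CarrierState.HOMOZYGOUS`, mirroring the decision to treat compound heterozygous carriers as homozygous.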
Two reviewers assessed the risk for bias and rated the strength of the evidence by consensus. For studies of test performance, a modified QUADAS (Quality Assessment of Diagnostic Accuracy Studies) tool with additional enquiries about Hardy-Weinberg equilibrium was used (18). For other studies, risk for bias was evaluated by using generic items that assessed selection, performance, detection, and attrition bias, as well as confounding and potential for financial conflict of interest. Each study was given an overall risk-for-bias assessment of good (low risk), fair, or poor (high risk).
We rated the strength of the evidence for the outcomes of death, serious adverse events, myelotoxicity, and health-related quality of life across the domains of risk for bias, consistency, directness, and precision, as per published guidance (19). Other outcomes of interest were patients who required thiopurine dose reduction or a switch to nonthiopurine therapy, number of monitoring tests, infection, hospitalization, withdrawal due to adverse events, leukopenia, neutropenia, thrombocytopenia, anemia, hepatotoxicity, and pancreatitis.

Data Synthesis and Analysis
We assessed performance characteristics of genotyping compared with phenotyping and estimated test sensitivity and specificity. When meta-analysis was considered inappropriate, evidence was synthesized qualitatively. Heterogeneity in investigator categorizations of enzymatic activity was not explored, because the numerical values generated by the various methods (even by similar methods in different laboratories) are often not comparable. We used a codominant model to pool data associated with noncarrier, heterozygous carrier, and homozygous carrier states when estimating the magnitude of association between genotype and thiopurine toxicity. Similarly, 3 categories of enzymatic activities were defined (high or normal, intermediate, and low or absent). In separate analyses, we compared toxicity rates in each genotypic state and TPMT category with the others.
Our primary analyses of genotype-toxicity association pooled studies that tested at least TPMT*2, TPMT*3A, TPMT*3B, and TPMT*3C, regardless of whether they tested additional variants. In contrast, for primary analyses of test performance, we pooled studies that genotyped identical sets of TPMT mutations. This different meta-analytic approach between the genotype-toxicity association and the test performance studies was based on our understanding that estimates of strength of association between TPMT genotype or phenotype and thiopurine drug toxicity are largely indirect and hypothesis-generating, whereas estimates of sensitivity and specificity of TPMT genotyping could directly affect management decisions involving thiopurine dosing. This necessitated estimating test performance specifically for the particular set of alleles genotyped. For both the performance of genotyping and the genotypic association with thiopurine toxicity, we considered additional meta-analyses by pooling studies as long as they genotyped for all ethnicity-specific mutations with a known prevalence greater than 1%.
When appropriate, sensitivity and specificity estimates were pooled by first transforming proportions into the Freeman-Tukey variant of the arcsine square root transformed proportion (20). The pooled proportion was calculated as the back-transformation of the weighted mean of the transformed proportions. Data were pooled by using a fixed-effects inverse variance weighted average. Odds ratios (ORs) were pooled by using the fixed-effects Mantel-Haenszel method without continuity correction (21).
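A minimal Python sketch of the two pooling steps described above follows. It assumes the Freeman-Tukey double arcsine form, t = arcsin√(x/(n+1)) + arcsin√((x+1)/(n+1)) with variance 1/(n+0.5), and the simple back-transformation sin²(t/2); the exact formulas used by the software are not given in the text, so these are assumptions, as are the function names and the input layout. The Mantel-Haenszel function returns the point estimate only.

```python
import math


def freeman_tukey(events: int, total: int) -> tuple[float, float]:
    """Return the transformed proportion and its approximate variance."""
    t = (math.asin(math.sqrt(events / (total + 1)))
         + math.asin(math.sqrt((events + 1) / (total + 1))))
    return t, 1.0 / (total + 0.5)


def pool_proportions(studies, z: float = 1.96):
    """Fixed-effects inverse-variance pooling of (events, total) pairs.

    Returns (pooled proportion, lower limit, upper limit).
    """
    transformed = [freeman_tukey(x, n) for x, n in studies]
    weights = [1.0 / var for _, var in transformed]
    total_weight = sum(weights)
    t_bar = sum(w * t for w, (t, _) in zip(weights, transformed)) / total_weight
    se = math.sqrt(1.0 / total_weight)

    def back(t: float) -> float:
        # Back-transform, clamping to the valid range [0, pi].
        return math.sin(min(max(t, 0.0), math.pi) / 2.0) ** 2

    return back(t_bar), back(t_bar - z * se), back(t_bar + z * se)


def mantel_haenszel_or(tables) -> float:
    """Pooled odds ratio from 2x2 tables given as (a, b, c, d).

    a, b = events and non-events in the exposed group;
    c, d = events and non-events in the comparison group.
    No continuity correction is applied.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator
```

With hypothetical inputs such as `pool_proportions([(8, 10), (15, 20), (30, 40)])`, larger studies dominate the weighted mean because each weight is n + 0.5.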
Pooled estimates of sensitivity and ORs and their 95% CIs were calculated by using StatsDirect, version 2.7.8 (StatsDirect, Cheshire, United Kingdom). We tested for statistical heterogeneity (but not for sensitivity and specificity meta-analyses) by using the Cochran Q test, to be reported when substantial (P for chi-square test of heterogeneity <0.10 and I² ≥50%).
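The heterogeneity rule can be illustrated with a short, standard-library Python sketch. It assumes the usual definitions, Q = Σ wᵢ(yᵢ − ȳ)² with inverse-variance weights and I² = max(0, (Q − df)/Q) × 100, applied to log odds ratios; the function names and the input format are hypothetical, and the chi-square tail probability is computed from a textbook recurrence rather than from the software named above.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from Q(1/2, y) for odd df or Q(1, y) for even df,
    # then apply Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def heterogeneity(effects, variances):
    """Return (Q, P value, I-squared in percent) for study-level effects."""
    if len(effects) < 2:
        raise ValueError("at least two studies are required")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, chi2_sf(q, df), i_squared


def is_substantial(p_value: float, i_squared: float) -> bool:
    """Reporting threshold stated in the text: P < 0.10 and I-squared >= 50%."""
    return p_value < 0.10 and i_squared >= 50.0
```

Both conditions must hold before heterogeneity is flagged, so a large I² from very few small studies would not be reported unless the Q test also reached P < 0.10.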

Role of the Funding Source
The Agency for Healthcare Research and Quality supported this study but had no role in formulating study questions, conducting the systematic review, or approving the manuscript for submission and publication.

RESULTS
We screened 1890 records and included 118 unique studies in the full report (Figure 2). Of these, 55 (22-76) addressed the 3 key questions. Most studies (>75%) were rated as fair, whereas a substantial proportion of the studies of test performance (37%) were of poor design. Three studies were restricted to pediatric patients (38,58,67).

Test Performance of TPMT Genotyping
Nineteen studies, mostly of cross-sectional and prospective observational design, contributed evidence. Most of these were not designed to assess test performance but provided parallel genotypic and phenotypic data, a limitation that was captured in the quality assessment of the studies. Approximately 70% of studies included patients with inflammatory bowel disease. Among 1735 total patients, 184 were heterozygous and 16 were homozygous for variant alleles. In most studies, all or most study patients were white, except for 2 studies (43, 51) that were restricted to Japanese and South Asian patients. In the 11 studies that reported concomitant medications, 5-aminosalicylic acid and steroids were commonly used. One study (45) reported recent transfusion as an exclusion criterion. A common study limitation was lack of clarity about whether the determination of either enzymatic activity level or patient genotype was influenced by previous knowledge of the other value.
The sensitivity of the carrier genotype (heterozygous or homozygous) for correctly identifying patients with subnormal (intermediate or low) enzymatic activity was imprecise, ranging from a pooled estimate of 70.33% to 86.15% (lower-bound 95% CI, 54.52% to 70.88%; upper-bound CI, 78.50% to 96.33%) across the subgroups of alleles tested (Figure 3). Small, single studies of poor to fair quality provided limited evidence for 3 of the subgroups. Meta-analysis of the 19 studies that genotyped all ethnicity-specific mutations with a known prevalence greater than 1% yielded a pooled sensitivity of 79.90% (CI, 74.81% to 84.55%).
The sensitivity of homozygous genotype for correctly identifying patients with low enzymatic activity is based on sparse data, which precluded meta-analysis. Twenty patients with low to absent enzymatic activity were reported across 8 of 19 studies, all of which demonstrated imprecise estimates (Appendix Figure 1, available at www.annals.org). Of the 20 patients, 16 (reported in 6 studies) were homozygotes. Only 4 of 1715 heterozygotes or noncarriers across all 19 studies (0.23%) had low to absent enzymatic activity.
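To make the imprecision concrete, the sketch below computes a Wilson score interval for the crude tally of 16 homozygotes among 20 patients with low to absent activity. This is an illustration only: the review did not pool these data, the Wilson method and the function name are assumptions chosen for the example, and a crude tally ignores between-study differences.

```python
import math


def wilson_interval(events: int, total: int, z: float = 1.96):
    """Return (proportion, lower limit, upper limit) using the Wilson score method."""
    p = events / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / total
                               + z * z / (4.0 * total * total)) / denom
    return p, centre - half_width, centre + half_width


# Crude tally from the text: 16 of 20 patients with low to absent activity
# were homozygotes.
sensitivity, lower, upper = wilson_interval(16, 20)
print(f"{sensitivity:.0%} ({lower:.0%} to {upper:.0%})")

# Share of heterozygotes or noncarriers with low to absent activity.
print(f"{4 / 1715:.2%}")
```

The interval spans roughly 58% to 92%, which shows why 20 patients cannot support a precise sensitivity estimate.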
The specificity of noncarrier (or wild-type) genotype for correctly identifying patients with normal or high activity, and that of noncarrier or heterozygous status for identifying those with intermediate to high activity, approached 100% across all subsets of alleles genotyped. Data were insufficient to determine the optimum combination of TPMT alleles for testing.

Patient Management and Harm Reduction Guided by TPMT Status
One randomized, controlled trial and 1 poor-quality retrospective cohort study (34,75) contributed evidence. The randomized trial assigned patients with chronic inflammatory conditions to one group that received TPMT genotyping before they received azathioprine and another that was genotyped after 4 months of azathioprine therapy. Both groups received routine monitoring and other management, and dosing was left to the physicians' discretion. In total, 298 were noncarriers, 34 were heterozygotes, and 1 was a homozygote. The trial was terminated early because of recruitment problems; physicians were not prepared to randomly assign patients to receive azathioprine without previous genotyping. The trial was therefore underpowered to detect clinical events and drug dosing (Appendix Table 2, available at www.annals.org). Because of the small number of events, no major differences were noted in the outcomes of neutropenia and pancreatitis, whereas significantly higher odds were observed for hepatitis in the group randomly assigned to receive TPMT genotyping before therapy (OR, 2.54 [CI, 1.08 to 5.97]; 19 of 167 patients vs. 8 of 166 patients) (34,75). The observational study (34) was also underpowered to demonstrate significant differences for leukopenia and hepatotoxicity.
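As a worked check on how an odds ratio and its 95% CI follow from a 2x2 table, the Python sketch below uses the standard Wald (log-scale) interval. The method and the function name are assumptions for illustration; the text does not state how the trial-level interval was computed. Counts of 19 of 167 versus 8 of 166 reproduce the reported hepatitis estimate.

```python
import math


def odds_ratio_wald(events_a: int, total_a: int,
                    events_b: int, total_b: int, z: float = 1.96):
    """Return (odds ratio, lower limit, upper limit) for group A versus group B."""
    a, b = events_a, total_a - events_a   # events and non-events, group A
    c, d = events_b, total_b - events_b   # events and non-events, group B
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hepatitis: genotyping before therapy (19 of 167) vs. after 4 months (8 of 166).
or_hep, lo_hep, hi_hep = odds_ratio_wald(19, 167, 8, 166)
print(f"OR {or_hep:.2f} (CI, {lo_hep:.2f} to {hi_hep:.2f})")
```

The lower limit sits just above 1, so the result is sensitive to a change of only a few events in either group.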

Association Between TPMT Status and Thiopurine Toxicity

Enzymatic Activity
Of the 17 eligible studies, 24% (most of which had observational designs) were rated as poor and the rest were judged as fair. Comparability of prognostic factors across groups, double-blinded outcomes assessment, and genotype or phenotype determination could not be clearly established in more than 90% of studies.
were of fair quality and included patients with inflammatory bowel disease. Comparability of prognostic factors across groups, double-blinded outcomes assessment, and genotype or phenotype determination could not be clearly established in more than 85% of studies. Compared with noncarriers, heterozygotes had pooled odds of 4.29 (CI, 2.67 to 6.89) for leukopenia (Appendix Figure 2, available at www.annals.org). Meta-analysis of 5 studies that compared 7 homozygotes with 475 noncarriers demonstrated greater but very imprecise odds of leukopenia (OR, 20.84 [CI, 3.42 to 126.89]). For all other outcomes, evidence either was absent or lacked the power to demonstrate significant differences between heterozygous and homozygous carriers compared with noncarriers or between themselves. Broadening the meta-analyses to studies that genotyped all ethnicity-specific mutations with a known prevalence greater than 1% did not improve the precision of the estimates. Of note, withdrawals due to adverse events were

DISCUSSION
To our knowledge, ours is the first comprehensive systematic review to investigate the ability of TPMT genotyping to correctly identify TPMT activity status, and the utility of determining TPMT status of patients by genotyping or phenotyping before initiating thiopurine therapy. We also investigated indirect evidence linking TPMT status with thiopurine toxicity.
Little good-quality primary research addresses these questions. Limited-quality evidence indicates that the estimates of genotyping sensitivity are imprecise, despite near-perfect specificity, for identifying subnormal enzymatic activities. Evidence is currently insufficient to address the utility of TPMT testing before initiating thiopurine therapy compared with routine blood count monitoring. Whether pretesting guides appropriate prescribing is also unclear. Indirect evidence confirms previously known strong associations between thiopurine-related leukopenia and either low levels of TPMT enzymatic activity or the presence of TPMT allelic polymorphisms (77). This was reflected in significant associations between low levels of enzymatic activity and myelotoxicity.
High concordance between TPMT genotype and enzymatic activity (phenotype) has been reported in healthy populations, leading to frequent replacement of TPMT phenotyping with genotyping (78). Because TPMT enzymatic activity (absent or low, intermediate, or normal or high) actually defines the TPMT status that guides thiopurine dosing, comparing the test performance of genotyping with phenotyping in diseased populations was considered important. Estimates of the sensitivity of genotyping to identify either low or low-to-intermediate enzymatic activity were generally imprecise and lower than specificity estimates (which approached 100%). When we broadened the meta-analysis to pool sensitivity data across the 19 studies that genotyped all ethnicity-specific mutations with a known prevalence greater than 1%, the pooled sensitivity of the carrier genotype (heterozygous or homozygous) to correctly identify patients with subnormal (intermediate or low) enzymatic activity was approximately 80% (CI, 75% to 85%). This range was derived by using a fixed-effects meta-analytic model that does not account for between-study heterogeneity. Thus, the range may be considered no more than a signal or possible approximation of the true sensitivity, which could not be precisely established from the available literature.
More than one third of the body of evidence from which these sensitivity estimates originate is of poor quality. Although some studies in healthy populations have reported very high concordance between phenotyping and TPMT genotyping, others have demonstrated less-perfect results (78,79). Nevertheless, medical test performance should be evaluated in an appropriate spectrum of patients: those who will probably have medical testing (18). Thus, concordance estimates in healthy populations are not applicable to patients with chronic inflammatory diseases, who systematically differ from healthy populations in demographic features, comorbid conditions, and concomitant medications.
Ford and colleagues (80) analyzed 14 832 patient samples by TPMT phenotyping and 1769 by genotyping over 1 year during routine testing. The monthly mean concordance between low TPMT activity and a variant heterozygous genotype ranged from 67% to 90%; however, genotyping correctly identified 41 of 44 patients with deficient TPMT activity. This is expected, because TPMT genotyping generally targets only the common polymorphisms and would not identify previously unidentified or rare mutations. In addition, the commonly used genotype tests, although able to identify specific polymorphisms, cannot determine the allelic location. For example, a patient typed as heterozygous for TPMT*3A (wild-type/*3A) may have been misdiagnosed as such and could actually be a compound heterozygote carrying TPMT*3B/*3C, with corresponding low-to-absent TPMT activity (8,82).
Our finding that TPMT testing before initiating thiopurine therapy is of indeterminate utility seems at odds with previously published economic evaluations that recommend such testing. However, those evaluations have been criticized for incorporating clinical data from retrospective studies and expert opinion instead of prospective empirical evidence; the latter, as our review shows, is lacking (83). Approximately 4% to 11% of patients are heterozygous, whereas approximately 0.3% are homozygous (5,6,8). Available evidence of associations with thiopurine toxicity was limited by few heterozygotes (or those with intermediate activity), occasional homozygotes (or those with low or absent activity), and low event rates among the study populations. The findings therefore lacked power to rule in or rule out significant associations between TPMT status and most outcomes of thiopurine toxicity. Our review was restricted to English-language literature; however, it is unclear how much this restriction might have contributed to the observed scarcity of evidence.
Higgs and colleagues' recent systematic review (79) aimed to quantify the associations between leukopenia and intermediate TPMT activity or heterozygous genotype, compared with normal activity or noncarrier genotype. The authors pooled both the TPMT activity and genotype data. No disease restrictions were used; thus, patient populations were broadened to include recipients of transplanted organs and patients with cancer. The pooled odds ratio for leukopenia was 4.19 (CI, 3.20 to 5.48), almost identical to our meta-analytic estimate of 4.29 (CI, 2.67 to 6.89) when heterozygotes were compared with noncarriers.
Higgs and colleagues wisely questioned the importance of modest decreases in leukocyte counts and argued that modest leukopenia may reflect effective treatment with thiopurine-based drugs rather than the undesired adverse event of myelosuppression.
Various recent guidelines, as well as the FDA-approved product monograph for azathioprine, have advocated determining TPMT status before initiating treatment with thiopurines (14,84). The proposition that knowledge of TPMT status before therapy would lead to decreased rates of dose-dependent toxicity is rational and based on evidence of strong genotypic and phenotypic associations in observational studies. However, from an evidence-based perspective, guideline recommendations of pretreatment TPMT testing are premature for several reasons. First, the direct evidence base for these recommendations is lacking, especially the crucial evidence that TPMT testing before thiopurine therapy decreases myelotoxicity-specific mortality. Second, patients who receive thiopurine-based drugs must have complete blood counts measured on a regular basis to prevent severe myelotoxicity by early detection. Third, azathioprine and 6-mercaptopurine had been used successfully for several years before TPMT testing was available, and present management (testing or no testing before therapy) varies across clinical specialties. Fourth, thiopurine-related toxicities are also partially explained by mutations in other enzymes, drug interactions, concurrent infections, and immune-mediated drug reactions. Fifth, the extremely low prevalence of homozygotes means that available studies are severely underpowered to provide direct evidence of the effectiveness of pretesting in the subpopulation believed to be most at risk (5,6,8). Finally, the use of TPMT status to guide treatment has the potential to reduce the efficacy of thiopurine drugs if physicians are overzealous in reducing thiopurine dosages. The 2004 guidelines from the British Society of Gastroenterology (85)