Diagnosis of Gout: A Systematic Review in Support of an American College of Physicians Clinical Practice Guideline

Gout is the most common form of inflammatory arthritis (1). In its initial stages, gout is recognized by acute, intermittent episodes of synovitis presenting with joint swelling and pain (referred to as acute gouty arthritis, acute gout attacks, or acute gout flares) that may progress to chronic and persistent symptoms. Gout is the result of excess serum urate crystallizing in the body (as monosodium urate [MSU]). The time between flares is referred to as the intercritical period. Intermittent attacks may progress to more chronic symptoms as the result of either joint damage or chronic synovitis. In some persons, MSU may aggregate in intra- or extra-articular regions (for example, around tendons, in bone or bursa, or in other soft tissues) to form tophi. Monosodium urate crystals may directly stimulate the inflammasome in leukocytes, causing an acute inflammatory attack (2).
Although the presence of MSU crystals in synovial fluid aspirated from the affected joint is sufficient for diagnosing gout, clinicians and researchers debate whether this approach is necessary. Classification and diagnostic algorithms that rely on other signs, symptoms, and laboratory tests without synovial fluid analysis exist. Joint aspiration may be technically difficult to perform and painful for the patient, particularly in smaller joints. Specimen handling and interpretation of synovial fluid analysis may be affected by experience and training. Moreover, the accuracy and utility of MSU assessment are affected by several factors (3-10). This review, conducted to support an American College of Physicians (ACP) clinical practice guideline, addresses the accuracy and safety of using clinical diagnostic or classification algorithms, dual-energy computed tomography (DECT), and ultrasonography for evaluating patients with gout symptoms compared with assessing MSU crystals in joint aspirate, which is considered the gold-standard diagnostic test.

METHODS
We developed a protocol, followed PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) guidelines, and detailed our search and selection processes, inclusion criteria, and evidence tables in an evidence report (11-13).

Key Questions
Key questions posed by ACP representatives were revised on the basis of input from a group of key informants, a technical expert panel, and public comments. Questions addressed in this article are as follows: What is the accuracy of clinical signs and symptoms and other diagnostic tests (such as serum urate analysis, ultrasonography, computed tomography, DECT, and plain radiography), alone or in combination, compared with synovial fluid analysis in diagnosing acute gouty arthritis in patients with no previous gout diagnosis? What are the adverse effects (including pain, infection at the aspiration site, and radiation exposure) or harms (related to false-positive, false-negative, or indeterminate results) associated with tests used to diagnose gout? Additional questions and evidence regarding the effects of practitioner type and other factors on successful joint aspiration and accuracy of interpretation are available in the full evidence report (12,13).

Data Sources and Searches
We searched, without language restrictions, PubMed, EMBASE, the Cochrane Library, gray literature, and the Web of Science from inception through 29 February 2016 using the word gout combined with the terms for diagnostic methods (MSU crystal analysis, joint aspiration, DECT, ultrasound, and x-ray), clinical signs and symptoms, and outcome measures, without filters specific for the diagnostic tests, as recommended (14). Supplement Table 1 (available at www.annals.org) shows the search methodology. We also obtained relevant references from a search conducted for a simultaneous review on gout management, considered studies suggested by experts, searched ClinicalTrials.gov and the Web of Science for recently completed studies and unpublished or non-peer-reviewed study findings, and contacted manufacturers of equipment and laboratory test kits used to diagnose gout for unpublished data specific to their use for gout diagnosis.

Study Selection
Titles and abstracts identified by the literature searches were screened by 2 reviewers, who independently conducted a full-text review of all selections to exclude articles that reported only on the incidence or prevalence, risk factors, or treatment of gout; included persons younger than 18 years; provided no usable data (sensitivities and specificities or data that could be used to calculate them); reported the same data as another article; enrolled only participants with established gout diagnoses; or did not clearly indicate the use of a recognized diagnostic standard. If necessary, disagreements regarding inclusion at the full-text stage were reconciled with the project leader's input. We included original prospective and cross-sectional studies that assessed the accuracy (sensitivity and specificity) or safety of tests used to diagnose gout in persons with no prior definitive gout diagnosis who presented with joint inflammation, and in which the reference standard was MSU analysis or a combination of MSU analysis, American Rheumatism Association (ARA) (now the American College of Rheumatology [ACR]) criteria for gout diagnosis, and tests to confirm or rule out other causes of inflammatory arthritis. Studies that enrolled patients with asymptomatic hyperuricemia were excluded. To assess safety, we also included case reports and case series.

Data Extraction and Quality Assessment
Two reviewers abstracted study-level details from articles accepted for inclusion. Outcomes (sensitivity, specificity, and positive and negative predictive value [PPV and NPV]) were singly abstracted and verified by another reviewer. Risk of bias (study quality) of each included study was assessed independently by 2 reviewers using the QUADAS-2 (Revised Quality Assessment of Diagnostic Accuracy Studies) tool (15, 16). Supplement Table 2 (available at www.annals.org) includes the results of the quality assessment. Disagreements regarding study details were reconciled by group discussion, and those related to study quality were mediated by the project leader.
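For readers less familiar with the abstracted outcomes, the following minimal sketch shows how sensitivity, specificity, PPV, and NPV follow from a 2 x 2 comparison of an index test with the reference standard. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing an index test with the reference standard."""

    tp: int  # index test positive, MSU crystals present
    fp: int  # index test positive, MSU crystals absent
    fn: int  # index test negative, MSU crystals present
    tn: int  # index test negative, MSU crystals absent

    def sensitivity(self) -> float:
        # Proportion of reference-positive patients the index test detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-negative patients the index test clears.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of index-positive patients who are reference-positive.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of index-negative patients who are reference-negative.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only.
example = TwoByTwo(tp=80, fp=10, fn=20, tn=90)
print(f"sensitivity={example.sensitivity():.2f}, specificity={example.specificity():.2f}")
print(f"PPV={example.ppv():.2f}, NPV={example.npv():.2f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common gout is in the enrolled population, which is one reason values from referral clinics may not transfer directly to primary care.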

Data Synthesis and Analysis
We organized our narrative descriptions of evidence, which focused on study quality, settings, and findings, according to categories of tests, as well as chronologically. If several studies compared similar tests with the same reference standard, we used bivariate metaregression to pool studies (17). As a group, the reviewers assessed the overall strength of evidence (SOE) for each major comparison and outcome as high, moderate, low, or insufficient using guidance suggested by the Effective Health Care Program (18).

Role of the Funding Source
This topic was nominated by the ACP to the Agency for Healthcare Research and Quality (AHRQ) and funded by AHRQ. Staff at AHRQ and ACP helped to develop and refine the scope of the study and reviewed the draft report but had no role in data extraction, synthesis, or rating of evidence.

RESULTS
The Figure depicts the search and selection process that identified 22 total articles addressing the accuracy (n = 21) or safety (n = 3) of various diagnostic methods. Characteristics and findings of the studies are detailed in Supplement Table 3 (available at www.annals.org).

Figure. Literature flow diagram.

Validity of Clinical Classification and Diagnostic Criteria
Twelve studies examined diagnostic or classification algorithms for diagnosing gout (19-30). Of these 12 studies, 11 compared the predictions from 10 clinical algorithms (described in Table 1) with assessment of synovial fluid MSU crystals in all enrolled patients (19-23, 25-30). Study quality was good for all but 2 studies (19, 23). All studies were conducted in academic rheumatology departments, although several purposely enrolled patients who were referred by primary care physicians. The ARA's algorithm included joint fluid culture to rule out the co-occurrence of septic arthritis (19). Sensitivities and specificities for the clinical algorithms varied (compared with MSU analysis as the reference standard) (Table 2).

Table 1. Summary of Components of Clinical Algorithms for Diagnosing Gout
Table 2. Summary of Findings

Rome and New York Criteria
Three studies found that the Rome and New York criteria, the first sets of clinical criteria developed to classify gout, had limited sensitivity (70% to 80%) and specificity (79% to 80%) for diagnosing gout compared with identifying MSU crystals in affected joints (20, 27, 30).

ARA Criteria
In 1977, in a high-quality study using clinical characteristics of 706 patients seen in 38 rheumatology clinics across the United States, an ARA subcommittee developed an algorithm to classify gout for research purposes (19). The final algorithm included 12 clinical signs (not including MSU analysis or presence of tophi, which were considered diagnostic by themselves), with the presence of 6 or more of these signs required to classify a patient as having gout. The initial assessment reported sensitivities of 96%, 88%, and 74%, and specificities of 73% to 93%, 89% to 99%, and 97% to 100% for 5 or more, 6 or more, and 7 or more positive criteria, respectively. Only 47% of participants underwent MSU testing, with physician judgment serving as the reference standard for those not tested. Five subsequent tests of the ARA criteria were done in groups of patients with suspected gout, all of whom underwent MSU analysis (19, 20, 22, 27, 30). These studies, which enrolled 82 to 983 patients, reported sensitivities of 70% to 92% and specificities of 53% to 92% (Table 2). Positive predictive value ranged from 66% to 92%, and NPV from 69% to 86% (study quality was high for 2 studies and moderate for the other 2). In 1 study, among the patients with false-positive results, 50% had deposits of calcium pyrophosphate crystals (20).
Two studies compared the sensitivity and specificity of the algorithm between patients with recent onset of initial symptoms and those with more longstanding flares. For patients with a more recent onset of the first episode, the sensitivity and specificity [...]

Table 1 footnotes:
† MSU crystals in joint fluid or tophus or tissue or meets 2 of the criteria.
‡ ≥4/8 of the criteria checked.
§ Guideline 1 states that MSU is required for a definitive diagnosis, but, in its absence, such clinical criteria as those checked can be used or characteristic imaging findings may substitute.
¶ Designed to be administered by telephone by nonphysicians to assess the prevalence of gout via patient self-report; treatment questions refer to the most prominent episode.
** Involvement of joints or bursae other than the first MTP only as polyarticular presentation.
‡‡ Several algorithms specified the presence of these crystals as definitive in lieu of other signs.

Janssens Diagnostic Rule
The Janssens diagnostic rule (also known as the Netherlands criteria) is a 7-item algorithm developed for use in clinical diagnosis (Table 1). Creation of the diagnostic rule was based on 328 Dutch primary care patients with suspected gout, who constituted the population Janssens used to assess the accuracy of the ARA classification criteria (study quality, high) (21).
The algorithm, which incorporated clinical signs and symptoms used by primary care physicians, was assessed in 3 high- or moderate-high-quality studies ranging in size from 233 to 983 patients with suspected gout in the United States, the Netherlands, and Thailand (24, 27, 30). In the initial validation study, each of the 7 items was assigned a point value from 0.5 to 3.5, for a total of 13 possible points; the PPV of scores greater than 8 was 87%, and the NPV of scores less than 4 was 95% compared with MSU crystal assessment (24). In 2 subsequent studies using the same threshold values as the original, the sensitivities were 73% to 96% and the specificities 47% to 76%, depending on the recency of initial symptoms (27, 30). Sensitivities ranged from 73% to 88% and specificities from 75% to 86% for patients with recent-onset symptoms; among patients whose symptoms began at least 2 years earlier, sensitivities were 92% to 96% and specificities 47% to 50% (Table 2) (27, 30).
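The thresholds reported for the initial validation study can be sketched as a simple three-way interpretation of the total score. This is an illustration under stated assumptions, not the published rule: the function name and the category labels are hypothetical, the individual item weights are not reproduced here, and scores from 4 through 8 are treated as indeterminate because the text reports predictive values only for scores greater than 8 and less than 4.

```python
def interpret_janssens_score(score: float) -> str:
    """Map a total Janssens score (0 to 13 points) to a hypothetical label."""
    if not 0 <= score <= 13:
        raise ValueError("score must lie between 0 and 13 points")
    if score > 8:
        # Reported PPV of 87% for scores greater than 8.
        return "gout likely"
    if score < 4:
        # Reported NPV of 95% for scores less than 4.
        return "gout unlikely"
    # Assumption: the remaining range is left without a classification.
    return "indeterminate"


for total in (2.5, 6.0, 9.5):
    print(total, interpret_janssens_score(total))
```

Under this reading, patients in the middle range would be the ones for whom synovial fluid analysis or imaging adds the most information.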

Clinical Gout Diagnosis Criteria
The Clinical Gout Diagnosis (CGD) is an 8-item algorithm developed in 2010 from a subset of the ARA/ACR classification criteria for diagnosing chronic gout in the primary care setting without reliance on joint fluid analysis. It was validated in 167 consecutive patients with arthritis who were referred by primary care physicians to 2 academic rheumatology clinics in Mexico City (study quality, low) (Table 1) (23). Using a cutoff of 4 or more positive items and the reference standard of MSU crystal analysis, the algorithm had a sensitivity of 97% and specificity of 96% (3% of patients with rheumatoid arthritis, 7% with septic arthritis, and 3% with osteoarthritis fulfilled 4 or more of the 8 criteria, compared with 97% of those with gout). Ninety percent of patients who had a CGD score of 4 or higher scored positive for gout on the basis of the Janssens criteria. All patients with early gout and 94% of women with gout also fulfilled 4 or more criteria.
The CGD was assessed further by 2 other groups (27,30). One group, using the data set of Taylor and colleagues (27), found that the CGD had 87% sensitivity and 66% specificity in patients whose symptoms began less than 2 years earlier and 99% sensitivity and 34% specificity in those whose symptoms began more than 2 years earlier (27). In an analysis using the Thai patient data set, the CGD showed sensitivity and specificity of 97% and 68% overall, 89% and 82% in patients whose symptoms began up to 2 years earlier, and 99% and 31% in those whose symptoms began more than 2 years previously (30).

Monoarthritis of the First Metatarsophalangeal Joint
Monoarthritis of the first metatarsophalangeal joint, as the sole diagnostic criterion for gout, was evaluated in a high-quality study in 159 Dutch patients seen by general practitioners (25). Patients were referred to an academic rheumatology clinic, where they were examined within 24 hours of referral by rheumatologists blinded to the provisional diagnosis to rule out other causes of monoarthritis, and underwent joint fluid analysis. Patients with negative MSU test results were followed for 5 years; those with new episodes of monoarthritis were reevaluated. Comparing the primary care physician's provisional diagnosis of gout with the identification of MSU crystals showed 99% sensitivity and 8% specificity (PPV, 79%; NPV, 75%).

Study for Updated Gout Classification Criteria
The SUGAR (Study for Updated Gout Classification Criteria) classification initiative was conducted to determine the discriminatory value of clinical, laboratory, and imaging characteristics for distinguishing gout from the absence of gout (as defined by synovial MSU crystals). In a high-quality study, researchers first compared the performance of the major gout diagnostic and classification clinical algorithms (namely the Rome, New York, and ARA criteria; Janssens diagnostic rule; and CGD algorithm) in early- (<2 years since first attack) and later-stage (>2 years since first attack) disease in 983 consecutive patients with recent symptoms suggesting gout, who were seen in rheumatology clinics in 16 countries (27).
Across all criteria sets, the sensitivity was lower and specificity higher in patients with disease of shorter duration (Table 2). In patients who had symptoms for 2 years or less, the sensitivity ranged from 58% (for the New York criteria) to 88% (for the Janssens diagnostic rule), whereas in those who had symptoms for more than 2 years, it was 84% (Rome criteria) to 99% (CGD). Among patients whose symptoms were of shorter duration, the specificity ranged from 66% (CGD) to 88% (New York criteria), whereas in those with more longstanding symptoms, it was 34% (CGD) to 70% (New York criteria). Analysis of MSU crystals was the reference standard.
Researchers assessed the sensitivity and specificity of 4 algorithms that allowed presence of synovial joint MSU to be sufficient for a gout diagnosis. Among patients with recent-onset symptoms, sensitivity ranged from 93% (for the Rome criteria) to 100% (for the New York, ARA, and CGD criteria), whereas among those with more long-standing symptoms, it was 99% to 100% for those 4 criterion sets. For patients with symptoms of shorter duration, specificity ranged from 66% (CGD) to 88% (New York), whereas among those with longer-duration disease, it was 34% (CGD) to 70% (New York). Thus, all criterion sets had somewhat poor specificity for more long-standing symptoms.
The investigators then devised their own 10-item algorithm from 56 criteria selected by an appropriateness panel (2 of the items were imaging characteristics: 1 ultrasonography and 1 plain radiography) (28). The algorithm scores with and without the imaging items showed moderate to good sensitivity (84% and 88% with and without imaging, respectively) and moderate specificity (81% and 72% with and without imaging, respectively).

ACR/European League Against Rheumatism Gout Classification Criteria
A joint committee of the ACR and European League Against Rheumatism (EULAR) recently developed a new set of criteria primarily intended for classification (29). Based on the SUGAR criteria and further validated with a representative set of test patient profiles, this criterion set takes into account the progressive and varied nature of the disease and the likelihood that patients who do not meet a particular criterion when first classified may fulfill it later. The entry criterion that must be met is at least 1 episode of swelling, pain, or tenderness in a peripheral joint. Identification of MSU crystals in the synovial fluid is considered diagnostic for gout; however, if this criterion is not met, a hierarchical set of 7 criteria, arranged in clinical, laboratory, and imaging domains, is applied (Table 1). Each criterion contributes points (weighted for its putative importance), ranging from −4 to 4, to the total score, which may range from 6 to 23. Some criteria, such as serum urate, may contribute more or fewer points (including negative points) in proportion to severity. A score of 8 or more is considered diagnostic for gout. With MSU applied as the reference standard, the classification criteria had a sensitivity of 92% and specificity of 89% (including clinical and imaging domains) or 85% and 78% (excluding imaging) (Table 2) (29). Thus, imaging findings improved both the sensitivity and specificity of clinical and laboratory criteria (study quality, high).
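The stepwise logic described above (entry criterion, then MSU crystals as sufficient, then a weighted score with a threshold of 8) can be sketched as follows. This is a simplified illustration, not the published criteria: the function name, parameter names, and the example point values are hypothetical, and the individual criteria and their weights are not reproduced.

```python
from typing import Iterable


def classify_acr_eular(
    entry_criterion_met: bool,
    msu_crystals_present: bool,
    criterion_points: Iterable[int],
    threshold: int = 8,
) -> bool:
    """Return True when the patient would be classified as having gout."""
    # Step 1: at least 1 episode of swelling, pain, or tenderness
    # in a peripheral joint is required before anything else applies.
    if not entry_criterion_met:
        return False
    # Step 2: MSU crystals in synovial fluid are sufficient on their own.
    if msu_crystals_present:
        return True
    # Step 3: otherwise sum the weighted points across the clinical,
    # laboratory, and imaging domains and compare with the threshold.
    return sum(criterion_points) >= threshold


# Hypothetical point values; one negative entry mimics a criterion
# that counts against gout.
print(classify_acr_eular(True, False, [2, 3, 4, -2, 1]))  # total 8
print(classify_acr_eular(True, False, [2, 3, -4]))        # total 1
```

Omitting imaging points from `criterion_points` corresponds to applying the criteria without the imaging domain, the variant for which the text reports lower sensitivity and specificity.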

Imaging Studies
Three studies assessed the use of DECT (33-35), and 6 evaluated the use of ultrasonography (1 study compared ultrasonography with DECT) (33, 36-40). No studies assessed the sole use of computed tomography or plain radiography.

DECT
The DECT studies compared the predictions based on this imaging method with synovial fluid analysis for MSU crystals by using a validated clinical algorithm or some combination of the 2 reference standards. All the studies were conducted in academic rheumatology departments. Quality was high for 1 study (34) and moderate for 2 studies (33,35).
One study included 94 consecutive patients seen in an academic rheumatology clinic. Two rheumatologists interpreted the DECT images. Of the 94 patients, 31 had successful joint aspiration. Sensitivity was 100% (CI, 74% to 100%) for both readers, and specificity was 89% (CI, 67% to 99%) and 79% (CI, 54% to 94%) for the 2 readers (35). In another study, 60 consecutive patients with suspected gout were seen in a German rheumatology clinic, where they had synovial fluid aspiration and DECT. The sensitivity and specificity for diagnosing gout were 85% and 86%, respectively (PPV, 92%; NPV, 75%). All false-negative results found were in patients with recent-onset acute gout (33). However, a study in 81 consecutive patients with suspected gout who were seen in the rheumatology department at Mayo Clinic reported 90% sensitivity (CI, 76% to 97%) and 83% specificity (CI, 68% to 93%) with MSU as the reference standard. Sensitivity was lower in patients with recent-onset acute gout; all false-positive results were in patients with advanced osteoarthritis of the knee (34).
Across the 3 studies, the sensitivity of DECT for predicting gout ranged from 85% to 100% and the specificity from 83% to 92% compared with MSU analysis and a validated clinical algorithm (33-35). Two of the studies showed that DECT is less sensitive in patients with a shorter history of flares than in those with a longer one.

Ultrasonography
Six studies evaluated the use of ultrasonography for assessing symptomatic joints by comparing the predictions based on ultrasonography signs with identification of synovial fluid MSU crystals (4 studies) or with a combination of a validated clinical algorithm and MSU in patients with suspected gout or an undefined crystalline arthropathy. All studies were conducted in academic rheumatology departments. Three studies assessed symptomatic as well as asymptomatic joints in the same patients (33,36,40). The overall quality was good for 4 studies (36,37,39,40) and moderate for 2 (33,38).

Adverse Events Associated With Gout Diagnostic Tests
Two studies assessed adverse effects associated specifically with tests used to diagnose gout (34,41). One small, single-site, good-quality study reported no adverse events associated with synovial fluid aspiration for MSU analysis or with DECT (34). An assessment of adverse events associated with synovial fluid aspiration for MSU analysis was conducted in conjunction with the SUGAR project (41). This good-quality study, conducted at 25 centers in 16 countries among 887 patients, identified 1 serious adverse event (septic arthritis 11 days after arthrocentesis; culture positive for Staphylococcus aureus; event rate, 0.1% [CI, 0% to 0.34%]) and 11 nonserious adverse events (mostly mild pain after the procedure; event rate, 1.4% [CI, 0.6% to 2.1%]). No studies reported on adverse events associated with ultrasonography or clinical examination.
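To show how an event rate and its interval of this kind can be derived from raw counts, the sketch below applies a normal-approximation (Wald) interval to 1 event among 887 patients. The choice of the Wald method and the function name are assumptions; the source does not state which interval method was used. With these inputs the upper bound is roughly 0.33%, close to the reported 0.34%, and the lower bound is clipped at 0.

```python
import math


def wald_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (rate, lower, upper) using the normal-approximation interval."""
    rate = events / n
    half_width = z * math.sqrt(rate * (1 - rate) / n)
    # A proportion cannot fall below 0 or exceed 1, so clip the bounds.
    return rate, max(0.0, rate - half_width), min(1.0, rate + half_width)


# 1 serious adverse event among 887 patients, as reported in the text.
rate, lower, upper = wald_interval(events=1, n=887)
print(f"rate={rate:.2%}, CI={lower:.2%} to {upper:.2%}")
```

For events this rare, the normal approximation is known to behave poorly near 0, so an exact or score-based interval may be preferable in practice.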
A third study assessed the outcomes of misdiagnosis of acute gout in 2 academic medical centers in South Korea (42). This moderate-quality study found that missed or delayed diagnosis was associated with a longer interval between the onset of attack and joint aspiration and resulted in longer hospitalizations.
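
To make the arithmetic behind the serious adverse event rate in the SUGAR assessment concrete (1 event among 887 patients), the sketch below computes the proportion and an approximate 95% confidence interval. The source does not say how its interval was calculated; a normal-approximation (Wald) interval truncated at zero is assumed here, so the upper bound may differ slightly from the published value. The function name is hypothetical.

```python
import math


def event_rate_with_wald_ci(events, n, z=1.96):
    """Return (rate, lower, upper) using a normal-approximation interval.

    ASSUMPTION: the Wald method is used only for illustration; the source
    does not state which interval method produced its reported CI.
    Bounds are truncated to the valid range [0, 1].
    """
    rate = events / n
    standard_error = math.sqrt(rate * (1 - rate) / n)
    lower = max(0.0, rate - z * standard_error)
    upper = min(1.0, rate + z * standard_error)
    return rate, lower, upper


# 1 serious adverse event among 887 patients, as reported in the text.
rate, lower, upper = event_rate_with_wald_ci(events=1, n=887)
print(f"event rate {rate:.1%} (CI, {lower:.2%} to {upper:.2%})")
```

With so few events, the Wald interval is a rough approximation; an exact binomial method would be a reasonable alternative.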

DISCUSSION
Our principal findings are that several algorithms based solely on clinical signs and symptoms showed sensitivities and specificities greater than 80% for diagnosing early-onset gout compared with the gold standard of synovial fluid aspiration and MSU crystal analysis (moderate SOE). Ultrasonography and DECT also showed some evidence of diagnostic validity on their own or in conjunction with algorithms to improve their diagnostic accuracy (low SOE). Although detection of synovial fluid MSU crystals is considered the diagnostic benchmark, questions linger about the sensitivity of MSU analysis in early gout and the marginal value of aspiration or advanced imaging methods in routine clinical practice. Aspiration of smaller joints is technically difficult and painful. Misdiagnosis of gout has been associated with the use of joint aspiration and MSU assessment as well as with use of diagnostic algorithms and imaging techniques. The recency of the first attack (more or less than 2 years before the current attack) affects the sensitivity and specificity of diagnostic tests, as longer disease duration improves the performance of both algorithms and imaging tests. Evidence from imaging studies suggests that the location of the affected site also may be a factor in misdiagnosis; for example, both ultrasonography (37) and DECT (34) were less likely to detect signs of gout-related crystal deposition in the knee than in the first metatarsophalangeal joint.
The most important issue for gout diagnosis in the primary, urgent, or emergency care setting is whether using some combination of clinical signs and symptoms to diagnose gout, without relying on synovial fluid analysis of MSU crystals or imaging, is sensitive enough to detect or confirm suspicion of gout (particularly recent-onset disease). Several algorithms (including the CGD criteria and Janssens diagnostic rule) had high sensitivity, without reliance on MSU analysis, in patients whose attacks were of recent onset, who are the patients most likely to be seen in the primary, urgent, or emergency care setting (27,30). However, the relatively low specificity of these algorithms raises the risk for a missed diagnosis of a condition with a similar initial clinical presentation, notably septic arthritis. None of the criteria or algorithms has been tested specifically to rule out septic arthritis. Therefore, if septic arthritis is in the differential, aspiration for Gram stain and culture remains an essential part of the diagnostic evaluation.
This review had several limitations. Selection bias was likely, because few studies enrolled only patients with suspected gout but without a confirmed diagnosis, and many were conducted in referral settings. Only 1 study was designed specifically to evaluate whether a particular test or algorithm could rule out conditions that might be confused with gout or that might co-occur in a patient with gout, such as calcium pyrophosphate deposition or septic arthritis (19). A limitation specific to the algorithm studies is that their objective typically was to develop or validate algorithms to classify patients for research and clinical trials and not to develop criteria for making a diagnosis in individual patients. Specifically, the aim of classification criteria often is to allow relatively homogeneous groups of study participants with clearly defined signs and symptoms to be identified and enrolled in trials. Diagnosis involves a global judgment on the physician's part, taking all aspects of a patient's clinical presentation into account (43). Also, whereas higher specificity may be preferred for trial classification, higher sensitivity may be needed for accurate diagnosis and therefore treatment (44). Nevertheless, for diseases with clear causes and presentation, classification and diagnostic criteria may be somewhat interchangeable; whether MSU analysis of joint aspirate fulfills these requirements remains unclear.
In summary, the results of studies assessing the sensitivity and specificity of various clinical algorithms for gout diagnosis or classification suggest that these instruments show promise for providing at least a provisional gout diagnosis in patients who present with signs of early-stage disease, who are most likely to be seen in primary, urgent, and emergency care settings. Imaging methods, such as ultrasonography and DECT, also seem favorable for gout diagnosis, but primary and urgent care settings may lack access to such equipment or the expertise necessary to use these techniques.