Nonpharmacologic Therapies for Acute and Chronic Low Back Pain: A Review of the Evidence for an American Pain Society/American College of Physicians Clinical Practice Guideline


Data Synthesis: We found good evidence that cognitive-behavioral therapy, exercise, spinal manipulation, and interdisciplinary rehabilitation are all moderately effective for chronic or subacute (>4 weeks' duration) low back pain. Benefits over placebo, sham therapy, or no treatment averaged 10 to 20 points on a 100-point visual analogue pain scale, 2 to 4 points on the Roland-Morris Disability Questionnaire, or a standardized mean difference of 0.5 to 0.8. We found fair evidence that acupuncture, massage, yoga (Viniyoga), and functional restoration are also effective for chronic low back pain. For acute low back pain (<4 weeks' duration), the only nonpharmacologic therapies with evidence of efficacy are superficial heat (good evidence for moderate benefits) and spinal manipulation (fair evidence for small to moderate benefits). Although serious harms seemed to be rare, data on harms were poorly reported. No trials addressed optimal sequencing of therapies, and methods for tailoring therapy to individual patients are still in early stages of development. Evidence is insufficient to evaluate the efficacy of therapies for sciatica.
Limitations: Our primary source of data was systematic reviews. We included non-English-language trials only if they were included in English-language systematic reviews.
Conclusions: Therapies with good evidence of moderate efficacy for chronic or subacute low back pain are cognitive-behavioral therapy, exercise, spinal manipulation, and interdisciplinary rehabilitation. For acute low back pain, the only therapy with good evidence of efficacy is superficial heat.
Many nonpharmacologic therapies are available for treatment of low back pain. In 1 study of primary care clinicians, 65% reported recommending massage therapy; 55% recommended therapeutic ultrasonography; and 22% recommended, prescribed, or performed spinal manipulation (1). In another study, 38% of patients with spine disorders were referred to a physical therapist for exercise therapy, physical therapies, or other interventions (2). Other noninvasive interventions are also available, including psychological therapies, back schools, yoga, and interdisciplinary therapy.
Clinicians managing low back pain vary substantially in the noninvasive therapies they recommend (3). Although earlier reviews found little evidence demonstrating efficacy of most noninvasive therapies for low back pain (4-6), many more randomized trials are now available. This article summarizes current evidence on noninvasive therapies for low back pain in adults. It is part of a larger evidence review commissioned by the American Pain Society and the American College of Physicians to guide recommendations for management of low back pain (7). Pharmacologic therapies are reviewed in a separate article in this issue (8).

Methods

Data Sources and Searches
An expert panel convened by the American Pain Society and American College of Physicians determined which nonpharmacologic therapies would be included in this review. Appendix Table 1 (available at www.annals.org) shows the 17 therapies chosen by the panel and how we defined and grouped them. Several therapies that have not been studied in the United States or are not widely available (such as acupressure, neuroreflexotherapy, spa therapy, and percutaneous electrical nerve stimulation) are reviewed in the complete evidence review (7). Therapies solely involving advice or back education are also reviewed separately, as are surgical and interventional pain procedures.
We searched MEDLINE (1966 through November 2006) and the Cochrane Database of Systematic Reviews (2006, Issue 4) for relevant systematic reviews, combining terms for low back pain with a search strategy for identifying systematic reviews. When higher-quality systematic reviews were not available for a particular intervention, we conducted additional searches for primary studies (combining terms for low back pain with the therapy of interest) on MEDLINE, the Cochrane Central Register of Controlled Trials, and PEDro. Full details of the search strategies are available in the complete evidence report (7). Electronic searches were supplemented by reference lists and additional citations suggested by experts. We did not include trials published only as conference abstracts.

Evidence Selection
We included all randomized, controlled trials meeting all of the following criteria: 1) reported in English, or in a non-English language but included in an English-language systematic review; 2) evaluated nonpregnant adults (>18 years of age) with low back pain (alone or with leg pain) of any duration; 3) evaluated a target therapy; and 4) reported at least 1 of the following outcomes: back-specific function, generic health status, pain, work disability, or patient satisfaction (9, 10).
We excluded trials of low back pain associated with acute major trauma, cancer, infection, the cauda equina syndrome, fibromyalgia, and osteoporosis or vertebral compression fracture.
Because of the large number of studies on therapies for low back pain, our primary source for trials was systematic reviews. When multiple systematic reviews were available for a target therapy, we excluded outdated systematic reviews, which we defined as systematic reviews with a published update or those published before 2000. When a higher-quality systematic review was not available for a particular therapy, we included all relevant randomized, controlled trials. We also supplemented systematic reviews with data from recent, large (>250 patients) trials.

Data Extraction and Quality Assessment
For each included systematic review, we abstracted information on search methods; inclusion criteria; methods for rating study quality; characteristics of included studies; methods for synthesizing data; and results, including the number and quality of trials for each comparison and outcome in patients with acute (<4 weeks' duration) low back pain, chronic/subacute (>4 weeks' duration) low back pain, and back pain with sciatica. If specific data on duration of trials were not provided, we relied on the categorization (acute or chronic/subacute) assigned by the systematic review. For each trial not included in a systematic review, we abstracted information on study design, participant characteristics, interventions, and results.
We considered mean improvements of 5 to 10 points on a 100-point visual analogue pain scale (or equivalent) to be small or slight; 10 to 20 points, moderate; and more than 20 points, large or substantial. For back-specific functional status, we classified mean improvements of 2 to 5 points on the Roland-Morris Disability Questionnaire (RDQ; scale, 0 to 24) and 10 to 20 points on the Oswestry Disability Index (ODI; scale, 0 to 100) as moderate (11). We considered standardized mean differences of 0.2 to 0.5 to be small or slight; 0.5 to 0.8, moderate; and greater than 0.8, large (12). Some evidence suggests that our classification of mean improvements and standardized mean differences for pain and functional status are roughly concordant in patients with low back pain (13-18). Because few trials reported the proportion of patients meeting specific thresholds (such as >30% reduction in pain score) for target outcomes, it was usually not possible to report numbers needed to treat for benefit. When those were reported, we considered a relative risk of 1.25 to 2.00 for the proportion of patients reporting greater than 30% pain relief to indicate a moderate benefit.
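The magnitude thresholds above can be expressed as a small lookup. The following Python sketch is illustrative only and is not part of the review's methods. The function names are hypothetical. Values that fall exactly on a shared boundary (such as 10 points or 0.5) are assumed to belong to the higher category, values below the lowest threshold receive an assumed label, and the sign is ignored so that a negative difference favoring treatment is classified by its size.

```python
def classify_pain_improvement(points: float) -> str:
    """Classify a mean improvement on a 100-point pain scale.

    Assumption: boundary values (10 and 20) are resolved as coded below;
    the source text does not say which category they belong to.
    """
    magnitude = abs(points)
    if magnitude > 20:
        return "large"
    if magnitude >= 10:
        return "moderate"
    if magnitude >= 5:
        return "small"
    return "below small threshold"  # label assumed, not from the source


def classify_smd(smd: float) -> str:
    """Classify a standardized mean difference (0.2, 0.5, 0.8 cut points)."""
    magnitude = abs(smd)
    if magnitude > 0.8:
        return "large"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.2:
        return "small"
    return "below small threshold"  # label assumed, not from the source


print(classify_pain_improvement(-10))  # a 10-point difference
print(classify_smd(0.62))
```

Under these assumptions, a 10-point difference in pain and a standardized mean difference of 0.62 both fall in the moderate category.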
Two reviewers independently rated the quality of each included trial. Discrepancies were resolved through joint review and a consensus process. We assessed internal validity (quality) of systematic reviews by using the Oxman criteria (Appendix Table 2, available at www.annals.org) (19, 20). According to this system, systematic reviews receiving a score of 4 or less (on a scale of 1 to 7) have potential major flaws and are more likely to produce positive conclusions about effectiveness of interventions (20, 21). We classified such systematic reviews as "lower quality"; those receiving scores of 5 or more were graded as "higher quality." We did not abstract results of individual trials if they were included in a higher-quality systematic review. Instead, we relied on results and quality ratings for the trials as reported by the systematic reviews. We considered trials receiving more than half of the maximum possible quality score to be "higher quality" for any quality rating system used (22, 23).
We assessed internal validity of randomized clinical trials not included in a higher-quality systematic review by using the criteria of the Cochrane Back Review Group (Appendix Table 3, available at www.annals.org) (24). When blinding was not feasible, we removed blinding of providers (for studies of acupuncture, spinal manipulation, and massage) or blinding of patients and providers (for studies of back schools, exercise, psychological interventions, interdisciplinary rehabilitation, and functional restoration) as a quality criterion; thus, the maximum score was 10 or 9, respectively. We considered trials receiving more than half of the total possible score to be "higher quality" and those receiving less than or equal to half to be "lower quality" (22, 23).
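The two quality cutoffs described above can be sketched in Python as follows. This is a minimal illustration rather than the reviewers' actual tooling: the function names are hypothetical, and the maximum trial score is passed in explicitly (10 or 9 in the cases named above) instead of being derived from the individual criteria.

```python
def rate_systematic_review(oxman_score: int) -> str:
    """Rate a systematic review from its Oxman score (scale of 1 to 7)."""
    if not 1 <= oxman_score <= 7:
        raise ValueError("Oxman score must be between 1 and 7")
    # A score of 4 or less indicates potential major flaws.
    return "higher quality" if oxman_score >= 5 else "lower quality"


def rate_trial(score: int, max_score: int) -> str:
    """Rate a trial: strictly more than half of the maximum is higher quality."""
    if not 0 <= score <= max_score:
        raise ValueError("score must be between 0 and max_score")
    # Integer arithmetic avoids rounding issues when max_score is odd.
    return "higher quality" if 2 * score > max_score else "lower quality"


print(rate_trial(5, 10))  # exactly half of 10
print(rate_trial(5, 9))   # more than half of 9
```

The same raw score of 5 is rated lower quality when the maximum is 10 and higher quality when the maximum is 9, which is why the adjustment for infeasible blinding matters.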

Data Synthesis
We assessed overall strength of evidence for a body of evidence by using methods adapted from the U.S. Preventive Services Task Force (25). To assign an overall strength of evidence (good, fair, or poor), we considered the number, quality and size of studies; consistency of results among studies; and directness of evidence. Minimum criteria for fair and good quality ratings are shown in Appendix Table 4 (available at www.annals.org).
Consistent results from many higher-quality studies across a broad range of populations support a high degree of certainty that the results of the studies are true (the entire body of evidence would be considered good quality). For a fair-quality body of evidence, results could be due to true effects or to biases operating across some or all of the studies. For a poor-quality body of evidence, any conclusion is uncertain.
To evaluate consistency, we classified conclusions of trials and systematic reviews as positive (the therapy is beneficial), negative (the therapy is harmful or not beneficial), or uncertain (the estimates are imprecise, the evidence unclear, or the results inconsistent) (20). We defined "inconsistency" as greater than 25% of trials reaching discordant conclusions (positive vs. negative), 2 or more higher-quality systematic reviews reaching discordant conclusions, or unexplained heterogeneity (for pooled data).
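The inconsistency rule can be sketched in Python under some stated assumptions. The text does not specify how the 25% figure is computed, so this sketch assumes it is the share of all classified trials that fall on the minority side of the positive versus negative split. It also assumes that higher-quality reviews are discordant when at least one is positive and at least one is negative. All names are hypothetical.

```python
from collections import Counter
from typing import Sequence


def discordant_share(conclusions: Sequence[str]) -> float:
    """Share of all trials on the minority side of a positive/negative split."""
    if not conclusions:
        return 0.0
    counts = Counter(conclusions)  # missing labels count as 0
    return min(counts["positive"], counts["negative"]) / len(conclusions)


def is_inconsistent(
    trial_conclusions: Sequence[str],
    review_conclusions: Sequence[str] = (),
    unexplained_heterogeneity: bool = False,
) -> bool:
    """Apply the three criteria; any one of them is sufficient."""
    trials_discordant = discordant_share(trial_conclusions) > 0.25
    # Assumed reading: both labels present among higher-quality reviews.
    reviews_discordant = {"positive", "negative"} <= set(review_conclusions)
    return trials_discordant or reviews_discordant or unexplained_heterogeneity


# One negative trial out of three exceeds the 25% threshold.
print(is_inconsistent(["positive", "positive", "negative"]))
# One negative trial out of four is exactly 25%, which is not "greater than".
print(is_inconsistent(["positive", "positive", "positive", "negative"]))
```

A different denominator (for example, excluding trials with uncertain conclusions) would change the result in borderline cases, so the choice should be stated whenever the rule is applied.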

Role of the Funding Source
The funding source had no role in the design, conduct, or reporting of this review or in the decision to publish the manuscript.

Results

Size of Literature Reviewed
We reviewed 1292 abstracts identified by searches for systematic reviews of low back pain. Of these, 96 seemed potentially relevant and were retrieved. A total of 40 systematic reviews (26-70) met inclusion criteria (see Appendix Table 5 for quality ratings and Appendix Table 6 for characteristics and results of the systematic reviews that evaluated efficacy; both are available at www.annals.org). We excluded 59 systematic reviews (71-129), most frequently because they met our criteria for outdated reviews or did not report results for patients with low back pain (Appendix Table 7, available at www.annals.org). Five recent, large (>200 patients) trials of acupuncture (130-132) and spinal manipulation or exercise (133, 134) supplemented the systematic reviews.
For acute low back pain, a higher-quality Cochrane review found spinal manipulation to be slightly to moderately superior to sham manipulation for short-term pain relief in a meta-regression analysis (weighted mean difference, -10 points on a 100-point visual analogue scale [95% CI, -17 to -2 points]) (15, 55). However, this estimate is mainly based on a lower-quality trial of patients with acute or subacute sacroiliac pain (154). Short-term effects on the RDQ (2 trials, 1 higher-quality) were moderate but did not reach statistical significance (weighted mean difference, -2.8 points [CI, -5.6 to 0.1 points]). Differences between spinal manipulation and therapies judged ineffective or harmful (traction, bed rest, home care, topical gel, no treatment, diathermy, and minimal massage) did not reach clinical significance for pain (weighted mean difference, -4 points [CI, -8 to -1 points]) and reached clinical but not statistical significance on the RDQ (weighted mean difference, -2.1 points [CI, -4.4 to 0.2 points]). There were no clear differences between spinal manipulation and usual care or analgesics (3 trials), physical therapy or exercises (5 trials), and back schools (2 trials).
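One way to read the distinction drawn above between statistical and clinical significance is sketched below in Python. This is an interpretation, not the review's stated procedure. It assumes that a difference is statistically significant when its confidence interval excludes 0, and that it is clinically significant when its absolute size reaches the smallest threshold given in the Methods (5 points for pain on a 100-point scale, 2 points on the RDQ). The function and constant names are hypothetical.

```python
# Assumed minimum sizes for a clinically meaningful difference, taken from
# the lower bounds of the magnitude categories defined in the Methods.
MIN_CLINICAL_DIFFERENCE = {"pain_100_point": 5.0, "rdq": 2.0}


def is_statistically_significant(ci_low: float, ci_high: float) -> bool:
    """Treat a mean difference as significant if its CI excludes 0."""
    return ci_low > 0 or ci_high < 0


def is_clinically_significant(difference: float, outcome: str) -> bool:
    """Compare the absolute difference with the assumed minimum for the outcome."""
    return abs(difference) >= MIN_CLINICAL_DIFFERENCE[outcome]


# Manipulation versus therapies judged ineffective or harmful (acute pain):
# (difference, CI lower bound, CI upper bound) as reported in the text.
pain = (-4.0, -8.0, -1.0)
rdq = (-2.1, -4.4, 0.2)

print(is_statistically_significant(pain[1], pain[2]),
      is_clinically_significant(pain[0], "pain_100_point"))  # True False
print(is_statistically_significant(rdq[1], rdq[2]),
      is_clinically_significant(rdq[0], "rdq"))  # False True
```

Under these assumptions the pain difference is statistically but not clinically significant, and the RDQ difference is clinically but not statistically significant, which matches the wording of the paragraph above.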
For chronic low back pain, the Cochrane review found spinal manipulation moderately superior to sham manipulation (3 trials) and therapies thought to be ineffective or harmful (5 trials). Against sham manipulation, differences in short- and long-term pain averaged 10 and 19 points on a 100-point visual analogue scale, and differences for short-term function averaged 3.3 points on the RDQ. There were no differences between manipulation and general practitioner care or analgesics (6 trials), physical therapy or exercises (4 trials), and back school (3 trials). Evidence was insufficient to conclude that effectiveness of spinal manipulation varies depending on the presence or absence of radiating pain or the profession or training of the manipulator.
Five higher-quality systematic reviews reached conclusions generally consistent with those of the Cochrane review (58, 60, 61, 69, 70). Two recent, large trials (133, 134) not included in the systematic reviews also reported consistent results (Appendix Table 8, available at www.annals.org [130, 132-134, 155]). For low back pain of unspecified duration, 1 higher-quality trial (681 patients) found no differences in pain, functional status, or other outcomes between patients randomly assigned to chiropractic versus medical management (133). The other trial (1334 patients) found spinal manipulation to be slightly superior to usual care for pain and disability (about 5 points on 100-point scales) after 3 months in patients with subacute or chronic low back pain, although effects were not as pronounced after 12 months, and differences on the RDQ did not reach clinical significance (about 1 point) (134). Manipulation and exercise did not significantly differ, and the addition of manipulation to exercise therapy was no better than exercise alone.
Two lower-quality systematic reviews found spinal manipulation superior to some other effective interventions (57, 68). However, these conclusions were based on sparse data (1 to 3 trials, often lower-quality and often with small sample sizes).
Five systematic reviews consistently found that serious adverse events after spinal manipulation (such as worsening lumbar disc herniation or the cauda equina syndrome) were very rare (64-67, 69). One systematic review found no serious complications reported in more than 70 controlled clinical trials (65). Including data from observational studies, the risk for a serious adverse event was estimated as less than 1 per 1 million patient visits (66, 67).
One higher-quality randomized trial evaluated a decision tool for identifying patients more likely to benefit from spinal manipulation (156). It found that patients who met at least 4 of 5 predefined criteria had a higher likelihood of greater than 50% improvement in ODI scores when randomly assigned to spinal manipulation (odds ratio [OR], 60.8 [CI, 5.2 to 704.7]) compared with those who had negative findings according to the rule who were randomly assigned to manipulation (OR, 2.4 [CI, 0.83 to 6.9]) and those with positive findings according to the rule who were randomly assigned to exercise (OR, 1.0 [CI, 0.28 to 3.6]). However, no studies have examined how applying the decision tool versus not using the tool affects clinical outcomes, and the decision tool may not be practical for many primary care settings because it requires the clinician to perform and interpret potentially unfamiliar physical examination maneuvers and administer a specific questionnaire. A more pragmatic version of the decision tool has not been prospectively validated (157).

Massage
Eight unique trials of massage were included in 2 systematic reviews (26, 27, 69). For acute low back pain, evidence is insufficient to determine efficacy of massage (1 lower-quality trial evaluating a minimal massage intervention [158]). One higher-quality trial found combined treatment with massage, exercise, and education to be superior to exercise and education alone for subacute or chronic low back pain 1 month after treatment (159).
For chronic low back pain, a higher-quality Cochrane review found no clear differences between massage and manipulation at the end of a course of treatment (3 lower-quality trials) (26, 27). Superficial massage was inferior to transcutaneous electrical nerve stimulation (TENS) for relieving pain in 1 higher-quality trial (160). Single trials found massage similar in effectiveness to corsets and exercise and moderately superior to relaxation therapy, acupuncture, sham laser, and self-care education (26, 27). Nearly all trials assessed outcomes only during or shortly after (within 1 month) a course of treatment. However, 1 higher-quality trial found that beneficial effects of massage compared with acupuncture and self-care education persisted for 1 year (161). Results of a second systematic review are consistent with the Cochrane review (69).
Only 1 trial (rated higher-quality) directly compared different massage techniques. It found acupuncture massage superior to classical (Swedish) massage (162). Massage seemed more effective in trials that used a trained massage therapist with many years of experience or a licensed massage therapist (26, 27). Evidence was insufficient to determine effects of the number or duration of massage sessions on efficacy. Several trials with negative results evaluated superficial massage techniques, brief treatment sessions (10 to 15 minutes), or few sessions (<5).

Acupuncture
Fifty-one unique trials on efficacy of acupuncture were included in 3 systematic reviews (16-18, 69). All of the systematic reviews identified substantial methodological shortcomings in most trials. About one third of the trials were conducted in Asia. A fourth systematic review focused on adverse events associated with acupuncture and included observational studies (163).
For acute low back pain, 2 higher-quality systematic reviews found sparse, inconclusive evidence from 4 small trials on efficacy of acupuncture versus sham acupuncture or other interventions (16-18).
For chronic low back pain, both systematic reviews found acupuncture moderately more effective than no treatment or sham treatments for short-term (<6 weeks' [16] or <3 months' [17, 18] duration) pain relief. Acupuncture was also associated with moderate short-term improvements in functional status compared with no treatment (standardized mean differences, 0.62 [CI,0. [17,18]), but not compared with sham therapies. A recent, higher-quality trial not included in the systematic reviews found no differences between acupuncture and sham acupuncture for pain or function (Appendix Table 8, available at www.annals.org) (130).
Evidence of longer-term benefits from acupuncture is mixed. Acupuncture was moderately superior for long-term (>6 weeks' duration) pain relief compared with sham TENS in 2 trials and compared with no additional treatment in 5 trials, although there were no significant differences compared with sham acupuncture (16). One higher-quality trial found no differences in pain 1 year after acupuncture therapy compared with provision of a self-care education book (161). A higher-quality trial not included in the systematic reviews found clinically insignificant differences (<5 points on 100-point scales) between acupuncture and no acupuncture for pain and function after 6 months (Appendix Table 8, available at www.annals.org) (132). Another recent, higher-quality trial found acupuncture slightly superior to usual care on Short Form-36 pain scores after 24 months (weighted mean difference, 8 points [CI, 0.7 to 15.3 points]) and for recent use of medications for low back pain (60% vs. 41%), although ODI scores and other outcomes did not differ (131).
Efficacy does not clearly differ between acupuncture and massage, analgesic medication, or TENS (each evaluated in 1 to 4 trials) (16-18). Although 2 trials found acupuncture inferior to spinal manipulation for short-term pain relief, both were rated lower-quality (16). The addition of acupuncture to a variety of noninvasive interventions significantly improved pain and function through 3 to 12 months in 4 higher-quality trials (17, 18).
Few higher-quality trials directly compared different acupuncture techniques. One trial found deep-stimulation acupuncture to be superior to superficial stimulation for immediate outcomes (164). Another found no difference between manual acupuncture and electroacupuncture (165).
Only 14 of 35 trials of acupuncture reported any complications or side effects (17, 18). Minor complications occurred in 5% (13 of 245) of patients receiving acupuncture. A systematic review of acupuncture for various conditions (data from >250 000 treatments) found wide variation in rates of adverse events, ranging from 1% to 45% for needle pain and 0.03% to 38% for bleeding (163). Feelings of faintness and syncope occurred after 0% to 0.3% of treatments. Serious adverse events were rare. Pneumothorax was reported in 2 patients, and there were no cases of infections.

Exercise Therapy, Yoga, and Back Schools

Exercise Therapy
Seventy-nine unique trials of exercise therapy were included in 6 systematic reviews (34-40).
For acute low back pain, a higher-quality Cochrane review found exercise therapy superior to usual care or no treatment in 2 of 9 trials (35, 36). Among trials that could be pooled, exercise therapy and no exercise did not differ for pain relief or functional outcomes. There were also no differences between exercise therapy and other noninvasive treatments for acute low back pain or between exercise therapy and placebo or usual care for subacute low back pain.
For chronic low back pain (43 trials), the Cochrane review found exercise slightly to moderately superior to no treatment for pain relief at earliest follow-up (weighted mean difference, 10 points on a 100-point scale [CI, 1.31 to 19.09 points]), although not for functional outcomes (35, 36). Results were similar at later follow-up. Exercise therapy was associated with statistically significant but small effects on pain (weighted mean difference, 5.93 points [CI, 2.21 to 9.65 points]) and function (weighted mean difference, 2.37 points [CI, 0.74 to 4.0 points]) compared with other noninvasive interventions.
Three systematic reviews were less comprehensive than the Cochrane review but reached consistent conclusions (34, 38, 40). A fourth, higher-quality systematic review focusing on work outcomes (14 trials) found that exercise slightly reduced sick leave during the first year (standardized mean difference, -0.24 [CI, -0.36 to -0.11]) and decreased the proportion of patients who had not returned to work at 1 year (relative risk, 0.73 [CI, 0.56 to 0.95]), although no benefit was observed in the severely disabled subgroup (>90 days of sick leave) or in patients receiving disability payments (37).
Results of a large (1334 patients), recently published trial are consistent with those of the systematic reviews (Appendix Table 8, available at www.annals.org) (134). It found exercise therapy to be marginally superior to usual care for pain and disability in patients with low back pain for more than 28 days, but no differences were seen between exercise therapy and manipulation.
The authors of the Cochrane review also conducted a meta-regression analysis and found that exercise therapy using individualized regimens, supervision, stretching, and strengthening was associated with the best outcomes (36). They estimated that exercise therapy incorporating all of these features would improve pain scores by 18.1 points (95% credible interval, 11.1 to 25.0 points) compared with no treatment and would improve function by 5.5 points (95% credible interval, 0.5 to 10.5 points). However, no trials of such an intervention have been conducted. The Cochrane review also found addition of exercise to other noninvasive therapies to be associated with small improvements in pain (about 5 points on a 100-point scale) and function (about 2 points on a 100-point scale). One recent, higher-quality systematic review found no clear differences between the McKenzie method and other exercise regimens (39).

Yoga
We identified no systematic reviews of yoga for low back pain. From 27 citations, 3 trials (all in patients with chronic low back pain) met inclusion criteria (Appendix Table 9, available at www.annals.org) (151-153). One higher-quality trial (101 patients) found 6 weeks of Viniyoga (a therapeutically oriented style) to be slightly superior to conventional exercise (mean difference in RDQ scores, -1.8 [...]). Yoga was also associated with decreased medication use at week 26 (21% of patients) compared with exercise (50%) and the self-care book (59%), although the rate of back pain-related health care provider visits did not differ.
Two lower-quality, smaller trials (60 and 22 patients) evaluated Iyengar yoga, a commonly practiced style of Hatha yoga that frequently uses physical props (151, 153). Results were inconclusive. Although 1 trial found Iyengar yoga more effective than exercise instruction for reducing disability through 3 months after treatment, effects on pain were small and were statistically significant only when adjusted for baseline differences (153). The other, smaller trial found no significant differences between Iyengar yoga and standard exercise (151).

Back Schools
Thirty-one unique trials of back schools were included in 3 systematic reviews (28-31). For acute or subacute low back pain, a higher-quality Cochrane review (19 trials) included 1 lower-quality trial (166) that found back schools superior to sham diathermy for short-term recovery and return to work, but not for pain or long-term recurrences (29, 30).
For chronic low back pain, the Cochrane review found inconsistent evidence on efficacy of back schools versus placebo or wait-list controls (8 trials), although most studies found no benefits (29, 30). Results were generally better in trials of back schools conducted in an occupational setting and for more intensive programs based on the original Swedish back school, although benefits were small. Conclusions of 2 other systematic reviews of back schools are consistent with those of the Cochrane review (28, 31).

Psychological Therapies, Interdisciplinary Rehabilitation, and Functional Restoration

Psychological Therapies
Thirty-five unique trials of psychological therapies for chronic low back pain were included in 2 systematic reviews (32, 33). One of the systematic reviews included trials of psychological therapies as part of interdisciplinary therapy (32).
A higher-quality Cochrane review (33) included 4 trials (1 higher-quality [167]) that found cognitive-behavioral therapy to be moderately superior to a wait-list control for short-term pain intensity (standardized mean difference, 0.59 [CI, 0.10 to 1.09]), but not for functional status (standardized mean difference, 0.31 [CI, -0.20 to 0.82]). It also included 2 lower-quality trials that found progressive relaxation to be associated with large effects on short-term pain (standardized mean difference, 1.16 [CI, 0.47 to 1.85]) and behavioral outcomes (standardized mean difference, 1.31 [CI, 0.61 to 2.01]). Results in the electromyography biofeedback group compared with those in the wait-list control group were mixed. Although 3 trials found biofeedback superior for pain intensity (standardized mean difference, 0.84 [CI, 0.32 to 1.35]), a fourth trial found no differences. There were no differences between patients receiving operant treatment and wait-list control participants. Conclusions of another higher-quality systematic review (22 trials) are consistent with those of the Cochrane review (32).
No differences were seen between psychological therapies and other active therapies (such as exercise or usual care) for most outcomes, although 1 systematic review found small to moderate effects on short-term (standardized mean difference, 0.36 [CI, 0.06 to 0.65]; 3 trials) and long-term (standardized mean difference, 0.53 [CI, 0.19 to 0.86]; 4 trials) disability (32).
Psychological therapies did not improve outcomes when added to a variety of other noninvasive therapies (6 lower-quality trials), although diversity in both psychological and nonpsychological therapies limits interpretability of this finding (33).

Interdisciplinary Rehabilitation and Functional Restoration
Twenty-eight unique trials were included in 4 systematic reviews of interdisciplinary rehabilitation (43-47) or functional restoration (41, 42). For subacute low back pain, a higher-quality Cochrane review found interdisciplinary rehabilitation with a workplace visit more effective than usual care, but only 2 lower-quality trials were included (45, 46).
For chronic low back pain, a second higher-quality Cochrane review included 3 trials (1 higher-quality) that found intensive (>100 hours), daily interdisciplinary rehabilitation to be moderately superior to noninterdisciplinary rehabilitation or usual care for short- and long-term functional status (standardized mean differences, -0.40 to -0.90 at 3 to 4 months and -0.56 to -1.07 at 60 months) (43, 44). Interdisciplinary rehabilitation was also moderately superior for pain outcomes at 3 to 4 months in 2 trials (standardized mean differences, -0.56 and -0.74, respectively), although long-term (60 months) results were inconsistent (standardized mean differences, -0.51 and 0.00, respectively) (168, 169). Evidence was also inconsistent regarding effects on return to work and sick leave. In contrast to more intensive interventions, less intensive interdisciplinary rehabilitation was no better than noninterdisciplinary rehabilitation or usual care (5 trials, 2 higher-quality) (43, 44). A smaller (5 trials) systematic review reported results consistent with those of the Cochrane review (47).
Functional restoration often involves a multidisciplinary component (41, 42). For acute low back pain, a higher-quality Cochrane review found functional restoration no better than usual care, normal activities, or standard exercise therapy in 3 trials (2 higher-quality) (41, 42). For chronic low back pain, the Cochrane review found functional restoration with a cognitive-behavioral component more effective than usual care, normal activities, or standard exercise therapy for reducing time lost from work, but little evidence that functional restoration without a cognitive-behavioral component is effective.

Interferential Therapy
We identified no systematic reviews of interferential therapy for low back pain. From 8 citations, 3 trials met inclusion criteria (Appendix Table 9, available at www.annals.org) (135-137). In 2 trials (1 higher-quality [136]), there were no clear differences between interferential therapy and either spinal manipulation or traction for subacute or chronic back pain (137). A third, lower-quality trial found interferential therapy superior to a self-care book for improvements in RDQ scores in patients with subacute low back pain, but it reported large baseline differences (135). Median RDQ scores after 3 months were identical in the 2 groups.

Low-Level Laser Therapy
We identified no systematic reviews of low-level laser therapy for low back pain. From 218 citations, 7 trials met inclusion criteria (Appendix Table 9) (138-144). The trials were generally small (20 to 120 patients) and evaluated heterogeneous outcome measures and different types of lasers at varying doses. In addition, language or publication bias is possible because low-level laser therapy is more commonly used in Russia and Asia.
For chronic low back pain or back pain of unspecified duration, 4 trials (138, 141, 143, 144) (3 higher-quality) found laser therapy superior to sham for pain or functional status up to 1 year after treatment, but another higher-quality trial (140) found no differences between laser and sham in patients also receiving exercise. One lower-quality trial found laser, exercise, and the combination of laser plus exercise similar for pain and back-specific functional status (139).
One trial reported 1 transient adverse event in both the laser and sham laser groups (138). In a systematic review of low-level laser therapy for various musculoskeletal conditions, 6 of 11 trials evaluating higher doses reported no adverse events (95).

Lumbar Supports
Six trials of lumbar supports for treatment of low back pain were included in a higher-quality Cochrane review (48,49). For low back pain of unspecified duration, the Cochrane review found insufficient evidence from 1 small (30 patients), lower-quality trial (170) to assess efficacy of a lumbar support compared with no lumbar support. For chronic or subacute low back pain, 1 higher-quality trial found lumbar support to be superior to superficial massage for RDQ scores, but not for ODI scores or pain relief (171, 172). There were no differences between lumbar support and spinal manipulation or transcutaneous muscular stimulation. Evidence from 3 lower-quality trials was insufficient to determine efficacy of lumbar supports compared with other interventions (48, 49).

Shortwave Diathermy
We identified no systematic reviews of shortwave diathermy for low back pain. From 14 citations, 3 lower-quality trials met inclusion criteria (Appendix Table 9, available at www.annals.org) (145-147). For acute low back pain, 1 small (24 patients) trial found shortwave diathermy to be inferior to spinal manipulation for pain relief after 2 weeks, but no details about the diathermy intervention were provided (146). For chronic low back pain (145) or low back pain lasting more than 1 week (147), 2 trials found no differences between shortwave diathermy and sham diathermy or spinal manipulation (145) or between shortwave diathermy and sham diathermy, extension exercises, or traction (147).

Superficial Heat
Nine trials of superficial heat or cold were included in a higher-quality Cochrane review (50). For acute low back pain, the Cochrane review found consistent evidence from 3 higher-quality trials that heat wrap therapy or a heated blanket is moderately superior to placebo or a nonheated blanket for short-term pain relief and back-specific functional status. A higher-quality trial (173) also found heat wrap therapy to be moderately superior to oral acetaminophen or ibuprofen for short-term (3 to 4 days' duration) pain relief (differences of 0.66 and 0.93 on a 6-point scale, respectively) and RDQ scores (differences of about 2 points). For acute low back pain, another higher-quality trial (174) found heat wrap therapy superior to an educational booklet, but not exercise, for early pain relief, although benefits were no longer present after 1 week. Adverse events in trials of superficial heat were minor and mainly consisted of mild skin irritation (50).

Traction
Twenty-four unique trials of traction were included in 3 systematic reviews (51-53, 70). For low back pain of varying duration (with or without sciatica), a higher-quality Cochrane review included 2 higher-quality trials (175-177) that found traction no more effective than placebo, sham, or no treatment for any reported outcome (51, 52). For sciatica of mixed duration, autotraction was more effective than placebo, sham, or no treatment in 2 lower-quality trials (178, 179), but continuous or intermittent traction was not effective (8 trials, 1 higher-quality [180]). There was no clear evidence that various types of traction are more effective than other interventions (51, 52). Two other systematic reviews found no evidence traction is effective (70) or insufficient evidence to draw reliable conclusions (53).
Adverse events associated with traction include aggravation of neurologic signs and symptoms and subsequent surgery, but these were inconsistently and poorly reported (harms were not mentioned in 16 of 24 trials) (51, 52).

TENS
Eleven unique trials of TENS were included in a higher-quality Cochrane review of TENS (54) and 5 systematic reviews of other interventions (15, 16, 26, 27, 50-52, 55). For chronic low back pain, the Cochrane review included 1 lower-quality trial that found TENS superior to placebo, but a larger, higher-quality trial (181) found no differences between TENS and sham TENS for any measured outcome (54). A systematic review of acupuncture for low back pain also found no difference in short- or long-term pain relief between TENS and acupuncture in 4 trials (16). One higher-quality trial found TENS superior to superficial massage (160). Evidence from single, lower-quality trials is insufficient to accurately judge efficacy of TENS versus other interventions for chronic low back pain or for acute low back pain. For subacute low back pain, 1 higher-quality trial found TENS moderately inferior to spinal manipulation (171, 172).
The Cochrane review found that one third of patients randomly assigned to either active or sham TENS had minor skin irritation, with 1 patient (in the sham group) discontinuing therapy because of severe dermatitis (54).

Ultrasonography
We identified no systematic reviews of ultrasonography for low back pain. From 265 potentially relevant citations, 3 lower-quality trials met inclusion criteria (Appendix Table 9, available at www.annals.org) (148 -150). For chronic low back pain (148) or low back pain of unspecified duration (150), 2 small (10 and 36 patients, respectively) trials reported inconsistent results for ultrasonography versus sham ultrasonography, with the larger trial reporting no differences. For acute sciatica, a nonrandomized trial (73 patients) found ultrasonography superior to sham ultrasonography or analgesics for pain relief, with patients in all groups also prescribed bed rest (149).

DISCUSSION
This review synthesizes evidence from systematic reviews and randomized, controlled trials of 17 nonpharmacologic therapies for low back pain. Nearly all therapies were evaluated in patients with nonspecific low back pain or in mixed populations of patients with and without sciatica. Main results are summarized in Appendix Table 10 (acute low back pain), Appendix Table 11 (chronic or subacute low back pain), and Appendix Table 12 (back pain with sciatica) (all appendix tables are available at www.annals.org).
We found good evidence that psychological interventions (cognitive-behavioral therapy and progressive relaxation), exercise, interdisciplinary rehabilitation, functional restoration, and spinal manipulation are effective for chronic or subacute (>4 weeks' duration) low back pain. Compared with placebo or sham therapies, these interventions were associated with moderate effects, with differences for pain relief in the range of 10 to 20 points on a 100-point visual analogue pain scale, 2 to 4 points on the RDQ, or a standardized mean difference of 0.5 to 0.8. The exception was exercise therapy, which was associated with small to moderate (10 points on a 100-point visual analogue pain scale) effects on pain. We found fair evidence that acupuncture is more effective than sham acupuncture, and fair evidence that massage is similar in efficacy to other noninvasive interventions for chronic low back pain. We found little evidence of clinically meaningful, consistent differences between most interventions found effective. One exception was intensive interdisciplinary rehabilitation, which was moderately more effective than noninterdisciplinary rehabilitation for improving pain and function. We also found fair evidence that Viniyoga is slightly superior to traditional exercises for functional status and use of analgesic medications.
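As a rough illustration of the magnitude ranges quoted above, the following Python sketch checks whether a reported difference falls inside the "moderate" range for a given scale. The scale keys (`vas_100`, `rdq`, `smd`), the function name, and the choice to treat the range endpoints as inclusive are assumptions made for this example; the text does not define labels for values outside these ranges, so the function reports only whether a value is inside the moderate range.

```python
# Ranges described in the text as "moderate" effects.
# The dictionary keys are hypothetical labels chosen for this sketch.
MODERATE_RANGES = {
    "vas_100": (10.0, 20.0),  # points on a 100-point visual analogue pain scale
    "rdq": (2.0, 4.0),        # points on the RDQ
    "smd": (0.5, 0.8),        # standardized mean difference
}


def in_moderate_range(scale: str, difference: float) -> bool:
    """Return True if the absolute difference lies in the moderate range.

    Endpoints are treated as inclusive, which is an assumption; the sign is
    ignored because direction of benefit varies by outcome measure.
    """
    low, high = MODERATE_RANGES[scale]
    return low <= abs(difference) <= high


# Values reported in this review:
print(in_moderate_range("smd", 0.59))    # cognitive-behavioral therapy, short-term pain
print(in_moderate_range("vas_100", 10))  # exercise vs. no treatment, pain
print(in_moderate_range("smd", 1.16))    # progressive relaxation, described as a large effect
```

The exercise estimate sits exactly on the lower boundary, which matches its description as a small to moderate effect.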
For acute low back pain (<4 weeks' duration), the only nonpharmacologic therapies with evidence of efficacy are superficial heat (good evidence for moderate benefits) and spinal manipulation (fair evidence for small to moderate benefits). Other noninvasive therapies (back schools, interferential therapy, low-level laser therapy, lumbar supports, TENS, traction, and ultrasonography) have not been shown to be effective for chronic, subacute, or acute low back pain.
We found only rare reports of serious adverse events for all of the noninvasive therapies evaluated in this review. However, assessment and reporting of harms were generally suboptimal. For example, less than half of the trials of acupuncture reported adverse events (17). Better reporting of harms is needed for more balanced assessments of interventions (182).
Our evidence synthesis has several potential limitations. First, because of the large number of published trials, our primary source of data was systematic reviews. The reliability of systematic reviews depends on how well they are conducted. We therefore focused on findings from higher-quality systematic reviews, which are less likely than lower-quality systematic reviews to report positive findings (20, 21). In addition, when multiple recent systematic reviews were available for an intervention, we found overall conclusions to be generally consistent. Second, we only included randomized, controlled trials for assessments of efficacy. Although well-conducted randomized, controlled trials are less susceptible to bias than other study designs, nearly all trials were conducted in ideal settings and selected populations, usually with short-term follow-up. "Effectiveness" trials in less highly selected populations could provide additional information on benefits in real-world practice. Third, language bias could affect our results because we included non-English-language trials only if already included in English-language systematic reviews. However, systematic reviews of acupuncture included Asian-language trials (16, 17), and systematic reviews of other interventions with no language restrictions identified few non-English-language studies (55, 183). Fourth, reliable assessments for potential publication bias were not possible for most of the interventions included in this review because of small numbers of trials (184). For the interventions evaluated in the most trials, assessments of potential publication bias varied. Funnel plot asymmetry was present in trials of exercise therapy (36), was not present in trials of spinal manipulation (15) or behavioral therapy (32), and could not be reliably interpreted for trials of acupuncture (16). Finally, we did not include cost-effectiveness analyses. Although many noninvasive interventions for chronic low back pain appear to have similar effects on clinical outcomes, other factors, such as cost or convenience, may vary widely. However, systematic reviews of economic analyses of low back pain interventions have found few full cost-effectiveness analyses and important methodological deficiencies in the available cost studies (185-188).
We also identified several research gaps that limited our ability to reach more definitive conclusions about optimal use of the interventions included in this review. We found no trials on optimal sequencing of interventions, and only limited evidence on methods to guide selection of therapy for individual patients. Although initial studies are promising, decision tools and other methods for individualizing and selecting optimal therapy are still in fairly early stages of development (156). More research on methods for selecting optimal therapy that are practical for use by primary care clinicians is urgently needed. We also found few trials assessing efficacy of adding one noninvasive intervention to another. Although several trials found acupuncture plus another therapy to be more effective than the other therapy alone, other trials found little or no additional benefit from adding exercise therapy (36), behavioral interventions (33), or spinal manipulation (134) to other therapies. Finally, few trials specifically evaluated patients with sciatica (Appendix Table 12, available at www.annals.org) or spinal stenosis. One systematic review of interventions for sciatica identified only 8 trials of therapies included in this review (70). Most trials included in our review enrolled mixed populations of patients with or without sciatica, or did not enroll patients with sciatica. It remains unclear whether optimal nonpharmacologic treatments for sciatica or spinal stenosis differ from those for nonspecific low back pain, although in the case of spinal manipulation, presence or absence of radiating pain did not appear to affect conclusions (55).
In summary, evidence of effective nonpharmacologic therapies for acute low back pain is quite limited. This is not surprising, as the natural history of acute low back pain is for substantial early improvement in most patients (125). On the other hand, several noninvasive therapies seem to be similarly effective for chronic low back pain. Although evidence on effectiveness of therapies specifically for subacute low back pain is sparse (125), many trials enrolled mixed populations of patients with subacute and chronic low back pain. Factors to consider when choosing among noninvasive therapies are patient preferences, cost, convenience, and availability of skilled providers for specific therapies. Clinicians should avoid interventions not proven effective, as many therapies have at least fair evidence of moderate benefits.

Appendix Table 1. Included Interventions

Intervention: Definition
Spinal manipulation: Manual therapy in which loads are applied to the spine using short- or long-lever methods. High-velocity thrusts are applied to a spinal joint beyond its restricted range of movement. Spinal mobilization, or low-velocity, passive movements within or at the limit of joint range, is often used in conjunction with spinal manipulation.
Massage: Soft tissue manipulation using the hands or a mechanical device through a variety of specific methods.
Acupuncture: An intervention consisting of the insertion of needles at specific acupuncture points.
Exercise therapy: A supervised exercise program or formal home exercise regimen, ranging from programs aimed at general physical fitness or aerobic exercise to programs aimed at muscle strengthening, flexibility, or stretching.
Yoga: An intervention distinguished from traditional exercise therapy by the use of specific body positions, breathing techniques, and emphasis on mental focus. Many styles of yoga are practiced, each emphasizing different postures and techniques.
Back schools: An intervention consisting of an education and a skills program, including exercise therapy, in which all lessons are given to groups of patients and supervised by a paramedical therapist or medical specialist. The original Swedish back school was introduced by Zachrisson Forsell in 1969.
Psychological therapies: Includes biofeedback (the use of auditory and visual signals reflecting muscle tension or activity to inhibit or reduce the muscle activity), progressive relaxation (a technique that involves the deliberate tensing and relaxation of muscles to facilitate the recognition and release of muscle tension), and standard cognitive-behavioral and operant therapy.
Interdisciplinary therapy (also called multidisciplinary therapy): An intervention that combines and coordinates physical, vocational, and behavioral components and is provided by multiple health care professionals with different clinical backgrounds. The intensity and content of interdisciplinary therapy varies widely.
Functional restoration (also called physical conditioning, work hardening, or work conditioning): An intervention that involves simulated or actual work tests in a supervised environment in order to enhance job performance skills and improve strength, endurance, flexibility, and cardiovascular fitness in injured workers.
Physical therapies
Interferential therapy: The superficial application of a medium-frequency alternating current modulated to produce low frequencies up to 150 Hz.
Low-level laser therapy: The superficial application of lasers at wavelengths of 632-904 nm. Optimal treatment parameters (wavelength, dosage, dose intensity) are uncertain.
Lumbar supports: A back brace or orthotic device worn to passively support the back.
Shortwave diathermy: Therapeutic elevation of the temperature of deep tissues by application of shortwave electromagnetic radiation with a frequency range of 10-100 MHz.
Superficial heat: The superficial application of heat to the lumbar area.
Traction: An intervention involving drawing or pulling to stretch the lumbar spine. A variety of methods are used and usually involve a harness around the lower rib cage and around the iliac crest, with the pulling action performed by using free weights and a pulley, motorized equipment, inversion techniques, or an overhead harness.
Transcutaneous electrical nerve stimulation: Use of a small battery-operated device to provide continuous electrical impulses via surface electrodes, with the goal of relieving symptoms by modifying pain perception.
Ultrasonography: The therapeutic application of high-frequency sound waves up to 3 MHz.

The purpose of this index is to evaluate the scientific quality (i.e., adherence to scientific principles) of research overviews (review articles) published in the medical literature. It is not intended to measure literary quality, importance, relevance, originality, or other attributes of overviews.

Was the search comprehensive?
Was the search for evidence reasonably comprehensive? "Yes" if the review searches at least 2 databases and looks at other sources (e.g., reference lists, hand searches, queries of experts).

Were the inclusion criteria reported?
Were the criteria used for deciding which studies to include in the overview reported?

Was selection bias avoided?
Was bias in the selection of studies avoided? "Yes" if the review reports how many studies were identified by searches, numbers excluded, and appropriate reasons for excluding them (usually because of predefined inclusion/exclusion criteria).
The index is for assessing overviews of primary ("original") research on pragmatic questions regarding causation, diagnosis, prognosis, therapy, or prevention. A research overview is a survey of research. The same principles that apply to epidemiologic surveys apply to overviews: A question must be clearly specified; a target population identified and accessed; appropriate information obtained from that population in an unbiased fashion; and conclusions derived, sometimes with the help of formal statistical analysis, as is done in meta-analyses. The fundamental difference between overviews and epidemiologic studies is the unit of analysis, not the scientific issues that the questions in this index address.

Were the validity criteria reported?
Were the criteria used for assessing the validity of the included studies reported?

6. Was validity assessed appropriately?
Was the validity of all the studies referred to in the text assessed by using appropriate criteria (either in selecting studies for inclusion or in analyzing the studies that are cited)? "Yes" if the review reports validity assessment and did some type of analysis with it (e.g., sensitivity analysis of results according to quality ratings, excluded low-quality studies).
Because most published overviews do not include a methods section, it is difficult to answer some of the questions in the index. Base your answers, as much as possible, on information provided in the overview. If the methods that were used are reported incompletely relative to a specific question, score it as "can't tell," unless there is information in the overview to suggest that the criterion was or was not met.

Were the methods used to combine studies reported?
Were the methods used to combine the findings of the relevant studies (to reach a conclusion) reported? "Yes" for studies that did qualitative analysis if report mentions that quantitative analysis was not possible and reasons that it could not be done, or if "best evidence" or some other grading of evidence scheme used.

8. Were the findings combined appropriately?
Were the findings of the relevant studies combined appropriately relative to the primary question the overview addresses? "Yes" if the review performs a test for heterogeneity before pooling or does appropriate subgroup testing, appropriate sensitivity analysis, or other such analysis.
For question 8, if no attempt has been made to combine findings, and no statement is made regarding the inappropriateness of combining findings, check "No." If a summary (general) estimate is given anywhere in the abstract, the discussion, or the summary section of the paper, and it is not reported how that estimate was derived, mark "No" even if there is a statement regarding the limitations of combining the findings of the studies reviewed. If in doubt, mark "Can't tell."

9. Were the conclusions supported by the reported data?
Were the conclusions made by the author(s) supported by the data and/or analysis reported in the overview?
For an overview to be scored as "Yes" in question 9, data (not just citations) must be reported that support the main conclusions regarding the primary question(s) that the overview addresses.

10. What was the overall scientific quality of the overview?
How would you rate the scientific quality of this overview?
The score for question 10, the overall scientific quality, should be based on your answers to the first 9 questions. The following guidelines can be used to assist with deriving a summary score: If the "Can't tell" option is used 1 or more times on the preceding questions, a review is likely to have minor flaws at best and it is difficult to rule out major flaws (i.e., a score ≤4). If the "No" option is used on question 2, 4, 6, or 8, the review is likely to have major flaws (i.e., a score ≤3, depending on the number and degree of the flaws).

The number of participants who are included in the study but did not complete the observation period or were not included in the analysis must be described and reasons given. If the percentage of withdrawals and dropouts does not exceed 15% and does not lead to substantial bias, a "yes" is scored.

Yes/No/Don't Know
J. Was the timing of the outcome assessment in all groups similar?
Timing of outcome assessment should be identical for all intervention groups and for all important outcome assessments.

Yes/No/Don't Know
K. Did the analysis include an intention-to-treat analysis? "Yes" if <5% of randomly assigned patients excluded. All randomly assigned patients are reported/analyzed in the group they were allocated to by randomization for the most important moments of effect measurement (minus missing values), irrespective of nonadherence and co-interventions.

Yes/No/Don't Know
* This list includes only the 11 internal validity criteria that refer to characteristics of the study that might be related to selection bias (criteria A and B), performance bias (criteria D, E, G, and H), attrition bias (criteria I and K), and detection bias (criteria F and J). The internal validity criteria should be used to define methodological quality in the meta-analysis.
† Adapted from methods developed by the Cochrane Back Review Group (24).
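
The numeric rules stated in these two appendix tables can be written out as a short illustration. The following Python sketch is not part of the original rating instruments; the function names, argument names, and answer strings are hypothetical and were chosen only for this example. It encodes three rules given in the text: the ceiling on the question 10 score for systematic reviews (a score of 4 or less if any "Can't tell" answer is given, 3 or less if "No" is given on question 2, 4, 6, or 8), the 15% limit on withdrawals and dropouts, and the 5% limit on excluded patients for the intention-to-treat criterion. The text says the final score also depends on the number and degree of flaws, so the first function returns only an upper bound, not a score.

```python
from typing import Dict, Optional

# Questions for which a "No" answer suggests major flaws (from the guideline text).
MAJOR_FLAW_QUESTIONS = (2, 4, 6, 8)


def overview_score_ceiling(answers: Dict[int, str]) -> Optional[int]:
    """Return the highest question-10 score consistent with the guidelines.

    `answers` maps question numbers (1-9) to "Yes", "No", or "Can't tell".
    Returns None when the guidelines impose no ceiling.
    """
    normalized = {q: a.strip().lower() for q, a in answers.items()}
    if any(normalized.get(q) == "no" for q in MAJOR_FLAW_QUESTIONS):
        return 3  # likely major flaws
    if any(a == "can't tell" for a in normalized.values()):
        return 4  # minor flaws at best; major flaws hard to rule out
    return None


def dropout_criterion_met(randomized: int, dropouts: int,
                          reasons_described: bool = True,
                          substantial_bias: bool = False) -> bool:
    """Withdrawals and dropouts must not exceed 15% of randomized patients.

    The two boolean flags stand in for reviewer judgments that the text
    requires but that cannot be computed from counts alone.
    """
    within_limit = dropouts * 100 <= 15 * randomized  # integer arithmetic
    return within_limit and reasons_described and not substantial_bias


def itt_criterion_met(randomized: int, excluded: int) -> bool:
    """Fewer than 5% of randomly assigned patients excluded from analysis."""
    return excluded * 100 < 5 * randomized
```

For example, a review with a "No" on question 2 and a "Can't tell" on question 5 would receive a ceiling of 3 under this sketch, because the stricter of the two rules applies.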