Kanaka D. Shetty, MD, MS; Jayanta Bhattacharya, MD, PhD
Disclaimer: The views expressed herein are those of the authors and do not necessarily reflect the views of the Department of Veterans Affairs, National Bureau of Economic Research, or the Center for Health Policy and Center for Primary Care and Outcomes Research.
Acknowledgments: The authors thank Drs. Alan Garber, Douglas Owens, Mark Hlatky, and Priya Pillutla for helpful comments on earlier versions of this manuscript.
Grant Support: Dr. Shetty was supported by a Department of Veterans Affairs Fellowship in Ambulatory Care Practice and Research. Dr. Bhattacharya was supported by the National Institute on Aging.
Potential Financial Conflicts of Interest: None disclosed.
Requests for Single Reprints: Kanaka D. Shetty, MD, MS, Veterans Affairs Palo Alto Health Care System, 3801 Miranda Avenue, Palo Alto, CA 94304.
Current Author Addresses: Drs. Shetty and Bhattacharya: Veterans Affairs Palo Alto Health Care System, 3801 Miranda Avenue, Palo Alto, CA 94304.
Author Contributions: Conception and design: K. Shetty, J. Bhattacharya.
Analysis and interpretation of the data: K. Shetty, J. Bhattacharya.
Drafting of the article: K. Shetty.
Critical revision of the article for important intellectual content: K. Shetty, J. Bhattacharya.
Final approval of the article: K. Shetty, J. Bhattacharya.
Provision of study materials or patients: K. Shetty.
Statistical expertise: K. Shetty, J. Bhattacharya.
Obtaining of funding: J. Bhattacharya.
Collection and assembly of data: K. Shetty.
Shetty KD, Bhattacharya J. Changes in Hospital Mortality Associated with Residency Work-Hour Regulations. Ann Intern Med. 2007;147:73-80. doi: 10.7326/0003-4819-147-2-200707170-00161
The health effects of regulations restricting housestaff work hours to 80 hours per week are largely unknown.
The authors measured inpatient mortality of 1 268 738 patients admitted to 551 hospitals with medical diagnoses and 243 207 patients admitted for surgical diagnoses before and after 2003, when the work-hour regulation took effect. After 2003, the mortality rate for medical patients—but not surgical patients—decreased more in teaching hospitals than nonteaching hospitals.
The authors classified a hospital as “teaching” if it had a residency program; however, some teaching hospitals have nonteaching patients.
Limiting residents' work hours was associated with lower in-hospital mortality rates for medical patients in teaching hospitals.
For more than 20 years, the Accreditation Council for Graduate Medical Education (ACGME) and other observers have been concerned about excessive resident duty hours (1–9). The ACGME imposed national restrictions on duty hours in 1987, New York State enacted hours restrictions in 1989, and the ACGME's Residency Review Committee for Internal Medicine adopted duty-hour limits in the 1990s. Despite these regulations, enforcement before 2003 was not stringent and excessive resident work hours were common. In part to avoid federal legislation, the ACGME approved new resident duty-hour regulations that became effective on 1 July 2003 (1). The regulations limited resident workweeks to 80 hours or fewer and limited continuous duty to 24 hours, with 6 additional hours for transfer of care.
The effect of these regulations is uncertain. An hours cap might reduce errors and mortality due to resident fatigue; alternatively, it might increase errors by disrupting the continuity of patient care. Scholarly work on this question has been limited, and the conclusions of existing studies are mixed. One study found that New York State's regulations did not affect mortality in patients who were admitted for pneumonia, congestive heart failure, or acute myocardial infarction (10). However, duty-hour violations were still common during the study period, making the results less relevant to current regulations. Another study indicated that interns working under a duty-hours cap made fewer errors than did a similar group under a traditional call system (6). This finding suggested that duty-hour caps improve patient outcomes, but the sample size was too small to show effects on mortality.
The regulations may not have altered outcomes if residency programs did not change working conditions. However, the official ACGME report and multiple program director surveys documented major changes in work hours in residency programs after July 2003 (2, 11–16). In addition, an independent survey reported that interns' average weekly work hours decreased from 70.7 to 66.6 hours despite widespread violations (5).
We retrospectively analyzed discharge data collected between 2001 and 2004 to determine whether the ACGME regulations were associated with changes in inpatient mortality. Unlike previous work, our study has the statistical power to detect small effects because we used a large, nationally representative data set with patients from nonteaching hospitals as a control group.
We used the Nationwide Inpatient Sample (NIS) from the Healthcare Cost and Utilization Project to assemble a nationally representative data set of hospital patients between 2001 and 2004. The NIS sampled approximately 20% of all community hospitals in the United States and included at least 7.4 million discharges annually since 2001. The NIS provided clinical and demographic information on each discharged patient, including age, sex, 10 diagnoses (1 principal and 9 secondary), principal procedure, month of admission, and income quartile of the patient's ZIP code. The NIS coded procedures and diagnoses according to the International Classification of Diseases, Ninth Revision. The study was exempt from approval by the Stanford University institutional review board because all patient-level NIS data were publicly available and deidentified. We used data from the American Medical Association to assign teaching status and calculate the number of residents for each hospital (4, 17–21).
Before beginning our analysis, we selected principal diagnoses and procedures that represented a broad variety of medical and surgical diagnoses, were associated with high mortality rates, and were common in both teaching and nonteaching hospitals. Using these criteria, we selected 20 medical diagnoses and 15 surgical diagnoses. Patients with these diagnoses made up 34% of the total number of deaths in the sample and 13% of the patients (Table 1). We classified patients as surgical if their principal procedure code represented a major surgery and as medical if their principal diagnosis corresponded to a major internal medicine diagnosis. In addition, patients who were admitted for medical diagnoses but later had major surgeries during the same hospitalization were classified as surgical patients.
We excluded populations with low mortality rates (including all pediatric and obstetric patients) because we lacked the statistical power to detect a significant effect in these groups. We also excluded patients transferred from other hospitals or correctional facilities. We assigned teaching status to hospitals by linking American Medical Association data on residency programs to NIS data on hospitals by using American Hospital Association identification numbers. We therefore excluded patients from states that did not permit the release of hospital identifiers (roughly 35% of the NIS). To create similar pre- and postregulation populations, we limited our final sample to patients from hospitals that admitted at least 100 patients both before (January 2001–June 2003) and after (July 2003–December 2004) the regulations were implemented.
Teaching status is a critical variable in our analysis. The NIS classified patients as “teaching” if the admitting hospital had any type of educational program. For example, the NIS teaching variable would have erroneously classified surgical patients at teaching hospitals as teaching, regardless of whether their primary providers were residents (under the supervision of attending physicians) or attending physicians (with little or no resident involvement). To reduce this error, we first classified all patients in each hospital as internal medicine, general surgery, urology, or orthopedics patients on the basis of their principal diagnosis. We then classified patients as teaching if the hospital had a corresponding internal medicine, general surgery, urology, or orthopedics residency program (based on American Medical Association records). In addition, in hospitals with family practice programs, we classified internal medicine patients as teaching.
For example, if a hospital had an internal medicine residency program, we analyzed all patients admitted for medical diagnoses as if they were admitted to a teaching hospital. Conversely, if the same hospital lacked a surgical residency program, we analyzed all patients admitted for surgical diagnoses as if they were admitted to a nonteaching hospital. In our model, hospitals could admit a mix of teaching and nonteaching patients depending on the hospital's mix of sponsored residency programs. However, all patients with the same diagnosis in one particular hospital would be analyzed as teaching or nonteaching unless the hospital gained (or lost) a residency program from one year to the next. With this procedure, misclassification of patients is still possible if a hospital has parallel teaching and nonteaching services. We later discuss the effect of misclassification on the interpretation of our results.
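The classification rule described above can be sketched as a small function. This is an illustrative reconstruction, not the study's actual code; the service labels and program names are assumptions chosen to mirror the text.

```python
def classify_teaching(service, hospital_programs):
    """Return True if the hospital sponsors a residency program matching
    the patient's service. Per the rule above, internal medicine patients
    also count as teaching when the hospital has a family practice program."""
    if service == "internal medicine":
        return ("internal medicine" in hospital_programs
                or "family practice" in hospital_programs)
    return service in hospital_programs

# A hospital with medicine but no surgical training programs:
programs = {"internal medicine", "family practice"}
classify_teaching("internal medicine", programs)  # True
classify_teaching("general surgery", programs)    # False
```

Under this rule, all patients sharing a service at a given hospital receive the same teaching status, which is why misclassification is possible when a hospital runs parallel teaching and nonteaching services.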
We used a multivariate logistic regression model to estimate the changes in mortality associated with the regulations. Changes in nonteaching hospitals after July 2003 were unrelated to the regulations and therefore reflected underlying trends. Thus, changes that occurred in nonteaching hospitals act as a control for the changes associated with regulations in teaching hospitals. To derive our estimate, we subtracted the mortality trend in the control group (patients from nonteaching hospitals) from the mortality trend in patients from teaching hospitals. This method has been previously described in the economic and medical literature as a “difference-in-differences” approach (22–24). We accounted for case-mix differences by adjusting each of our regression estimates for admitting diagnosis and all previous medical conditions included in the Charlson index except uncomplicated diabetes and peptic ulcer disease (25, 26). We included age in the regression along with indicator variables for sex, race, insurance status, year and month of admission, income quartile of the patient's ZIP code, and emergency admission.
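The difference-in-differences logic can be illustrated with a minimal sketch. The mortality rates below are hypothetical, chosen only to reproduce the scale of the estimate reported later; the actual study estimated the effect within an adjusted logistic regression, not from raw rates.

```python
# Hypothetical pre/post mortality rates (NOT the study's actual figures).
rates = {
    ("teaching", "pre"): 0.0700,
    ("teaching", "post"): 0.0660,
    ("nonteaching", "pre"): 0.0680,
    ("nonteaching", "post"): 0.0665,
}

def diff_in_diff(rates):
    """Subtract the control group's trend from the treated group's trend."""
    teaching_change = rates[("teaching", "post")] - rates[("teaching", "pre")]
    control_change = rates[("nonteaching", "post")] - rates[("nonteaching", "pre")]
    return teaching_change - control_change

effect = diff_in_diff(rates)  # ≈ -0.0025, i.e., a 0.25-percentage-point relative decline
```

The subtraction removes any mortality trend common to both hospital types, leaving only the change specific to teaching hospitals after July 2003.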
Hospitals also have very different patient populations and staffs, which cannot be observed by using available data. We tested several models to address these differences. We tried a generalized linear mixed model, but a Hausman specification test revealed that the basic assumptions underlying this model were not met (27). We therefore used a logistic regression model that treated each hospital's unobserved effect as a distinct intercept. Estimation of several hundred variables (one for each hospital) can lead to inconsistent estimates, but we reduced this problem by including only hospitals with more than 200 study-eligible patients (28). In addition, we compared our results to those from a conditional logistic regression model (which does not require the estimation of multiple intercepts). We found similar odds ratios in both, but we elected against using a conditional logistic regression model because the results cannot be expressed as changes in probabilities.
Using the model estimates, we calculated the change in probability of death in patients from teaching hospitals and in patients from nonteaching hospitals. We estimated the change in mortality rate associated with the regulations by subtracting the change in probability of death for patients from nonteaching hospitals (the control group) from that in patients admitted to teaching hospitals (22, 29, 30). We also calculated the relative risk reduction by dividing the change in mortality by the pre–July 2003 death rate for teaching hospitals (22, 27).
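The relative risk calculation reduces to simple arithmetic. The baseline rate below is back-calculated from the reported 0.25% absolute and 3.75% relative changes and is an assumption for illustration only.

```python
# Hypothetical baseline implied by the reported figures:
# 0.25 / 3.75 ≈ 6.67% pre-regulation mortality in teaching hospitals.
pre_regulation_rate = 0.0667   # assumed pre-July 2003 death rate (illustrative)
absolute_change = -0.0025      # regulation-associated change in probability of death

relative_risk_reduction = absolute_change / pre_regulation_rate
print(round(relative_risk_reduction * 100, 2))  # prints -3.75
```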
Before starting our analysis, we chose to perform 2 separate primary analyses, one for medical patients and one for surgical patients, because we anticipated that the regulations would affect each group differently owing to dissimilarities in previous resident working conditions. We chose sensitivity analyses and subgroup analyses based on our primary results. We accounted for NIS sampling weights in all analyses, used 2-sided P values in hypothesis testing, and adjusted the variance for clustering of patients at the hospital level. We used Stata 9.2 (StataCorp, College Station, Texas) for all analyses.
As a sensitivity test, we used the Hochberg method to adjust for increased type I error associated with multiple comparisons within families of independent hypotheses (31). For example, in 1 family of subgroup analyses, we divided patients into 3 different age groups and analyzed the regulations separately for each group. There are 3 independent hypotheses in this family, so we used the Hochberg method to adjust the P values to account for this. We adjusted P values in similar ways for other families of hypotheses.
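As a sketch of how the adjustment works (a standard Hochberg step-up computation, not the authors' actual code): working from the largest of the m p-values downward, each adjusted p-value is the running minimum of (m − rank + 1) × p, capped at 1.

```python
def hochberg_adjust(pvals):
    """Hochberg step-up adjusted p-values for a family of m independent
    hypotheses. The largest p-value is multiplied by 1, the next largest
    by 2, and so on; each adjusted value is the running minimum so the
    result is monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):     # largest p-value first
        multiplier = m - rank             # 1 for the largest, m for the smallest
        running_min = min(running_min, multiplier * pvals[order[rank]])
        adjusted[order[rank]] = running_min
    return adjusted

# Three hypotheses, as in the three-age-group family above (p-values made up):
hochberg_adjust([0.01, 0.02, 0.04])  # ≈ [0.03, 0.04, 0.04]
```

A hypothesis is rejected at level α when its adjusted p-value falls below α, which matches the step-up rule of rejecting all hypotheses up to the largest i with p(i) ≤ α/(m − i + 1).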
We tested several key assumptions underlying the regression model. First, recall that we did not directly observe whether a patient's primary provider was a resident or an attending physician. Instead, we imputed teaching status on the basis of the presence of a residency program (internal medicine, general surgery, urology, or orthopedics) at the hospital. To check the effect of this misclassification on our results, we separately analyzed patients from hospitals with fewer than 15 medical residents, between 15 and 40 medical residents, and more than 40 medical residents. Patients at hospitals with large residency programs are more likely to be cared for by residents. Thus, if misclassification is important, we should expect to find substantial differences in our estimates among hospitals with residency programs of different sizes.
Second, our approach relies on the assumption that nonteaching patients serve as an adequate control for mortality trends in the hospitalized population that are unrelated to the regulations. To test this assumption, we assumed (counterfactually) that the regulatory change took place at various times before July 2003. If our assumption is correct, we should find no association between these counterfactual regulatory changes and outcomes.
The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation, review, or approval of the manuscript.
Tables 1 and 2 show the distribution of admitting diagnoses and key clinical characteristics. The relatively high mean mortality rates in both groups reflect the high risk associated with the included diagnoses. Table 3 shows hospital characteristics, including the number of hospitals with residency programs.
We found that the imposition of the hours cap was associated with a 0.25% decrease in the absolute risk for death per hospitalization among medical patients (P = 0.043), corresponding to a 3.75% decrease in relative risk (Table 4 and Figure 1, top).
Figure 1. Top: medical patients. Middle: medical patient subgroups. Bottom: medical patients, previous years. Error bars represent 95% CIs.
The regulations correlated with decreased absolute mortality (change, −0.71%; P = 0.005) and relative risk (change, −6.97%) in medical teaching patients older than 80 years of age (Figure 1, top). We did not find statistically significant changes in patients 65 to 80 years of age or in patients younger than 65 years of age. We also found declines in several disparate populations, including patients with congestive heart failure, sepsis, pneumonia, and gastrointestinal bleeding. In addition, the regulations were associated with a 0.66% decline in mortality among patients admitted with infectious diseases (relative risk reduction, 7.94%; P = 0.007) (Table 4 and Figure 1, middle).
We did not find statistically significant changes in surgical teaching patients (change in absolute mortality, 0.13%; relative change, 3.77%; P = 0.54) (Table 5 and Figure 2). We did not observe a statistically significant change in mortality among older surgical patients (change, 0.74%; P = 0.16). We found no statistically significant mortality changes when comparing surgical patients from hospitals with larger residency programs with those from hospitals without residency programs (change, 0.05%; P = 0.66) or with patients from hospitals with nonsurgical teaching programs (change, 0.32%; P = 0.23).
Figure 2. Error bars represent 95% CIs.
We used the Hochberg method to adjust for multiple comparisons within independent families of results. In the first family (medical and surgical patients), we rejected the null hypothesis of no change for medical patients, but only at the 10% level. Even after the Hochberg adjustment, we rejected the null hypothesis of no effect for congestive heart failure, infectious diseases, and age greater than 80 years at the 5% level. For all surgical subgroups, after the Hochberg adjustment we failed to reject the null hypothesis of no association between the regulatory change and mortality.
To check for the effect of misclassifying patients' teaching status, we compared mortality changes in patients from teaching hospitals of varying program sizes (<15, 15–40, and >40 medical residents) with changes among patients from nonteaching hospitals (Table 4 and Figure 1, top). The point estimates suggested larger decreases in mortality in hospitals with more residents (and presumably fewer nonteaching patients misclassified as teaching patients), although none of these differences were statistically significant (with or without the Hochberg adjustment).
For our falsification tests, we counterfactually assumed that the regulatory change took place earlier than 2003 (Table 4 and Figure 1, bottom). There was a statistically significant decline in mortality between 1998 and 1999 (change, −0.42%; P = 0.034), but this decline was not significant even at the 10% level after the Hochberg adjustment. We did not observe statistically significant changes in other year-to-year comparisons or when we examined trends for the 4-year period encompassing 1998–1999 versus 2000–2001 (change, 0.08%; P = 0.67), with or without the Hochberg adjustment.
Despite concerns that the ACGME restrictions on work hours might have worsened outcomes, we found that the regulations correlated with a 0.25% decrease in mortality and a relative risk reduction of 3.75% for internal medicine patients at teaching hospitals. We initially hypothesized that mortality for medical and surgical patients would change after the regulations, and the data support this hypothesis for medical patients. For surgical patients, we conclude that the regulations were not associated with improved outcomes. We also examined postregulation mortality changes for various patient subgroups. After adjustment for multiple comparisons, the results were significant at the 5% level for patients admitted for congestive heart failure or infectious diseases and for patients older than 80 years of age. We regard the subgroup analyses as preliminary because these hypotheses were post hoc.
We generated these estimates with multivariate logistic regression in which we used changes in outcomes for nonteaching patients as a control for the changes induced by the regulations in patients from teaching hospitals. Various sensitivity analyses supported this approach and suggested stable relative mortality trends between patients from teaching and nonteaching hospitals in the 4 years (1998–2001) before the regulatory period.
It is striking that the regulations were associated with improved outcomes in medical patients but not in surgical patients. There are several possible explanations. First, the smaller sample size for surgical patients may have limited our power to detect statistically significant differences. Second, surgical residency programs may not have altered their working conditions substantially. Third, if the number of surgical residents remained fixed and each resident worked less, the average number of available providers would decline. Fourth, errors due to fatigue may have been counterbalanced by problems with transfers of care.
Our results differ from those of previous studies, which found no conclusive evidence that work-hour restrictions improved outcomes in multiple settings (32), including obstetric, gynecologic, and perinatal care (33) and cardiac surgery (34). Previous studies of the New York State experience also found no effect on mortality (10) or various patient safety indicators (35). However, these studies were all conducted in smaller populations (often single academic centers), limiting both their statistical power and their relevance to national changes. In addition, whether enforcement was strict enough to have permitted statistically significant effects to be seen is unclear.
Our study has several limitations. First, we assigned teaching status on the basis of hospital characteristics because the data did not include information on individual providers. However, misclassification of nonteaching patients in teaching hospitals would blur the distinction between teaching and nonteaching patients, leading to a conservative bias toward the null hypothesis. Furthermore, we included only patients with serious and complex diagnoses in the study. In teaching hospitals, residents usually participate in the care of such patients. Hence, there were probably relatively few misclassified patients. We also observed larger improvements in mortality in hospitals with more internal medicine residents, where we would expect to find the least misclassification bias.
Second, our estimates do not represent the mortality effect of changes in exposure to resident care alone, but rather the effect of all the changes induced by work-hour regulations. For example, the regulations induced many hospitals to shift care to attending physicians (including academic hospitalists) and probably increased the number of nonteaching patients in teaching hospitals.
Third, it is always possible in an observational study that unobserved characteristics could have biased the results. However, the use of a control group (patients in nonteaching hospitals), the sensitivity analyses, the short period under consideration, and the greater improvements seen in hospitals with more internal medicine residents all argue against this bias as an explanation for our results.
Fourth, we focused on the role of medical and surgical residents. Residents from other specialties may have played some role in the improved mortality among medical patients, but these physicians rarely have primary responsibility for medical inpatients.
Fifth, we considered only a limited set of high-risk groups, which makes it possible that mortality trends were different in other groups. We attempted to reduce this possibility by including a broad set of diagnoses, but further studies on low-risk populations (including pediatric and obstetric patients) and on larger samples of surgical patients are warranted.
Finally, data limitations permit us to say little about whether the regulations affected survival beyond discharge or how hospitals adjusted to the sudden decrease in available resident work hours.
We conclude by emphasizing findings that are worth exploring further. Our subgroup analyses suggested that the greatest associated mortality improvements were in patients older than 80 years of age and in patients admitted for infectious diseases. Future dedicated analyses of both results are justified. The regulations may have improved outcomes by shifting care from inexperienced residents to more experienced providers (such as academic hospitalists). If so, the regulations' long-term effect could be deleterious if physicians do not obtain sufficient experience and skills during training and make more errors after residency. Further study is needed to determine the regulations' long-term effect on both graduate medical education and patient care.