Statistical Methods

Methods for Comparison of Cost Data

Xiao-Hua Zhou, PhD; Catherine A. Melfi, PhD; and Siu L. Hui, PhD

From the Regenstrief Institute for Health Care, Indiana University Medical Center, and Indiana University School of Medicine, Indianapolis, Indiana; and Eli Lilly and Company, Indianapolis, Indiana.

Note: This article is one of a series of articles comprising an Annals of Internal Medicine supplement entitled “Measuring Quality, Outcomes, and Cost of Care Using Large Databases: The Sixth Regenstrief Conference.” To see a complete list of the articles included in this supplement, please view its Table of Contents.

Acknowledgments: The authors thank Drs. Chris Callahan, William Tierney, Mick Murray, Paul Dexter, Marc Overhage, and Morris Weinberger and Mr. David Gillette for helpful comments.

Requests for Reprints: X.H. Zhou, Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, 699 West Drive, RR 135, Indianapolis, IN 46202-5119.

Current Author Addresses: Drs. Zhou and Hui: Regenstrief Institute for Health Care, Indiana University Medical Center, and Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Riley Research Wing, RR 135, Indianapolis, IN 46202-5200. Dr. Melfi: Division of Health Services and Policy Research, Lilly Corporate Center, Drop Code 1850, Indianapolis, IN 46285.

Copyright ©2004 by the American College of Physicians

Ann Intern Med. 1997;127(8_Part_2):752-756. doi:10.7326/0003-4819-127-8_Part_2-199710151-00063

Background: Researchers are increasingly interested in examining costs of care, and large administrative and clinical databases have made relevant data readily available. Because a few patients incur high costs relative to most patients, the distribution of cost data is often skewed. How robust are the usual methods of cost analysis against the skewed distribution of cost data?

Objective: To determine the methods commonly used for comparing cost data, describe their limitations, and provide an alternate method of analysis.

Design: Review of statistical methods used in studies of medical costs published in medical journals between January 1991 and January 1996; description of a Z-score method appropriate for testing the equality of mean costs between two log-normal samples; and reanalysis of published two-sample comparison results by using the Z-score method.

Results: For two-sample comparisons, three methods were commonly used: the Student t-test on untransformed costs, the Wilcoxon test on untransformed costs, and the Student t-test on log-transformed costs. The t-test on untransformed costs ignores the skewness in cost data, the Wilcoxon test ignores unequal variances, and the t-test on log-transformed costs tests the wrong null hypothesis unless variances in the log-scale are equal.
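The point about the wrong null hypothesis can be made explicit with a short derivation. The notation below is an assumption introduced for illustration (it does not appear in the abstract): costs in group i are taken to be log-normal, with log-scale mean \mu_i and log-scale variance \sigma_i^2.

```latex
% Assumed model: \ln X_i \sim N(\mu_i, \sigma_i^2), i = 1, 2.
% Mean cost on the original scale:
\[
  E[X_i] = \exp\!\left(\mu_i + \tfrac{1}{2}\sigma_i^2\right)
\]
% Equality of mean costs is therefore the hypothesis
\[
  H_0:\ \mu_1 + \tfrac{1}{2}\sigma_1^2 = \mu_2 + \tfrac{1}{2}\sigma_2^2 ,
\]
% whereas the t-test on log-transformed costs tests
\[
  H_0^{*}:\ \mu_1 = \mu_2 .
\]
% The two hypotheses coincide only when \sigma_1^2 = \sigma_2^2.
```

Under this model, two groups can have equal log-scale means yet different mean costs whenever their log-scale variances differ, which is the limitation noted above.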

Eleven articles included two-sample tests and had enough information to allow reanalysis of the data using the Z-score method. These articles described a total of 23 Wilcoxon tests and 24 t-tests on untransformed costs. Most results did not change on reanalysis, but six results changed enough to alter conclusions. Among the Wilcoxon tests, one statistically significant result became nonsignificant on reanalysis and two nonsignificant results became statistically significant. Among the t-tests on untransformed costs, two statistically significant results became nonsignificant on reanalysis and one nonsignificant result became statistically significant.

Conclusions: The methods commonly used to compare costs of two groups have limitations. Some limitations may change some conclusions, and the direction of the change cannot be predicted. The Z-score method is designed to adjust for skewness in cost data and is appropriate for comparing means of log-normally distributed cost data.
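The abstract does not give the formula for the Z-score, so the following is only a minimal sketch of one large-sample form of such a test, assuming both samples are log-normal and all costs are strictly positive. The function name `lognormal_mean_z` and the variable names are hypothetical, and the statistic shown may differ in detail from the one the authors used.

```python
import math
from statistics import NormalDist, mean, variance


def lognormal_mean_z(costs_a, costs_b):
    """Z-score and two-sided p-value for H0: equal mean costs.

    Assumes each sample is log-normal, so the log of the mean cost is
    mu + sigma^2 / 2 on the log scale. Requires at least two strictly
    positive observations per group and nonzero log-scale variance.
    """
    logs_a = [math.log(c) for c in costs_a]
    logs_b = [math.log(c) for c in costs_b]
    n_a, n_b = len(logs_a), len(logs_b)

    # Log-scale sample means and (n - 1)-denominator sample variances.
    m_a, m_b = mean(logs_a), mean(logs_b)
    v_a, v_b = variance(logs_a), variance(logs_b)

    # Estimated difference in log mean cost (group b minus group a).
    diff = (m_b + v_b / 2) - (m_a + v_a / 2)

    # Var(mean) = sigma^2 / n and Var(s^2 / 2) = sigma^4 / (2 (n - 1));
    # the two pieces are independent under normality of the logs.
    se = math.sqrt(
        v_a / n_a + v_b / n_b
        + 0.5 * (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    )

    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

A call such as `lognormal_mean_z(group_1_costs, group_2_costs)` returns the statistic and a normal-approximation p-value. Unlike a t-test on log-transformed costs, the numerator includes the difference in log-scale variances, so the test targets the means on the original cost scale.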


Figure 1.
Summary of statistical methods used in published articles.

The numbers for each type of analysis total more than the number of articles because several articles included more than one analysis. ln = natural logarithm.
