Academia and the Profession

Assessing the Value of Risk Predictions by Using Risk Stratification Tables

Holly Janes, PhD; Margaret S. Pepe, PhD; and Wen Gu, MS
Article, Author, and Disclosure Information

From the Fred Hutchinson Cancer Research Center and the University of Washington, Seattle, Washington.

Acknowledgment: The authors thank Patrick Bossuyt for very helpful discussions of the material.

Potential Financial Conflicts of Interest: None disclosed.

Requests for Single Reprints: Holly Janes, PhD, Fred Hutchinson Cancer Research Center; 1100 Fairview Avenue North M2-C200, Seattle, WA 98109; e-mail, hjanes@scharp.org.

Current Author Addresses: Drs. Janes and Pepe: Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109.

Dr. Gu: Department of Biostatistics, University of Washington, Box 357232, Seattle, WA 98195.

Ann Intern Med. 2008;149(10):751-760. doi:10.7326/0003-4819-149-10-200811180-00009

The recent epidemiologic and clinical literature is filled with studies evaluating statistical models for predicting disease or some other adverse event. Risk stratification tables are a new way to evaluate the benefit of adding a new risk marker to a risk prediction model that includes an established set of markers. This approach involves cross-tabulating risk predictions from models with and without the new marker. In this article, the authors use examples to show how risk stratification tables can be used to compare 3 important measures of model performance between the models with and those without the new marker: the extent to which the risks calculated from the models reflect the actual fraction of persons in the population with events (calibration); the proportions in which the population is stratified into clinically relevant risk categories (stratification capacity); and the extent to which participants with events are assigned to high-risk categories and those without events are assigned to low-risk categories (classification accuracy). They detail common misinterpretations and misuses of the risk stratification method and conclude that the information that can be extracted from risk stratification tables is an enormous improvement over commonly reported measures of risk prediction model performance (for example, c-statistics and Hosmer-Lemeshow tests) because it describes the value of the models for guiding medical decisions.
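As an illustrative sketch (not code from the article), the cross-tabulation the abstract describes can be built by assigning each person's predicted risk from two models to clinically motivated risk categories and counting the cells; the simulated risks, category thresholds, and sample size below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predicted risks from a baseline model and an expanded
# model that adds a new marker (illustrative data, not from the article).
n = 1000
baseline_risk = rng.uniform(0, 1, n)
expanded_risk = np.clip(baseline_risk + rng.normal(0, 0.1, n), 0, 1)

# Example risk categories; the thresholds here are arbitrary.
cuts = [0.1, 0.2]              # <10%, 10-20%, >20%
labels = ["<10%", "10-20%", ">20%"]

base_cat = np.digitize(baseline_risk, cuts)
exp_cat = np.digitize(expanded_risk, cuts)

# Risk stratification table: rows = baseline model, columns = expanded model.
table = np.zeros((3, 3), dtype=int)
for b, e in zip(base_cat, exp_cat):
    table[b, e] += 1

for lab, row in zip(labels, table):
    print(lab, row)

# The row and column margins give each model's risk distribution,
# which is what the authors argue carries the useful information.
print("baseline margin:", table.sum(axis=1))
print("expanded margin:", table.sum(axis=0))
```

The interior cells show reclassification between the two models, while the margins show how each model stratifies the population on its own.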







Comments on Risk Stratification Table Paper
Posted on December 11, 2008
Ralph H. Stern
University of Michigan
Conflict of Interest: None Declared

The article by Janes, Pepe, and Gu advocates using risk stratification tables rather than the ROC curve AUC or c-statistic, and it values models that assign a wide range of risks to individuals who are assigned a narrow range of risks by another model. A recent review of this topic reached different conclusions (1).

Simply providing a risk distribution curve (frequency versus risk in the population) is a more informative way to present the results of a risk stratification model (1), while a plot of predicted versus observed risk for each decile of risk is a more informative way to assess a model's calibration.
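The decile calibration check mentioned here can be sketched as follows; the simulated risks and outcomes are hypothetical, and the model is perfectly calibrated by construction so that predicted and observed risk agree in each decile.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical predicted risks and outcomes (illustrative only):
# outcomes are drawn from the predicted risks, so calibration is
# perfect by construction.
n = 10000
predicted = rng.uniform(0.01, 0.5, n)
events = rng.uniform(0, 1, n) < predicted

# Assign each person to a decile of predicted risk.
cutpoints = np.quantile(predicted, np.arange(0.1, 1.0, 0.1))
deciles = np.digitize(predicted, cutpoints)

# Compare mean predicted risk with the observed event fraction per decile;
# plotting these pairs gives the calibration plot described in the letter.
for d in range(10):
    mask = deciles == d
    print(f"decile {d}: predicted {predicted[mask].mean():.3f}, "
          f"observed {events[mask].mean():.3f}")
```

A miscalibrated model would show systematic gaps between the predicted and observed columns.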

By choosing only one way to view the ROC curve, the authors are led to reject the ROC curve AUC or c-statistic as a clinically important measure of risk stratification. The authors correctly point out that better models place more participants at the extremes of the risk distribution curve. But it is known that the risk distribution curve determines the ROC curve (2), and the ROC curve AUC is a measure of the dispersion of the risk distribution curve (1,3). From this perspective, the ROC curve AUC is a valid measure of risk stratification.
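The connection the letter draws can be illustrated with a small simulation (hypothetical data, not from either paper): the c-statistic is the probability that a randomly chosen event has a higher predicted risk than a randomly chosen non-event, so better-separated (more dispersed) risk distributions yield a higher c-statistic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative predicted risks: events tend toward higher risk,
# non-events toward lower risk (distribution shapes are arbitrary).
risks_events = rng.beta(4, 2, 500)
risks_nonevents = rng.beta(2, 4, 2000)

# c-statistic = P(risk of a random event > risk of a random non-event),
# counting ties as one half.
diff = risks_events[:, None] - risks_nonevents[None, :]
c = np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

print(f"c-statistic: {c:.3f}")
```

Pushing the two risk distributions further apart (more dispersion) would raise `c` toward 1; identical distributions would give `c` near 0.5.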

The authors make an important contribution by demonstrating that redistribution in risk stratification tables results from a lack of correlation between the risks calculated from two different models. But this calls into question the current interpretation of redistribution, which is that identification of individuals at low, medium, and high risk by one method from a subgroup of individuals identified at medium risk by a different method proves superior risk stratification. When the methods are equivalent (as shown by the margins of the risk stratification table or by measures of calibration and discrimination), this redistribution merely reflects the fact that different methods provide different risk estimates for the same individual. When the models differ by a single risk factor, high correlation between estimates may be observed (4). However, when models differ by many risk factors, the lack of correlation between estimates can be dramatic (5).


1. Stern RH. Evaluating New Cardiovascular Risk Factors for Risk Stratification. J Clin Hypertension. 2008;10:485-488.

2. Diamond GA. What price perfection? Calibration and Discrimination of Clinical Prediction Models. J Clin Epidemiol. 1992;45:85-89.

3. Cook NR. Use and Misuse of the Receiver Operating Characteristic Curve in Risk Prediction. Circulation. 2007;115:928-935.

4. McGeechan KM, Macaskill P, Irwig L, Liew G, Wong TY. Assessing New Biomarkers and Predictive Models for Use in Clinical Practice: A Clinician's Guide. Arch Intern Med. 2008;168:2304-2310.

5. Lemeshow S, Klar J, Teres D. Outcome prediction for individual intensive care patients: useful, misused, or abused? Intensive Care Med. 1995;21:770-776.


Response to Dr. Stern: "Assessing the Value of Risk Predictions by Using Risk Stratification Tables"
Posted on January 14, 2009
Holly Janes
No Affiliation
Conflict of Interest: None Declared

We thank Dr. Stern for his thoughts on improving methods for the evaluation of risk prediction models.

We agree wholeheartedly that the distribution of risks predicted by the risk prediction model is key for evaluating model performance. This has been called the predictiveness curve in the statistical literature, and we have advocated strongly for its use (Pepe et al 2008; Huang, Pepe and Feng 2007). In fact, the margins of a risk stratification table display exactly this: they show the population distribution of risk according to the two models, albeit in discrete categories. Since the main goal of our paper is to emphasize that one should focus on the margins of the risk stratification table rather than on the interior cells, our paper in fact concurs with Dr. Stern's point of view.

The AUC or c-statistic can indeed be viewed as a measure of the dispersion of the risk distribution. However, it seems to be a measure that lacks clinical relevance (Pepe, Janes and Gu, 2007; Pepe and Janes, 2008). In addition, dissatisfaction with the ROC curve stems in part from the fact that it does not display risk thresholds. We advocate instead displaying risk distributions for events and non-events separately as a way to directly view the true positive rates (for events) and false positive rates (for non-events) associated with specific risk thresholds (Pepe et al, 2008). Although mathematically equivalent to reporting the ROC curve and the overall event rate (Huang and Pepe, in press), the risk distributions are much easier to interpret. Again, the margins of the risk stratification table show these distributions in categories.

We demonstrate that the amount of reclassification shown in a risk stratification table is simply a consequence of the extent of correlation between the risks calculated from the two models. Knowing the correlation in risks between two models is of little use; rather, the calibration, capacity for risk stratification, and classification accuracy should be used as metrics for model comparison, all of which can be viewed from the margins of the risk stratification table. When risk categories are not defined in advance, we agree with Dr. Stern that plots can be used to display this information (Pepe, Feng and Gu, 2008).
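The point that reclassification is driven by the correlation between the two models' risks can be sketched with a hypothetical simulation (all parameters below are illustrative): two risk scores with identical marginal distributions but lower correlation produce more off-diagonal movement in the cross-tabulation.

```python
import numpy as np

rng = np.random.default_rng(2)

def offdiag_fraction(rho, n=20000, cuts=(0.33, 0.66)):
    """Fraction of people reclassified between two risk scores whose
    latent values have correlation rho (hypothetical simulation)."""
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Map latent scores to risks on (0, 1) via the logistic function,
    # so both models share the same marginal risk distribution.
    r1 = 1 / (1 + np.exp(-z[:, 0]))
    r2 = 1 / (1 + np.exp(-z[:, 1]))
    c1, c2 = np.digitize(r1, cuts), np.digitize(r2, cuts)
    return np.mean(c1 != c2)

# Less-correlated risk estimates produce more reclassification,
# even though the margins (each model's risk distribution) are identical.
low, high = offdiag_fraction(0.5), offdiag_fraction(0.95)
print(f"reclassified at rho=0.5: {low:.2f}, at rho=0.95: {high:.2f}")
```

This mirrors the response's argument: the amount of interior-cell movement reflects correlation between the models' estimates, not superior stratification by either one.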


1. Pepe MS, Janes H, Gu JW. Letter to the editor regarding "Special Report: Use and Misuse of the Receiver Operating Characteristic Curve in Risk Prediction." Circulation. 2007;116:e132.

2. Huang Y, Pepe MS, Feng Z. Evaluating the predictiveness of a continuous marker. Biometrics. 2007;63:1181-1188.

3. Pepe MS, Feng Z, Gu JW. Invited commentary on "Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond." Statistics in Medicine. 2008;27:173-181.

4. Pepe MS, Feng Z, Huang Y, Longton G, Prentice R, Thompson IM, Zheng Y. Integrating the predictiveness of a marker with its performance as a classifier. American Journal of Epidemiology. 2008;167(3):362-368.

5. Pepe MS, Janes HE. Gauging the performance of SNPs, biomarkers and clinical factors for predicting risk of breast cancer. Journal of the National Cancer Institute. 2008;100(14):978-979.

6. Huang Y, Pepe MS. A parametric ROC model based approach for evaluating the predictiveness of continuous markers in case-control studies. Biometrics. In press.

