Academia and the Profession

Advances in Measuring the Effect of Individual Predictors of Cardiovascular Risk: The Role of Reclassification Measures

Nancy R. Cook, ScD; and Paul M Ridker, MD

From the Donald W. Reynolds Center for Cardiovascular Research and the Center for Cardiovascular Disease Prevention, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts.

Grant Support: By the Donald W. Reynolds Foundation (Dr. Ridker), and the Leducq Foundation (Dr. Ridker). The overall Women's Health Study is supported by grants from the National Heart, Lung, and Blood Institute and the National Cancer Institute (HL-43851 and CA-47988).

Potential Financial Conflicts of Interest: Consultancies: P.M Ridker (AstraZeneca, Schering-Plough, Sanofi Aventis, ISIS, Siemens, Merck, Novartis, Vascular Biogenics). Grants received: P.M Ridker (National Heart, Lung, and Blood Institute, National Cancer Institute, Donald W. Reynolds Foundation, Leducq Foundation, AstraZeneca, Merck, Novartis, Abbott, Roche, Sanofi Aventis). Patents received: P.M Ridker (Brigham and Women's Hospital). Royalties: P.M Ridker (Brigham and Women's Hospital).

Requests for Single Reprints: Nancy R. Cook, ScD, Division of Preventive Medicine, Brigham and Women's Hospital, 900 Commonwealth Avenue East, Boston, MA 02215; e-mail, ncook@rics.bwh.harvard.edu.

Current Author Addresses: Drs. Cook and Ridker: Division of Preventive Medicine, Brigham and Women's Hospital, 900 Commonwealth Avenue East, Boston, MA 02215.

Author Contributions: Conception and design: N.R. Cook, P.M Ridker.

Analysis and interpretation of the data: N.R. Cook, P.M Ridker.

Drafting of the article: N.R. Cook.

Critical revision of the article for important intellectual content: N.R. Cook, P.M Ridker.

Final approval of the article: N.R. Cook, P.M Ridker.

Provision of study materials or patients: P.M Ridker.

Statistical expertise: N.R. Cook.

Administrative, technical, or logistic support: N.R. Cook, P.M Ridker.

Collection and assembly of data: P.M Ridker.

Ann Intern Med. 2009;150(11):795-802. doi:10.7326/0003-4819-150-11-200906020-00007

Models for risk prediction are widely used in clinical practice to stratify risk and assign treatment strategies. The contribution of new biomarkers has largely been based on the area under the receiver-operating characteristic curve, but this measure can be insensitive to important changes in absolute risk. Methods based on risk stratification have recently been proposed to compare predictive models. Such methods include the reclassification calibration statistic, the net reclassification improvement, and the integrated discrimination improvement. This article demonstrates the use of reclassification measures and illustrates their performance for well-known cardiovascular risk predictors in a cohort of women. These measures are targeted at evaluating the potential of new models and markers to change risk strata and alter treatment decisions.


Figure 1.
Reclassification table comparing 10-year risk strata for models that include risk factors for cardiovascular disease in the Women's Health Study with and without SBP.

Red shading indicates an increase in risk category; and blue shading indicates a decrease in risk category. Total reclassified = 2022 (8.23%); total reclassified in cells with at least 20 observations = 2009. SBP = systolic blood pressure. * Case patients and control participants at 8 years of follow-up, ignoring censored observations. † Observed risk at 10 years is estimated from Kaplan–Meier curve by using observations within each cell. The reclassification calibration statistic compares the observed risk with the average predicted risk within each cell. The chi-square statistic for the model without SBP is 68.3 (P < 0.001); for the model with SBP, chi-square is 22.9 (P = 0.006). Reclassification improvement is 10.5% among case patients (99 − 40 of 560), whereas classification worsened in control participants by 0.7% (821 − 992 of 23 611), leading to a net reclassification improvement of 9.8%.
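
The net reclassification improvement quoted in the caption can be reproduced from the counts given there. The sketch below is only an illustration of that arithmetic, not the authors' code. It assumes that, of the 560 case patients, 99 moved to a higher risk category and 40 to a lower one, and that, of the 23 611 control participants, 821 moved lower and 992 moved higher; the function and variable names are hypothetical.

```python
def net_reclassification_improvement(cases_up, cases_down, n_cases,
                                     controls_up, controls_down, n_controls):
    """Return (case improvement, control improvement, NRI) as proportions.

    Moving up is an improvement for case patients; moving down is an
    improvement for control participants.
    """
    case_part = (cases_up - cases_down) / n_cases
    control_part = (controls_down - controls_up) / n_controls
    return case_part, control_part, case_part + control_part


# Counts as read from the Figure 1 caption (8 years of follow-up,
# censored observations ignored).
case_part, control_part, nri = net_reclassification_improvement(
    cases_up=99, cases_down=40, n_cases=560,
    controls_up=992, controls_down=821, n_controls=23611,
)
print(f"cases: {case_part:.1%}, controls: {control_part:.1%}, NRI: {nri:.1%}")
# prints: cases: 10.5%, controls: -0.7%, NRI: 9.8%
```

The negative control term corresponds to the 0.7% worsening described in the caption.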

Figure 2.
Plot of predicted 10-year risk from models including cardiovascular risk factors but with and without systolic blood pressure in the Women's Health Study.

The dashed diagonal line is the line of unity; horizontal and vertical lines represent risk strata cut-points. SBP = systolic blood pressure.



Posted on June 17, 2009
Ralph H Stern
University of Michigan
Conflict of Interest: None Declared

To the Editor:

Readers of Cook and Ridker's paper should be given an opportunity to look at the unprocessed Women's Health Study data, not just the statistical analysis. Since their data appear to be well described by a lognormal distribution, which is expected when risk factors interact multiplicatively (1), a model of their data can be created. Continuous risk distribution curves for the Reynolds Risk Score and Reynolds Risk Score without systolic blood pressure fitted to their categorical data are shown in the Figure (available at http://s668.photobucket.com/albums/vv49/sngoonew/). There appears to be little difference between the two distributions, consistent with the minimal difference in ROC curve AUCs (2). I would ask Drs. Cook and Ridker to provide a similar graph based on the actual data so readers could judge whether the differences in risk stratification are likely to be clinically significant.

It is difficult to understand how reclassification analysis could tell us anything about accuracy that couldn't be learned from studying each model alone. The only additional information included is the correlation between individual risk estimates, which is best evaluated with a scatterplot (avoiding the categorization of continuous data). When multivariate models differ by a single risk factor, as in their example, discordance between individual risk estimates may be modest. However, when multivariate models share few risk factors, the discordance can be substantial (3). It is known that the greater the discordance, the higher the reclassification rate (4). Discordance reflects the fact that different models assign different risk estimates to the same individual and accounts for almost all of the reclassification. Differences in accuracy are not required to generate reclassification and, when they exist, are best evaluated by specific measures of calibration.


1. Limpert E, Stahel WA, Abbt M. Log-normal Distributions across the Sciences: Keys and Clues. BioScience. 2001;51:341-352.

2. Stern RH. Evaluating New Cardiovascular Risk Factors for Risk Stratification. J Clin Hyper 2008;10:485-488.

3. Lemeshow S, Klar J, Teres D. Outcome prediction for individual intensive care patients: useful, misused, or abused? Intensive Care Med 1995;21:770-776.

4. Janes H, Pepe MS, Gu W. Assessing the Value of Risk Predictions by Using Risk Stratification Tables. Ann Intern Med 2008;149:751-760.


Reclassification calculations with incomplete follow-up
Posted on June 22, 2009
Ewout W Steyerberg
Dept of Public Health, Erasmus MC, Rotterdam, the Netherlands
Conflict of Interest: None Declared
Cook and Ridker are to be applauded for their clear discussion of reclassification measures, and the modern developments in the area of judging the incremental value of a biomarker for prediction of outcome (1). A key area of application is in cardiovascular disease, where the time horizon is typically 10 years. One important problem recognized by Cook and Ridker is that not all subjects will have follow-up completed until 10 years. Kaplan-Meier curves and Cox regression analysis were introduced long ago to deal with such censored observations. Reclassification measures, such as the net reclassification index (NRI) (2), have been proposed for binary data and currently do not have a way of incorporating incomplete follow-up. As with other model performance measures in survival analysis, reclassification statistics can be estimated at different time points within the follow-up window.

To address the issue of censored data, Cook and Ridker propose to select only subjects with follow-up complete at a certain time point, 8 years in their example. They were able to include the majority of control participants, since 23,611 of 23,792 women had follow-up of at least 8 years, excluding only 181, or 1%. But only 560 of 766 cases had a cardiovascular event before 8 years of follow-up, leading to exclusion of 206 or 27%.

We suggest a simple alternative based on the expected number of cases and non-cases calculated using the Kaplan-Meier estimator. This approach was recently found optimal in assessing calibration of survival models (3). It appropriately handles censored data, and does not throw away useful information. We provide a revised Figure 1 created with our proposal, with cell entries for cases and non-cases obtained by multiplying the 10-year Kaplan-Meier rates by the total numbers of people in each cell at 10 years given in the original table. We then expect 697 cases at 10 years of follow-up, and 23,861 control participants.
The reclassification numbers change to some extent. Although the conclusions remain largely the same in this example (NRI 9.9% vs 9.8% originally), we would like to recommend our simple estimation procedure of the NRI for future application with censored observations. Especially when more censoring occurs early during follow-up, our approach is attractive. In this case, choosing one time point for analysis can lead to exclusion of many control participants, or relatively many cases, making the NRI estimate quite unstable. Some specific issues, such as bias and precision, require further research. We note that the asymptotic confidence interval for NRI calculated using the approach outlined in (2) is no longer valid for the current extension. A practical solution would use bootstrap estimation (4), in addition to its use for bias correction as already correctly suggested by Cook and Ridker.

Revised Fig 1. Reclassification table comparing 10-year risk strata for models that include risk factors for cardiovascular disease in the Women's Health Study with and without SBP, using the 10-year Kaplan-Meier estimates to estimate the number of case patients and control participants.
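
The proposal above can be sketched in a few lines: within each cell of the reclassification table, the expected number of cases is the cell's 10-year Kaplan-Meier rate times the number of people in the cell, and the remainder are expected non-cases. The cell values below are hypothetical placeholders, not the Women's Health Study data, and the names `Cell` and `km_based_nri` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    n: int          # people in this cell of the reclassification table
    km_risk: float  # 10-year Kaplan-Meier event rate within the cell
    move: int       # +1 moved up, -1 moved down, 0 same risk category


def km_based_nri(cells):
    """NRI from expected cases (n * KM rate) and expected non-cases."""
    cases = [c.n * c.km_risk for c in cells]
    noncases = [c.n - e for c, e in zip(cells, cases)]
    # Net upward movement among expected cases.
    case_net = sum(c.move * e for c, e in zip(cells, cases))
    # Net downward movement among expected non-cases.
    noncase_net = sum(-c.move * e for c, e in zip(cells, noncases))
    return case_net / sum(cases) + noncase_net / sum(noncases)


# Hypothetical cells for illustration only.
cells = [
    Cell(n=20000, km_risk=0.015, move=0),
    Cell(n=600, km_risk=0.08, move=+1),
    Cell(n=500, km_risk=0.02, move=-1),
    Cell(n=3000, km_risk=0.07, move=0),
]
nri_km = km_based_nri(cells)
print(f"Kaplan-Meier based NRI: {nri_km:.3f}")
```

Because the expected counts are fractional, no participant is discarded for having incomplete follow-up at a chosen time point.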


1. Cook NR, Ridker PM. Advances in measuring the effect of individual predictors of cardiovascular risk: the role of reclassification measures. Ann Intern Med. 2009;150(11):795-802.

2. Pencina MJ, D'Agostino RB, Sr., D'Agostino RB, Jr., Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27(2):157-72; discussion 207-12.

3. Viallon V, Ragusa S, Clavel-Chapelon F, Benichou J. How to evaluate the calibration of a disease risk prediction tool. Stat Med. 2009;28(6):901-16.

4. Pepe MS, Feng Z, Gu JW. Comments on 'Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond'. Stat Med. 2008;27(2):173-181.


Author reply
Posted on July 24, 2009
Nancy R. Cook
Brigham and Women's Hospital
Conflict of Interest: None Declared

We thank Drs. Steyerberg and Pencina for bringing up an important point with regard to evaluating reclassification measures in the presence of survival data. When the outcome is time to an event, such as a cardiovascular event, care needs to be taken to accommodate censoring. The reclassification calibration statistic can easily be calculated using survival data, as indicated in our paper. The Kaplan-Meier estimate of the event rate as of 10 years, for example, can be used to obtain the expected number of events within each cell of the reclassification table. D'Agostino and Nam (1) suggest that with survival data, the degrees of freedom should be k-1 rather than k-2, where k in the setting of reclassification is the number of cells containing at least 20 individuals.
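
As a rough illustration of the calculation described above, the sketch below computes a Hosmer-Lemeshow-type chi-square over the cells of a reclassification table, comparing the observed (Kaplan-Meier) risk with the average predicted risk in each cell with at least 20 individuals. The exact form of the statistic is an assumption here, the cell values are hypothetical, and the function name is invented for the example.

```python
def reclassification_calibration(cells, min_n=20, survival=True):
    """Return (chi-square, degrees of freedom) over cells with n >= min_n.

    cells: iterable of (n, observed_risk, mean_predicted_risk).
    Assumed Hosmer-Lemeshow-type form:
        sum of n * (observed - predicted)^2 / (predicted * (1 - predicted))
    Degrees of freedom: k - 1 with survival data, otherwise k - 2,
    where k is the number of cells used.
    """
    used = [(n, obs, pred) for n, obs, pred in cells if n >= min_n]
    chi_square = sum(
        n * (obs - pred) ** 2 / (pred * (1 - pred)) for n, obs, pred in used
    )
    k = len(used)
    return chi_square, (k - 1 if survival else k - 2)


# Hypothetical cells: (n, observed 10-year risk, mean predicted risk).
cells_cal = [
    (100, 0.10, 0.08),
    (50, 0.20, 0.25),
    (10, 0.50, 0.30),   # fewer than 20 individuals, so excluded
    (200, 0.02, 0.02),
]
chi_square, dof = reclassification_calibration(cells_cal)
print(f"chi-square = {chi_square:.3f} on {dof} degrees of freedom")
```

A P value would then come from the upper tail of the chi-square distribution with the stated degrees of freedom.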

The use of survival data is more problematic for the NRI and IDI, which both condition on case-control status. A similar problem occurs for the c-statistic, but methods to accommodate survival data have been established (2). For the NRI, Steyerberg and Pencina propose using the expected number of cases based on the Kaplan-Meier estimate within each cell, the same calculation needed for the reclassification calibration statistic. While an estimated standard error is not currently available for this measure, a confidence interval as well as the standard error can be determined using bootstrap samples.
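
A bootstrap standard error and percentile interval for the NRI might be obtained along the following lines. This is a minimal sketch under simplifying assumptions: it resamples individuals using the binary (8-year) counts from the Figure 1 caption rather than the Kaplan-Meier version, for which each bootstrap replicate would need to recompute the Kaplan-Meier estimates within each cell. The helper names, the number of replicates, and the seed are arbitrary choices.

```python
import random
import statistics


def nri(sample):
    """sample: list of (is_case, move) with move in {+1, 0, -1}."""
    n_cases = sum(1 for is_case, _ in sample if is_case)
    n_controls = len(sample) - n_cases
    case_net = sum(move for is_case, move in sample if is_case)
    control_net = sum(-move for is_case, move in sample if not is_case)
    return case_net / n_cases + control_net / n_controls


def bootstrap_nri(sample, n_boot=200, seed=0):
    """Return (standard error, lower, upper) from a percentile bootstrap."""
    rng = random.Random(seed)
    estimates = sorted(
        nri(rng.choices(sample, k=len(sample))) for _ in range(n_boot)
    )
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return statistics.stdev(estimates), lower, upper


# One record per person, built from the Figure 1 caption counts.
sample = (
    [(True, +1)] * 99 + [(True, -1)] * 40 + [(True, 0)] * (560 - 99 - 40)
    + [(False, +1)] * 992 + [(False, -1)] * 821
    + [(False, 0)] * (23611 - 992 - 821)
)
point = nri(sample)
se, lower, upper = bootstrap_nri(sample)
print(f"NRI = {point:.3f}, SE = {se:.3f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```

More replicates than the 200 used here would normally be preferred for a stable percentile interval.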

We suggest that both the reclassification calibration statistic and the NRI be computed for reclassification tables, even in the presence of survival data.


1. D'Agostino RB, Nam B-H. Evaluation of the performance of survival analysis models: Discrimination and calibration measures. Handbook of Statistics. Vol. 23, pp1-25.

2. Harrell FE, Jr. Regression Modeling Strategies. New York: Springer, 2001.

