Criteria for Using Risk Models in Cardiac Surgery

doi:10.1016/S1885-5857(08)60179-6

Few therapeutic interventions have been so widely researched, evaluated, and commented on as cardiac surgery, and in particular coronary revascularization. Aspects related to outcomes and the appropriateness of indications have, among many others, received much attention.

This type of research is driven by many interest groups, including heart surgeons, cardiologists, epidemiologists, health care managers, health service providers, and the media. It is one of the few specialties in which outcome audits are mooted as soon as the alarm bells sound.

Why has heart surgery been the object of such exhaustive analysis compared to other interventions? Although it is not the aim of this editorial to go into detail on this point, one of the reasons might be the interest shown by heart surgeons themselves in evaluating their own activities.

Groups performing cardiac surgery have pioneered the introduction of predictive systems to estimate the risks associated with their surgical procedures. This has led to a culture centered around "risk" and risk-assessment and it is nowadays common for heart surgeons and cardiologists to factor risk assessments into decision-making with their surgery patients. This is often done by employing risk assessment systems such as the EuroSCORE, Parsonnet, or STS.1-5 The 2004 AHA/ACC clinical guidelines6 suggest that preoperative predictive systems to assess risk can help both clinicians and patients to understand the risk-benefit equation for the procedure concerned (class IIa recommendation, level of evidence C).

Apparently, no equivalent recommendation exists for other procedures such as coronary revascularization using endoluminal procedures. The lack of such a recommendation cannot be ascribed, however, to an absence of extensively validated, predictive models for percutaneous coronary procedures.7,8 I believe it would be desirable for such an approach to be extended to other therapeutic procedures, and not only in the area of heart disease. Nevertheless, it is clear that the culture required for the application and generalization of this methodology is lacking.

It might be pretentious to try to establish, here, recommendations for the use of this type of tool. But the importance of the conclusions derived from their application suggests that we need to follow strict criteria when applying these tools in order to guarantee the validity of our conclusions. If strict criteria are not followed, erroneous interpretations may result which, instead of helping to clarify things, will lead to greater confusion. The conditions for applying these tools should be even stricter when they are used in quality assessment.

Predictive models are used at 3 different levels: individual level estimations, estimations in specific disease groups, and estimations in complete series. The latter reflect the activity of a given surgical department or unit.

Although use at the first 2 levels is acceptable, and is discussed later, these statistical tools were principally developed to monitor activity in complete series.

The final aim in developing this type of model has always been to provide a tool which would allow objective evaluations and comparisons of outcomes and case-mix in populations of heart surgery patients. Instruments such as these are required as part of the philosophy of health care quality assessment, particularly when the current hypothesis regarding the assessment of care holds that, if we can ensure comparable populations, any significant differences in outcome will be due to differences in the quality of care. An objection to this point of view could be that structural differences between groups are not taken into account. Unfortunately, at present there are no models available which adjust for this important dimension.

The publication of the article entitled "Validation of the EuroSCORE Probabilistic Model in Patients Undergoing Coronary Bypass Grafting"9 raises several issues related to the application of this type of tool. As mentioned above, the use of scores or predictive models has become routine in many heart surgery departments. Nevertheless, there is still enough confusion surrounding their use to conclude that it is not yet sufficiently standardized. In the hope of clarifying some issues related to the methodology, I would like to comment on the following aspects:

- The populations in which the models are applied
- The issue of whether the logistic or additive EuroSCORE is more appropriate
- Model application in administrative databases compared to clinical databases
- What it means to validate these models
- Interpreting the results of applying the models

Although the following comments could apply to any of the predictive tools currently available, we will generally refer to the EuroSCORE because it is the most widely used in Spain.

Populations in Which Models Are Applied

Currently, these systems are being used to estimate individual risk, to estimate risk in a given disease group, and to estimate risk in complete series of surgical patients.

These models are usually constructed using logistic regression techniques to identify factors with the greatest impact on the dependent variable, in this case hospital mortality. Although multivariable analysis applied in large populations can identify many factors associated with hospital mortality, when constructing the models only a limited number of factors are generally used. There are several reasons for this, including the fact that, beyond a certain point, goodness of fit and discriminatory capacity are not usually greatly improved by the addition of more factors. Models also generally need to be simple. Other reasons are purely methodological. Harrell et al10 concluded that, in models based on logistic regression, there should be at least 10 events (deaths) per prognostic variable in the model. As the "event" to be predicted in these models is usually death, it is difficult to always meet this criterion, even in large populations.
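The events-per-variable rule cited above reduces to simple arithmetic. The following is a minimal sketch; the function names and the example series (1000 patients, 5% mortality, a 15-variable model) are illustrative assumptions, not figures taken from the cited work.

```python
def max_predictors(deaths, events_per_variable=10):
    """Largest number of prognostic variables supported by the observed deaths."""
    return deaths // events_per_variable


def deaths_required(n_predictors, events_per_variable=10):
    """Minimum number of deaths needed for a model with n_predictors variables."""
    return n_predictors * events_per_variable


# Hypothetical series: 1000 patients with 5% hospital mortality gives 50 deaths.
deaths = 1000 * 5 // 100
print(max_predictors(deaths))   # variables this series can support
print(deaths_required(15))      # deaths needed for a hypothetical 15-variable model
```

Under these assumptions, a series of this size supports only a handful of variables, which illustrates why the criterion is hard to meet.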

As regards their use for individual level estimations of risk, given the limited number of variables in the EuroSCORE model, it can only be considered a rough guide at this level. This is particularly true if the patient presents a clear risk factor for death which is not included in the model. Nevertheless, for the majority of patients scheduled for heart surgery, such as those with isolated coronary or valve disease, or a combination of these, the logistic EuroSCORE is useful in estimating the risk in a given case. At individual level, it would be desirable to have a model which takes many variables into account; however, the only model which currently incorporates a high number of variables is the STS model. It should be remembered that indicating or rejecting surgery based on estimates of high risk at individual level may not be appropriate.

Applying such models in groups of specific diseases, such as coronary disease, valve disease, etc, leads to another problem, which is different from that arising at individual level. Studies in several series have concluded that specific models are required for each disease group, particularly in the case of valve surgery patients.11-14 Of course, the ideal situation would no doubt be to have one model for each type of patient, but these specific models will need to be appropriately constructed.

It should also be made very clear in which population the model is being applied. The article by Lafuente et al does not make this clear. The title suggests that the population studied will be patients undergoing isolated coronary artery bypass surgery, but the series includes 4 patients with endocarditis and 100 patients in whom other surgical procedures were involved.

In principle, the most appropriate application for the EuroSCORE, and the one for which it was created, is the study of complete series of surgical patients. The basic objective in this case is to provide a standardized tool for quality control. The tool should be applied in any patient who underwent cardiac surgery using extracorporeal circulation. The only other group in which its use seems reasonable is patients who have received coronary surgery without extracorporeal circulation, as the tool has been extensively validated in this population.15,16 When used in this way, several conditions need to be strictly adhered to: all patients in whom the tool can be applied should be included, all data relating to factors included in the model should be collected, and, above all, all related deaths should be recorded.

Which is the Appropriate Model: The Logistic or Additive EuroSCORE?

As mentioned previously, these tools are basically designed for use in quality assessment,17 which means they should be applicable to all patients. For this purpose, and although there are many models available, at present I believe that the EuroSCORE should continue to be used. There are many reasons for this including the fact that it has been extensively validated in different settings,18,19 the fact that Spain provided 10% of the population from which the model was developed, the fact that it is simple to use, and that it can be used in the usual population of cardiac surgery patients. The authors of the original instrument are currently planning to develop a new version of the EuroSCORE using the same methodology applied in the original study.

With respect to using the EuroSCORE, there is some confusion regarding whether the logistic or additive model should be used. It should be remembered that both models are essentially the same and that the additive model is a simplification of the logistic model, not a variant of it. The additive model assimilates and simplifies the beta coefficients used in the logistic regression equation so that it is simpler to use. Using a simple sum approach to estimate probabilities may nevertheless be a source of error, particularly in high risk patients. In order to assign a probability of death to a given score, the Table should be used.1
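The relationship between the two versions can be illustrated with a short sketch. The intercept, beta coefficients, integer weights, and factor names below are hypothetical values chosen for illustration only; they are not the published EuroSCORE coefficients. The logistic version passes the sum of the betas through the logistic function, whereas the additive version simply adds small integer weights.

```python
import math

# Hypothetical intercept and beta coefficients, for illustration only;
# these are NOT the published EuroSCORE values.
INTERCEPT = -4.0
BETAS = {"factor_a": 0.5, "factor_b": 1.0, "factor_c": 1.5}

# Hypothetical additive weights: each beta simplified to a small integer.
WEIGHTS = {"factor_a": 1, "factor_b": 2, "factor_c": 3}


def logistic_risk(present):
    """Predicted probability of death from the logistic regression equation."""
    linear = INTERCEPT + sum(BETAS[name] for name in present)
    return 1.0 / (1.0 + math.exp(-linear))


def additive_score(present):
    """Simple sum of integer weights: a score, not a probability."""
    return sum(WEIGHTS[name] for name in present)


low = ["factor_a"]
high = ["factor_a", "factor_b", "factor_c"]
for label, factors in (("low risk", low), ("high risk", high)):
    print(label, additive_score(factors), round(logistic_risk(factors), 3))
```

In this invented example the additive score grows linearly as factors accumulate, while the logistic probability does not, so reading the score directly as a percentage diverges most from the logistic estimate when many factors are present.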

A comparison of the additive and logistic versions of the EuroSCORE has already been performed.20 Although the results generated some confusion, the conclusions can be easily understood by referring to the Table. Leaving aside the issue of model calibration in high risk groups, which, incidentally, usually form only a small part of the overall experience of a surgical group, the discriminatory power of the 2 versions is identical.
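Discriminatory power is often summarized by the c-statistic (area under the ROC curve): the proportion of death-survivor pairs in which the patient who died had the higher predicted value. Because it depends only on how patients are ranked, two scoring systems that rank patients in the same order give the same value even if their calibration differs. The sketch below is a minimal illustration; the function name and the data are invented, and it is not claimed to be the measure used in the cited comparison.

```python
def c_statistic(predicted, died):
    """Share of (death, survivor) pairs ranked correctly; ties count as half."""
    deaths = [p for p, d in zip(predicted, died) if d]
    survivors = [p for p, d in zip(predicted, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p_death in deaths:
        for p_survivor in survivors:
            if p_death > p_survivor:
                concordant += 1.0
            elif p_death == p_survivor:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


# Invented data: probabilities and integer scores that rank patients identically.
died = [0, 0, 1, 1]
probabilities = [0.10, 0.40, 0.35, 0.80]
scores = [2, 6, 5, 9]
print(c_statistic(probabilities, died), c_statistic(scores, died))
```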

In summary, the logistic model should ideally be used in order to avoid confusion. If the additive model is used, particularly for individual level estimates of risk, the risk of dying associated with each score segment should be taken into account.

Use With Administrative or Clinical Databases

Another interesting aspect is the use of predictive models with administrative compared to clinical databases. The majority of existing models have been developed using the latter, though they are increasingly used with administrative databases. At present, these databases do not allow for a rigorous application of predictive tools. This means that any conclusions drawn from such studies should be treated with caution.

Although the managers of administrative databases frequently claim that they are highly reliable, the reality is that, at present, they may not be sufficiently reliable. There are many reasons for this, notably:

- Administrative databases usually record data retrospectively. This implies relatively high levels of missing data which in turn may require potentially unacceptable assumptions to be made.21,22 Prospective data collection is clearly the ideal here.
- Patients are often wrongly classified, which means it is difficult to adequately monitor the pathology in question.23,24
- The lack of standardized end-points is another source of error.25,26 The standard definition of hospital mortality should be applied, ie, mortality occurring during hospitalization or in the first 30 days after the intervention. It is not acceptable to state that mortality outside of the hospital stay was minimal based only on subjective beliefs.

Obviously, it would be ideal if there were no discrepancies due to the type of database used. More desirable again would be the existence of a single information management tool whose users would apply standardized, shared criteria when exploiting the data.

Independently of the tool used, the value of any conclusions reached will depend on the external validity of the data and the data management system.

What Does it Mean to Validate a Predictive Tool?

The word "validity" may well have been frequently misapplied when referring to this type of system. It is clearly wrong to claim to be validating a model simply because you use it in a particular set of circumstances. The validation of a probabilistic model requires compliance with a series of conditions which are only infrequently met during the application of such models.27 From a statistical point of view, validating a prognostic model means showing that it functions correctly in a population other than that in which it was developed.

The first condition to be met when validating a model is to have an adequate sample size. Some authors28 suggest that at least 100 events (deaths) are required. This implies that, if mortality is around 5%, a sample of 2000 patients will be needed.
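The arithmetic behind this sample size can be written out as a short sketch. The function name is an assumption made for illustration; mortality is passed as a percentage so that the division stays exact for whole-number rates.

```python
import math


def patients_needed(required_events, mortality_percent):
    """Smallest series expected to contain the required number of deaths."""
    if mortality_percent <= 0:
        raise ValueError("mortality_percent must be positive")
    return math.ceil(required_events * 100 / mortality_percent)


print(patients_needed(100, 5))  # 100 deaths at 5% mortality
```

By the same rule, a lower mortality calls for a larger series: at 2%, 100 deaths would require 5000 patients.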

As well as sample size, the score must be strictly applied to all patients, without assumptions having to be made because of missing data and, in particular, without missing data on the event which represents the dependent variable.

Finally, when a system has been as extensively validated as the EuroSCORE, an inability to validate the model may suggest 1 of 2 things: either the tool has not been correctly applied or, what is worse, the quality of care in the series studied is poor.

Interpretation of Results

If the model used has been appropriately validated and correctly applied, the interpretation can only be related to the quality of care, with the single proviso mentioned earlier that the model does not take into account structural differences.

The only interpretation available if results are significantly above or below those expected is that the quality of care is either below or above par.

It is not only legitimate but obligatory to carry out a statistical comparison of mortality observed in a given context with average observed mortality. These models were developed specifically with this in mind, and it is the way they are currently used.29 Avoiding this type of comparison, as suggested by Lafuente et al in their article, is to depart from the essence of this type of methodology.
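One common way to compare observed with expected results is to set the observed deaths against the number the model expects, that is, the sum of the individual predicted risks. The sketch below is an illustration under stated assumptions, not the method of any cited study: it relies on a normal approximation, the function name is hypothetical, and the example series is invented.

```python
import math


def observed_vs_expected(predicted_risks, observed_deaths):
    """Return expected deaths, the observed/expected ratio, and a z-score."""
    expected = sum(predicted_risks)
    # Variance of the death count if each patient dies independently
    # with his or her predicted probability.
    variance = sum(p * (1.0 - p) for p in predicted_risks)
    if variance <= 0:
        raise ValueError("at least one predicted risk must lie strictly between 0 and 1")
    ratio = observed_deaths / expected
    z = (observed_deaths - expected) / math.sqrt(variance)
    return expected, ratio, z


# Invented series: 1000 patients, each with a predicted risk of 5%, and 65 deaths.
expected, ratio, z = observed_vs_expected([0.05] * 1000, 65)
print(round(expected, 1), round(ratio, 2), round(z, 2))
```

Under this approximation, an absolute z-score beyond about 1.96 corresponds to a two-sided difference at the 5% level. A ratio above 1 means more deaths than the model predicts, and a ratio below 1 means fewer.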

My belief is that scientific societies should establish precise guidelines regarding the use of these tools. I also believe that groups performing procedures which are susceptible to this type of analysis should incorporate predictive models into their daily practice. It is clear that inappropriate or incorrect use of these tools can lead to erroneous conclusions, an issue of particular sensitivity when quality assessment is involved.

SEE ARTICLE ON PAGES 589-94

Correspondence:
Dr. J.M. Cortina Romero.
Servicio de Cirugía Cardiaca. Hospital 12 de Octubre. Avda. Córdoba, s/n. 28041 Madrid. España.
E-mail: jcortina.hdoc@salud.madrid.org

Bibliography

[1] Nashef SA, Roques F, Michel P, Gauducheau E, Lemeshow S, Salamon R. European system for cardiac operative risk evaluation (EuroSCORE). Eur J Cardiothorac Surg, (1999), 16 pp. 9-13

[2] Parsonnet V, Dean D, Bernstein AD. A method of uniform stratification of risk for evaluating the results of surgery in acquired adult heart disease. Circulation, (1989), 79 pp. I3-12

[3] Bernstein AD, Parsonnet V. Bedside estimation of risk as an aid for decision-making in cardiac surgery. Ann Thorac Surg, (2000), 69 pp. 823-8

[4] Shroyer AL, Grover FL, Edwards FH. 1995 coronary artery bypass risk model: The Society of Thoracic Surgeons Adult Cardiac National Database. Ann Thorac Surg, (1998), 65 pp. 879-84

[5] Shroyer AL, Plomondon ME, Grover FL, Edwards FH. The 1996 coronary artery bypass risk model: the Society of Thoracic Surgeons Adult Cardiac National Database. Ann Thorac Surg, (1999), 67 pp. 1205-8

[6] Eagle KA, Guyton RA, Davidoff R, Edwards FH, Ewy GA, Gardner TJ, et al. Guidelines for Coronary Artery Bypass Graft Surgery). Circulation, (2004), 110 pp. e340-437

[7] External validation of established risk adjustment models for procedural complications after percutaneous coronary intervention. 1. Heart. 2007 Nov 21 [Epub ahead of print].

[8] Moscucci M, O'Connor GT, Ellis SG, Malenka DJ, Sievers J, Bates ER, et al. Validation of risk adjustment models for in-hospital percutaneous transluminal coronary angioplasty mortality on an independent data set. J Am Coll Cardiol, (1999), 34 pp. 692-7

[9] Lafuente S, Trilla A, Bruni L, González R, Bertrán MJ, Pomar JL, et al. Validación del modelo probabilístico EuroSCORE en pacientes intervenidos de injerto coronario. Rev Esp Cardiol, (2008), 61 pp. 589-94

[10] Harrell FE, Lee KL, Califf RM, Pryor DB, Rosati RA. Regression modelling strategies for improved prognostic modelling. Stat Med, (1984), 3 pp. 143-52

[11] Ambler G, Omar RZ, Royston P, Kinsman R, Keogh BE, Taylor KM. Generic, simple risk stratification model for heart valve surgery. Circulation, (2005), 112 pp. 224-31. http://dx.doi.org/10.1161/CIRCULATIONAHA.104.515049

[12] Grossi EA, Schwartz CF, Yu PJ, Jorde UP, Crooke GA, Grau JB, et al. High-risk aortic valve replacement: are the outcomes as bad as predicted? Ann Thorac Surg, (2008), 85 pp. 102-6. http://dx.doi.org/10.1016/j.athoracsur.2007.05.010

[13] Jin R, Grunkemeier GL, Starr A. Validation and refinement of mortality risk models for heart valve surgery. Ann Thorac Surg, (2005), 80 pp. 471-9. http://dx.doi.org/10.1016/j.athoracsur.2005.02.066

[14] van Gameren M, Kappetein AP, Steyerberg EW, Venema AC, Berenschot EA, Hannan EL, et al. Do we need separate risk stratification models for hospital mortality after heart valve surgery? Ann Thorac Surg, (2008), 85 pp. 921-30

[15] Vázquez Roque FJ, Fernández TR, Pita S, Cuenca JJ, Herrera JM, Campos V, et al. Evaluación preoperatoria del riesgo en la cirugía coronaria sin circulación extracorpórea. Rev Esp Cardiol, (2005), 58 pp. 1302-9

[16] Wu Y, Grunkemeier GL, Handy JR Jr. Coronary artery bypass grafting: are risk models developed from on-pump surgery valid for off-pump surgery? J Thorac Cardiovasc Surg, (2004), 127 pp. 174-8

[17] Díaz de Tuesta I, Cuenca J, Fresneda PC, Calleja M, Llorens R, Aldámiz G, et al. No hay relación entre el volumen quirúrgico y la mortalidad en los servicios de cirugía cardiaca en España. Rev Esp Cardiol, (2008), 61 pp. 276-82

[18] Nashef SA, Roques F, Hammill BG, Peterson ED, Michel P, Grover FL, et al. Validation of European System for Cardiac Operative Risk Evaluation (EuroSCORE) in North American cardiac surgery. Eur J Cardiothorac Surg, (2002), 22 pp. 101-5

[19] Roques F, Nashef SA, Michel P, Pinna PP, David M, Baudet E, et al. Does EuroSCORE work in individual European countries? Eur J Cardiothorac Surg, (2000), 18 pp. 27-30

[20] Michel P, Roques F, Nashef SA. Logistic or additive EuroSCORE for high-risk patients? Eur J Cardiothorac Surg, (2003), 23 pp. 684-7. http://dx.doi.org/10.1136/emj.2006.035220

[21] Herbert MA, Prince SL, Williams JL, Magee MJ, Mack MJ. Are unaudited records from an outcomes registry database accurate? Ann Thorac Surg, (2004), 77 pp. 1960-4

[22] Mack MJ, Herbert M, Prince S, Dewey TM, Magee MJ, Edgerton JR. Does reporting of coronary artery bypass grafting from administrative databases accurately reflect actual clinical outcomes? J Thorac Cardiovasc Surg, (2005), 129 pp. 1309-17

[23] Glance LG, Dick AW, Osler TM, Mukamel DB. Accuracy of hospital report cards based on administrative data. Health Serv Res, (2006), 41 pp. 1413-37. http://dx.doi.org/10.1111/j.1475-6773.2006.00554.x

[24] Shahian DM, Silverstein T, Lovett AF, Wolf RE, Normand SL. Comparison of clinical and administrative data sources for hospital coronary artery bypass graft surgery report cards. Circulation, (2007), 115 pp. 1518-27. http://dx.doi.org/10.1161/CIRCULATIONAHA.106.633008

[25] Hannan EL, Racz MJ, Jollis JG, Peterson ED. Using Medicare claims data to assess provider quality for CABG surgery: does it work well enough? Health Serv Res, (1997), 31 pp. 659-78

[26] Parker JP, Li Z, Damberg CL, Danielsen B, Carlisle DM. Administrative versus clinical data for coronary artery bypass graft surgery report cards: the view from California. Med Care, (2006), 44 pp. 687-95. http://dx.doi.org/10.1097/01.mlr.0000215815.70506.b6

[27] Altman DG, Royston P. What do we mean by validating a prognostic model? Stat Med, (2000), 19 pp. 453-73

[28] Regression models strategies. New York: Springer-Verlag; 2001.

[29] Ribera A, Ferreira-González I, Cascant P, Pons JM, Permanyer-Miralda G. Evaluación de la mortalidad hospitalaria ajustada al riesgo de la cirugía coronaria en la sanidad pública catalana. Influencia del tipo de gestión del centro (estudio ARCA). Rev Esp Cardiol, (2006), 59 pp. 431-40

REVISTA ESPAÑOLA DE

CARDIOLOGÍA
