Polygenic risk score as a key factor in cardiovascular clinical prediction models

doi:10.1016/j.rec.2020.01.003

In clinical practice, early detection and prevention of common adult-onset conditions often comes down to a problem of classification or decision-making, which is addressed with diagnostic and/or prognostic tests. Optimizing these tests requires knowing their accuracy and precision.

Since the completion of the Human Genome Project, the combination of large-scale genome variation projects, such as the HapMap and 1000 Genomes projects, together with low-cost robust genotyping platforms and the rapid advance of DNA sequencing technologies, has enabled genome-wide association studies (GWAS) in large cohorts and exome- and genome-wide sequencing studies. Consequently, there has been an exponential increase in the abundance of individual-specific genotype data, leading to the era of personalized medicine or precision genomics-based medicine.1

Historically, genetic diseases were classified into those with Mendelian or simple inheritance, caused by genetic variations with a large effect, and those with complex inheritance, caused by the sum of genetic variations with a small effect. However, it is now thought that each individual's overall risk of developing a common disease is probably determined by a combination of common low-risk genetic variants and rare high-risk genetic variants.2

GWAS have focused on identifying disease- or trait-associated genetic variants (typically single nucleotide polymorphisms [SNPs]), which are common in a given population (eg, minor allele frequency > 1%). To date, GWAS have identified thousands of loci that are associated with several complex human traits and diseases, including cardiovascular diseases.3 Notably, many of the loci previously associated with these complex human diseases are highlighted by multiple low-risk SNPs.4

These data have provided numerous insights into the genes and pathways that cause disease, but more recently there has been increasing interest in using them for disease risk prediction.5,6 In the last decade, genomics-based precision medicine has emerged as a way to provide effective and tailored health care for patients, depending on their genetic background. The inclusion of genetic risk scores (GRS), built from disease- or phenotype-associated SNPs, in risk modelling has improved the accuracy of individual disease prediction,7 as reported in an original article published by Rincón et al. in Revista Española de Cardiología.8

The main focus of the development of genetic risk models is to achieve accurate predictive power for recognizing at-risk individuals (figure 1). Most commonly, these scores are calculated as a weighted sum of the number of risk alleles carried by an individual, where the risk alleles and their effect sizes are defined by previous GWAS.6 Therefore, the accuracy of a GRS is determined by how well previous GWAS have identified the genetic variants associated with common diseases. In other words, the sounder the foundations of the building (in our case, the genetic associations described in GWAS), the more resistant our construction will be, ie, the more accurate our risk prediction estimate will be.
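The weighted sum described above can be illustrated with a minimal Python sketch. The SNP identifiers and effect sizes below are hypothetical placeholders, not values taken from any GWAS, and treating a missing genotype as 0 risk alleles is a simplifying assumption made only for this example.

```python
# Hypothetical per-allele effect sizes (eg, log odds ratios).
# In practice these weights would come from a published GWAS.
EFFECT_SIZES = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": 0.20}


def polygenic_risk_score(genotype, effect_sizes):
    """Return the weighted sum of risk-allele counts (0, 1 or 2 per SNP)."""
    score = 0.0
    for snp, weight in effect_sizes.items():
        # Assumption: a SNP missing from the genotype counts as 0 risk alleles.
        count = genotype.get(snp, 0)
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {snp}: {count}")
        score += weight * count
    return score


# Hypothetical individual: 2 risk alleles at snp_a, 1 at snp_b, 0 at snp_c.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
prs = polygenic_risk_score(person, EFFECT_SIZES)
print(f"PRS = {prs:.2f}")
```

Because the score is a plain sum, its quality depends entirely on the weights: a poorly estimated effect size propagates directly into every individual's score.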

Figure 1.

Risk score distribution. Distribution of the genetic risk ranges in a population according to the accumulation of risk alleles. PRS, polygenic risk score.


Predictive performance is typically evaluated with receiver operating characteristic (ROC) curves, in which the sensitivity and specificity of the predictions are plotted at various cutoff values. In the simplest case, in which the presence of a condition or disease is to be predicted, sensitivity is the number of true positives divided by the total number of patients with the disease. A true positive is any patient who has the disease and has a positive result in the clinical prediction model. Sensitivity, or the true-positive rate, is therefore the probability of correctly classifying a patient with the disease. Specificity is the number of true negatives divided by the total number of individuals without the disease, that is to say, the probability that a healthy person will have a negative result. A ROC curve is a 2-dimensional graph in which the true-positive rate (sensitivity) is represented on the vertical axis and the false-positive rate (1 - specificity) on the horizontal axis. Therefore, the ROC graph of a prediction model represents the trade-off between true positives and false positives. The area under the ROC curve is the probability that the model assigns a higher risk to the case than to the control in a randomly chosen case-control pair. Area under the ROC curve results typically range from 0.5 (ie, random) to 1 (ie, perfect discrimination).
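These definitions can be made concrete with a short Python sketch. The scores, labels, and cutoff are invented for illustration, and the pairwise computation of the area under the curve is only one simple way to obtain it (ties are counted as one half), not a method prescribed by the studies cited here.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """labels: 1 = disease, 0 = healthy. A score >= cutoff is a positive result."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)  # true positives
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)   # false negatives
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)   # true negatives
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def area_under_roc(scores, labels):
    """Fraction of case-control pairs in which the case has the higher score."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(cases) * len(controls))


# Invented example: three cases followed by three controls.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

sens, spec = sensitivity_specificity(scores, labels, cutoff=0.5)
auc = area_under_roc(scores, labels)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Moving the cutoff changes sensitivity and specificity in opposite directions, whereas the area under the curve summarizes discrimination across all cutoffs.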

GRS, or polygenic risk scores (PRS) as many authors now call them, aim to classify the population into different risk levels rather than to predict the presence or absence of a disease (figure 1). The threshold above which a GRS is considered positive depends on the balance between the risk implied by the score, in combination with other risk factors, and the benefits associated with a possible therapeutic or lifestyle intervention.
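As a rough illustration of classifying individuals into risk levels, the following Python sketch assigns a category from the percentile rank of a score within a reference population. The 20th and 80th percentile cutoffs and the reference scores are arbitrary assumptions chosen for the example; in practice the thresholds would depend on the clinical balance described above.

```python
def risk_category(score, population_scores, low_pct=20, high_pct=80):
    """Classify a score as low, intermediate or high by percentile rank.

    The default percentile cutoffs are illustrative assumptions only.
    """
    below = sum(1 for s in population_scores if s < score)
    percentile = 100.0 * below / len(population_scores)
    if percentile < low_pct:
        return "low"
    if percentile >= high_pct:
        return "high"
    return "intermediate"


# Hypothetical reference population of scores: 0.0, 0.1, ..., 0.9.
population = [i / 10 for i in range(10)]

for individual_score in (0.05, 0.5, 0.95):
    print(individual_score, risk_category(individual_score, population))
```

The category depends on the reference population as much as on the score itself, which is one reason why thresholds derived in one population may not carry over to another.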

It is currently believed that the genetics of nonfamilial forms of the most common adult-onset heart diseases are mainly linked to a combination of common variants with small effect sizes distributed throughout the genome and rare variants of moderate effect in genes known to cause familial disease. Evidence of this has been described in recent comprehensive genomic studies, such as an extensive GWAS of coronary artery disease9 and a large-scale sequencing study of type 2 diabetes mellitus.10 Therefore, the effect of each of these common variants on an individual will be too small to predict risk on its own, but the combination of many of these common variants can be used to predict risk efficiently, especially if risk is predicted in combination with classic risk factors, such as clinical risk factors or certain environmental exposures.

One of the first publications on the implementation of GRS in cardiovascular diseases was the study by Morrison et al.,11 in which the use of an 11-polymorphism score for predicting coronary heart disease risk did not improve the predictive capacity of classic risk factors. Since then, many PRS have been published and validated, and they are especially effective in groups of patients with highly specific phenotypes. Some examples are coronary artery disease PRS aiming to individualize the decision to initiate lifetime statin therapy,12 or PRS to improve risk prediction in patients classified as being at intermediate risk of coronary heart disease according to the Framingham risk score.13

The GRS applied to young adults to predict recurrent events after myocardial infarction, as described by Rincón et al.,8 should be validated in a larger sample, since positive results are mainly seen in young patients without diabetes, and this observation is based on a small number of patients. Studies like this one open the door to the implementation of PRS in clinical prediction models, but such scores must always be validated and based on extensive data.

Last but not least, we must not forget the uncertainty in the estimation of the effect size associated with each common variant included in a genetic score, when the PRS is used to estimate the risk in other populations beyond the population studied in the GWAS. Since most of the GWAS were executed in European ancestry populations and genetic diversity among populations with different ancestry is well known, we must take special care when extending the applicability of PRS to all populations worldwide. Estimates are not transferable between populations, and ultimately the PRS is applied to an individual patient with a given geographical origin and with a characteristic genetic load, but with the same rights to health care.14

Although the number of studies with polygenic risk estimates has grown exponentially in the last 5 years, large-scale studies should be carried out to demonstrate the usefulness of polygenic risk estimation, not only in the cardiovascular field but also in other areas of human health. In this regard, the European One Million Genomes Initiative, which aims to have this number of sequenced genomes linked to clinical data by 2022 and of which Spain is a signatory member state, is perhaps the most promising project.15 Whole-genome data at this scale have the potential to drive rapid progress in precision medicine and risk prediction estimates.

Funding

This work was partially supported by Plan Estatal de I+D+i 2013-2016, Subdirección General de Evaluación y Fomento de la Investigación (ISCIII-SGEFI) from Instituto de Salud Carlos III (ISCIII) and Fondo Europeo de Desarrollo Regional (FEDER) (grant numbers PI16/00903, CB16/11/00226, CB06/07/0088).

Conflicts of interest

None declared.


References

[1] Z. Laksman, A.S. Detsky. Personalized medicine: understanding probabilities and managing expectations. J Gen Intern Med., (2011), 26 pp. 204-206. http://dx.doi.org/10.1007/s11606-010-1515-6

[2] N. Katsanis. The continuum of causality in human genetic disorders. Genome Biol., (2016), 17 pp. 233. http://dx.doi.org/10.1186/s13059-016-1107-9

[3] A. Buniello, J.A.L. MacArthur, M. Cerezo, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res., (2019), 47 pp. D1005-D1012. http://dx.doi.org/10.1093/nar/gky1120

[4] P.M. Visscher, N.R. Wray, Q. Zhang, et al. 10 years of GWAS discovery: biology, function, and translation. Am J Hum Genet., (2017), 101 pp. 5-22. http://dx.doi.org/10.1016/j.ajhg.2017.06.005

[5] G. Abraham, M. Inouye. Genomic risk prediction of complex human disease and its clinical application. Curr Opin Genet Dev., (2015), 33 pp. 10-16. http://dx.doi.org/10.1016/j.gde.2015.06.005

[6] A. Torkamani, N.E. Wineinger, E.J. Topol. The personal and clinical utility of polygenic risk scores. Nat Rev Genet., (2018), 19 pp. 581-590. http://dx.doi.org/10.1038/s41576-018-0018-x

[7] X. Wang, G. Strizich, Y. Hu, T. Wang, R.C. Kaplan, Q. Qi. Genetic markers of type 2 diabetes: progress in genome-wide association studies and clinical application for risk prediction. J Diabetes., (2016), 8 pp. 24-35. http://dx.doi.org/10.1111/1753-0407.12323

[8] L.M. Rincón, M. Sanmartín, G.L. Alonso, et al. A genetic risk score predicts recurrent events after myocardial infarction in young adults. Rev Esp Cardiol., (2020), 73 pp. 623-631

[9] M. Nikpay, A. Goel, H.H. Won, et al. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat Genet., (2015), 47 pp. 1121-1130. http://dx.doi.org/10.1038/ng.3396

[10] C. Fuchsberger, J. Flannick, T.M. Teslovich, et al. The genetic architecture of type 2 diabetes. Nature., (2016), 536 pp. 41-47. http://dx.doi.org/10.1038/nature18642

[11] A.C. Morrison, L.A. Bare, L.E. Chambless, et al. Prediction of coronary heart disease risk using a genetic risk score: the Atherosclerosis Risk in Communities Study. Am J Epidemiol., (2007), 166 pp. 28-35. http://dx.doi.org/10.1093/aje/kwm060

[12] J.L. Mega, N.O. Stitziel, J.G. Smith, et al. Genetic risk, coronary heart disease events, and the clinical benefit of statin therapy: An analysis of primary and secondary prevention trials. Lancet., (2015), 385 pp. 2264-2271. http://dx.doi.org/10.1016/S0140-6736(14)61730-X

[13] C. Iribarren, M. Lu, E. Jorgenson, et al. Clinical Utility of Multimarker Genetic Risk Scores for Prediction of Incident Coronary Heart Disease: A Cohort Study Among Over 51 000 Individuals of European Ancestry. Circ Cardiovasc Genet., (2016), 9 pp. 531-540. http://dx.doi.org/10.1161/CIRCGENETICS.113.000378

[14] A.R. Martin, C.R. Gignoux, R.K. Walters, et al. Human demographic history impacts genetic risk prediction across diverse populations. Am J Hum Genet., (2017), 100 pp. 635-649. http://dx.doi.org/10.1016/j.ajhg.2017.03.004

[15] G. Saunders, M. Baudis, R. Becker, et al. Leveraging European infrastructures to access 1 million human genomes by 2022. Nat Rev Genet., (2019), 20 pp. 702. http://dx.doi.org/10.1038/s41576-019-0178-3


REVISTA ESPAÑOLA DE

CARDIOLOGÍA

Editorial