Why Not Use Existing Knowledge: Bayesian Statistics

doi:10.1016/j.rec.2016.08.015


Vol. 69. Num. 12.

Pages 1234-1235 (December 2016)

Letter to the editor

Daniel Hernández-Vaquero, Rocío Díaz, Jacobo Silva, César Morís



To the Editor,

We read with interest the article by Aranceta-Bartrina et al. [1], whose objective was “to describe the prevalences of overall obesity and abdominal obesity in a representative sample of the Spanish population”.

We presume that the authors’ true objective was to describe not the prevalence of obesity in the sample, but rather the true prevalence of obesity in the Spanish population. To do so, they selected a sample of 3966 individuals, ensuring it was representative, and then used it to calculate the percentage of individuals with obesity. To extrapolate these results to the Spanish population, they calculated 95% confidence intervals.

Frequentist statistics based on significance tests, confidence intervals, and hypothesis testing are widely used nowadays. The main advantages of this approach are its simplicity and easy reproducibility, as many of the calculations can be done manually. The main disadvantage is that it does not provide a rational answer to clinical questions. The original question, “What is the true prevalence of obesity in the Spanish population?” cannot be answered intelligibly using this type of statistics.

The authors [1] state that the rate of obesity was 21.6% (95% confidence interval, 19.0%-24.2%). To understand this interval, one must imagine taking repeated samples using the same model, such that 95% of the resulting intervals include the true population value [2]. Although this is difficult to grasp, it does not mean that there is a 95% probability that the prevalence of obesity in the Spanish population is between 19.0% and 24.2%; therefore, the interval does not address the original question.
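The repeated-sampling reading of a confidence interval can be checked with a small simulation. The following is a minimal Python sketch, assuming simple random sampling, a hypothetical true prevalence of 20%, and a normal-approximation (Wald) interval. None of these assumptions is taken from the ENPE study, whose design was more complex; only the sample size of 3966 comes from the text.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) 95% confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def coverage(true_prevalence, n, repetitions, seed=0):
    """Fraction of repeated samples whose interval contains the true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Draw one simple random sample of size n from the population.
        successes = sum(rng.random() < true_prevalence for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= true_prevalence <= high:
            hits += 1
    return hits / repetitions


# ASSUMPTION: a true prevalence of 0.20 is hypothetical; n = 3966 is from the text.
print(coverage(true_prevalence=0.20, n=3966, repetitions=1000))
```

The printed fraction should be close to 0.95. The 95% describes the long-run behaviour of the procedure over many samples, not the probability that any single interval contains the true value.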

Bayesian statistics are an alternative to frequentist statistics. The Bayesian approach is more complex and may require Markov chain Monte Carlo simulations [2,3], but it has the advantage of answering questions such as this one intuitively, and it takes existing knowledge into account. Instead of “confidence intervals”, it uses “credible intervals”. A 95% credible interval is the range that contains, for example, the true population value with a probability of 95%.

This type of statistics is based on Bayes' theorem. It combines the prior probability with experience or observation to calculate the a posteriori probability. This means that each new study is seen not as separate or independent from existing knowledge, but as adding new information and contributing to the creation of new knowledge; this then serves as a starting point for subsequent studies [2].
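As an illustration of Bayes' theorem applied to a prevalence, consider one common special case. The symbols are assumptions introduced here and do not come from the cited studies: \theta is the population prevalence, y the number of individuals with obesity among n observed, and Beta(a, b) the prior distribution.

```latex
p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int_0^1 p(y \mid \theta')\, p(\theta')\, d\theta'}
  \;\propto\; \underbrace{\theta^{y} (1-\theta)^{n-y}}_{\text{likelihood}}
              \;\underbrace{\theta^{a-1} (1-\theta)^{b-1}}_{\text{prior}}
  \quad\Longrightarrow\quad
  \theta \mid y \sim \mathrm{Beta}(a + y,\; b + n - y)
```

In this case the posterior is again a beta distribution, so the new study simply adds its counts to those already encoded in the prior.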

Reading this article, one is reminded of the 2012 publication by Gutiérrez-Fisac et al. [4], whose objective was also to describe the prevalence of obesity in Spain by studying 12 883 individuals. According to the data provided, the prevalence of obesity in persons aged between 18 and 64 years in their sample was 19.78%. A Bayesian analysis would take these data as existing information and update them with the new study, obtaining deeper knowledge in the form of a credible interval.

In this approach, for example, if one takes a beta distribution with parameters (1898, 7700) as the a priori probability of obesity [4] (a prior mean of 1898/9598, or 19.78%), models the obesity variable with a Bernoulli distribution, and then adds the data obtained by Aranceta-Bartrina et al. [1], after 12 500 iterations and a burn-in period of 2500 one would obtain an a posteriori obesity prevalence of 20.1% with a 95% credible interval of 19.4% to 20.8%. That is, this time there would indeed be a 95% probability that the overall prevalence of obesity in Spain is between 19.4% and 20.8%. The Figure shows a histogram representing the distribution of obesity according to Markov chain Monte Carlo simulations.

Figure.

Histogram representing the obesity variable after 12 500 Markov chain Monte Carlo iterations using the Metropolis-Hastings algorithm.
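The kind of calculation described above can be sketched in a few lines of Python. This is a minimal random-walk Metropolis-Hastings sampler written for illustration, not the software or settings used for the Figure. Two assumptions are made: the number of individuals with obesity in the new study is approximated as 21.6% of 3966 (the exact count is not given here), and the 12 500 iterations are taken to include the 2500 burn-in iterations. The proposal step size and starting value are likewise hypothetical choices.

```python
import math
import random

# Prior Beta(1898, 7700): prior mean 1898 / 9598, about 19.78%.
PRIOR_A, PRIOR_B = 1898, 7700
# ASSUMPTION: counts approximated as 21.6% of 3966; the exact count is not stated.
N_NEW = 3966
Y_NEW = round(0.216 * N_NEW)


def log_posterior(theta):
    """Unnormalised log posterior: beta prior plus Bernoulli likelihood."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    log_prior = (PRIOR_A - 1) * math.log(theta) + (PRIOR_B - 1) * math.log(1 - theta)
    log_likelihood = Y_NEW * math.log(theta) + (N_NEW - Y_NEW) * math.log(1 - theta)
    return log_prior + log_likelihood


def metropolis_hastings(iterations=12_500, burn_in=2_500, step=0.01, seed=1):
    """Random-walk Metropolis-Hastings; returns the draws kept after burn-in."""
    rng = random.Random(seed)
    theta = 0.2  # hypothetical starting value
    current = log_posterior(theta)
    draws = []
    for i in range(iterations):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= burn_in:
            draws.append(theta)
    return draws


def summarize(draws):
    """Posterior mean and equal-tailed 95% credible interval from the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    mean = sum(ordered) / n
    return mean, ordered[int(0.025 * n)], ordered[int(0.975 * n) - 1]


mean, low, high = summarize(metropolis_hastings())
print(f"posterior mean {mean:.3f}, 95% credible interval {low:.3f} to {high:.3f}")
```

With these assumed counts the exact posterior is Beta(2755, 10 809), whose mean is 2755/13 564, or about 20.3%. This is close to, but not identical with, the 20.1% reported above, which may reflect different input counts. Because the beta prior is conjugate to the Bernoulli likelihood, simulation is not strictly needed in this simple case; it is shown only to mirror the procedure described.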


This interval is similar to, and lies within, the confidence interval provided by Aranceta-Bartrina et al. [1] (19.0%-24.2%); when studies are similar in design, the confidence interval and the credible interval tend to be similar [2], although this is not necessarily the case. If Bayesian statistics are not used, there are 2 options: pay attention to only 1 of the studies and ignore the other (even if the methodology of both is appropriate) or conduct a third study that generates more evidence and acts as a “tie breaker”, even in the knowledge that it will not answer the original question.

References

[1]

J. Aranceta-Bartrina, C. Pérez-Rodrigo, G. Alberdi-Aresti, N. Ramos-Carrera, S. Lázaro-Masedo.

Prevalence of general obesity and abdominal obesity in the Spanish adult population (aged 25-64 years) 2014-2015: The ENPE study.

Rev Esp Cardiol., (2016), 69 pp. 579-587

http://dx.doi.org/10.1016/j.rec.2016.02.009

[2]

J. Thompson.

The problem of priors.

Bayesian analysis with STATA., Stata Press, pp. 1-8

[3]

M. Gandhi, B. Mukherjee, D. Biswas.