An Extension of the Bland–Altman Plot for Analyzing the Agreement of More than Two Raters
Abstract
1. Introduction
2. Materials and Methods
2.1. Derivation of an LOA for m > 2 Raters
2.2. Suggestion for Indicating Bias
Algorithm 1. Extended Bland–Altman plot for multiple raters.
1. Observe the number of raters m, the number of subjects n, and the individual ratings.
2. Determine each subject’s mean and standard deviation.
3. Determine the position of the LOA, L.
4. Estimate the 95% confidence interval for the LOA by bootstrapping.
5. Determine bias indicators for each rater.
6. Plot the subjects’ means and standard deviations, L with the confidence interval, and the bias indicators for all raters in a figure.
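The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The formulas of the algorithm are not reproduced in this text, so the sketch assumes the LOA is the pooled within-subject standard deviation times sqrt(χ²(0.95; m − 1)/(m − 1)), a multiplier that matches the "Formula (1)" values reported in the results table (1.959964 for m = 2). The percentile bootstrap over subjects and the bias indicator (each rater's mean signed deviation from the subject mean) are likewise assumptions, and the names `loa_factor` and `extended_ba` are hypothetical. Plotting (step 6) is omitted; the function returns the quantities that would be drawn.

```python
import math
import random


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def loa_factor(m, level=0.95):
    """Assumed multiplier sqrt(chi2 quantile / (m - 1)), found by bisection."""
    df = m - 1
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < level:
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < level:
            lo = mid
        else:
            hi = mid
    return math.sqrt(hi / df)


def extended_ba(ratings, reps=2000, seed=1):
    """ratings: list of n subjects, each a list of m ratings (steps 1 to 5)."""
    n, m = len(ratings), len(ratings[0])
    # Step 2: per-subject mean and sample variance (denominator m - 1).
    means = [sum(row) / m for row in ratings]
    variances = [sum((x - mu) ** 2 for x in row) / (m - 1)
                 for row, mu in zip(ratings, means)]
    # Step 3 (assumption): LOA = multiplier * pooled within-subject SD.
    factor = loa_factor(m)
    L = factor * math.sqrt(sum(variances) / n)
    # Step 4 (assumption): percentile bootstrap, resampling subjects.
    rng = random.Random(seed)
    boot = sorted(factor * math.sqrt(sum(rng.choices(variances, k=n)) / n)
                  for _ in range(reps))
    ci = (boot[int(0.025 * reps)], boot[int(0.975 * reps) - 1])
    # Step 5 (assumption): mean signed deviation of each rater from the subject mean.
    bias = [sum(row[j] - mu for row, mu in zip(ratings, means)) / n
            for j in range(m)]
    return {"means": means, "sds": [math.sqrt(v) for v in variances],
            "L": L, "ci": ci, "bias": bias}


# Simulated example data: 20 subjects, 3 raters, unit measurement error.
_rng = random.Random(0)
ratings = [[t + _rng.gauss(0.0, 1.0) for _ in range(3)]
           for t in (_rng.uniform(10.0, 20.0) for _ in range(20))]
result = extended_ba(ratings)
print(result["L"], result["ci"], result["bias"])
```

Because the pooled variance is the mean of the subject variances, resampling subjects is equivalent to resampling their variances, which keeps the bootstrap loop cheap.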
2.3. Simulation Study on Coverage
3. Results
3.1. Simulation Results
3.2. Application to Real World Data
4. Discussion
4.1. Statement of Principal Findings
4.2. Strengths and Weaknesses of the Study
4.3. Strengths and Weaknesses in Relation to Other Studies, Particularly any Differences in the Results
4.4. Meaning of the Study: Possible Mechanisms and Implications for Clinicians or Policymakers
4.5. Unanswered Questions and Future Research
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
BA plot | Bland–Altman plot
LOA | limit of agreement
Raters | Formula (1) | Original observations, n = 10 | Original observations, n = 20 | Original observations, n = 100 | New observations, n = 10 | New observations, n = 20 | New observations, n = 100 |
---|---|---|---|---|---|---|---|
95% quantile | | | | | | | |
m = 2 | 1.959964 | 1.907143 | 1.939001 | 1.954703 | 2.188023 | 2.069437 | 1.963274 |
m = 3 | 1.730818 | 1.681992 | 1.709057 | 1.726664 | 1.880320 | 1.780882 | 1.741569 |
m = 4 | 1.613973 | 1.572103 | 1.609547 | 1.612125 | 1.706937 | 1.654243 | 1.624840 |
m = 5 | 1.540108 | 1.525253 | 1.536958 | 1.534751 | 1.594541 | 1.57577 | 1.552975 |
Coverage | | | | | | | |
m = 2 | | 0.9575 | 0.9528 | 0.9506 | 0.9247 | 0.9400 | 0.9497 |
m = 3 | | 0.9594 | 0.9542 | 0.9507 | 0.9253 | 0.9404 | 0.9488 |
m = 4 | | 0.9595 | 0.9539 | 0.9510 | 0.9297 | 0.9417 | 0.9479 |
m = 5 | | 0.9597 | 0.9541 | 0.9508 | 0.9361 | 0.9408 | 0.9462 |
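A coverage simulation of the kind summarized in the table can be sketched as follows. The simulation design is not described in this text, so everything here is an assumption: ratings are drawn as independent standard normal values, the LOA is taken as the "Formula (1)" multiplier from the table times the pooled within-subject standard deviation, and coverage is the share of subject standard deviations at or below that LOA, counted once for the subjects used to estimate it ("original") and once for a fresh sample ("new"). The names `FACTOR`, `subject_sds`, and `coverage` are hypothetical.

```python
import math
import random

# Multipliers copied from the "Formula (1)" column of the table.
FACTOR = {2: 1.959964, 3: 1.730818, 4: 1.613973, 5: 1.540108}


def subject_sds(n, m, rng):
    """Sample SDs (denominator m - 1) of n simulated subjects rated by m raters."""
    sds = []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(m)]
        mu = sum(row) / m
        sds.append(math.sqrt(sum((x - mu) ** 2 for x in row) / (m - 1)))
    return sds


def coverage(n, m, reps=500, seed=1):
    """Share of subject SDs at or below the estimated LOA (original, new)."""
    rng = random.Random(seed)
    hit_orig = hit_new = 0
    for _ in range(reps):
        sds = subject_sds(n, m, rng)
        # Assumed LOA: table multiplier times pooled within-subject SD.
        L = FACTOR[m] * math.sqrt(sum(s * s for s in sds) / n)
        hit_orig += sum(s <= L for s in sds)
        hit_new += sum(s <= L for s in subject_sds(n, m, rng))
    total = reps * n
    return hit_orig / total, hit_new / total


cov_orig, cov_new = coverage(n=20, m=3)
print(cov_orig, cov_new)
```

The two returned proportions can be compared with the Coverage rows of the table for the same n and m; exact agreement is not expected, since the replication count, seed, and design details here are illustrative.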
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Möller, S.; Debrabant, B.; Halekoh, U.; Petersen, A.K.; Gerke, O. An Extension of the Bland–Altman Plot for Analyzing the Agreement of More than Two Raters. Diagnostics 2021, 11, 54. https://doi.org/10.3390/diagnostics11010054