Missing data methods for intensive care unit SOFA scores in electronic health records studies: results from a Monte Carlo simulation
Publication: Journal of Comparative Effectiveness Research
Abstract
Aim: Missing data reduce the usable sample size and can introduce bias. We tested four missing data methods on the Sequential Organ Failure Assessment (SOFA) score, a severity adjuster used in intensive care research. Methods: In a simulation study using 2015–2017 electronic health record data, we sampled the complete dataset, imposed missingness on SOFA score elements and examined the performance of four missing data methods – complete case analysis, median imputation, zero imputation (recommended by the SOFA score creators) and multiple imputation (MI) – with in-hospital mortality as the outcome. Results: MI performed well, whereas the other methods introduced varying amounts of bias or decreased the sample size. Conclusion: We recommend using MI in administrative data research when SOFA score component values are missing.
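The comparison described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the synthetic cohort, the two-component score, the missingness mechanism (missing at random, driven by the other component), the hot-deck imputation model and every function name are assumptions chosen only to keep the example short. It runs a single replicate; a Monte Carlo study would repeat it over many sampled datasets and summarize bias and coverage.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_logit(x, y, iters=25):
    """Newton-Raphson logistic regression of y on x; returns (slope, slope variance)."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    return beta[1], cov[1, 1]

# Hypothetical cohort: two SOFA-like components (0-4) and in-hospital death.
n = 5000
severity = rng.normal(size=n)
c1 = np.clip(np.round(2 + severity + rng.normal(scale=0.7, size=n)), 0, 4)
c2 = np.clip(np.round(2 + severity + rng.normal(scale=0.7, size=n)), 0, 4)
death = rng.binomial(1, 1 / (1 + np.exp(-(-3 + 0.5 * (c1 + c2)))))

# Assumed mechanism: c2 is missing more often when c1 is low (MAR given c1).
missing = rng.random(n) < np.where(c1 <= 2, 0.45, 0.15)
c2_obs = np.where(missing, np.nan, c2)

def complete_case(c1, c2_obs, death):
    keep = ~np.isnan(c2_obs)  # drop every patient with a missing component
    return fit_logit(c1[keep] + c2_obs[keep], death[keep])

def single_fill(c1, c2_obs, death, value):
    # Median or zero imputation: one fixed value replaces every missing entry.
    return fit_logit(c1 + np.where(np.isnan(c2_obs), value, c2_obs), death)

def impute_once(c1, c2_obs, death, rng):
    """Approximate Bayesian bootstrap hot deck within (c1, death) cells."""
    filled = c2_obs.copy()
    miss = np.isnan(c2_obs)
    for a in np.unique(c1):
        for d in (0, 1):
            cell = (c1 == a) & (death == d)
            need = cell & miss
            if need.sum() == 0:
                continue
            donors = c2_obs[cell & ~miss]
            if donors.size == 0:          # empty cell: fall back to all observed values
                donors = c2_obs[~miss]
            boot = rng.choice(donors, size=donors.size, replace=True)
            filled[need] = rng.choice(boot, size=need.sum(), replace=True)
    return filled

def multiple_imputation(c1, c2_obs, death, rng, m=20):
    fits = [fit_logit(c1 + impute_once(c1, c2_obs, death, rng), death) for _ in range(m)]
    est = np.array([f[0] for f in fits])
    var = np.array([f[1] for f in fits])
    # Rubin's rules: within-imputation plus inflated between-imputation variance.
    return est.mean(), var.mean() + (1 + 1 / m) * est.var(ddof=1)

results = {
    "full data": fit_logit(c1 + c2, death),
    "complete case": complete_case(c1, c2_obs, death),
    "median": single_fill(c1, c2_obs, death, np.nanmedian(c2_obs)),
    "zero": single_fill(c1, c2_obs, death, 0.0),
    "multiple imputation": multiple_imputation(c1, c2_obs, death, rng),
}
for name, (slope, variance) in results.items():
    print(f"{name:20s} slope={slope:.3f} se={np.sqrt(variance):.3f}")
```

The imputation step conditions on the outcome as well as the other component, which is the usual advice for MI when the imputed variable is a predictor; comparing each row with the "full data" row gives a rough sense of how each method distorts the score-mortality association in this toy setting.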
Information
Pages: 47–56
PubMed: 34726477
Copyright
© 2021 Daniel L Brinton. This work is licensed under the Attribution-NonCommercial-NoDerivatives 4.0 Unported License
History
Received: 22 March 2021
Accepted: 14 October 2021
Published online: 2 November 2021
Funding Information
Health Resources and Services Administration (HRSA): U66 RH31458-01-00
How to Cite
Missing data methods for intensive care unit SOFA scores in electronic health records studies: results from a Monte Carlo simulation. (2021) Journal of Comparative Effectiveness Research. DOI: 10.2217/cer-2021-0079
Citing Literature
- Qingyun Peng, Yanzi Guo, Haoyuan Tang, Shuai Liu, Wei Huang, Xinlong Chen, Shijia Zhong, Zeyuan Zhao, Haofei Wang, Wenhan Hu, Shuhe Yang, Jianfeng Xie, Ming Xue, Shuyuan Qian, Xiaojing Wu, Yingzi Huang, Interpretable machine learning for early prediction of sepsis-induced coagulopathy: a multicenter retrospective development and validation study, BMC Medical Informatics and Decision Making, 10.1186/s12911-026-03471-8, (2026).
- Jiafei Yu, Kangwei Sun, Yiping Zhou, Yushi Fan, Xinyun Zhang, Heyu Chen, Lanxin Cao, Kai Zhang, Gensheng Zhang, Update of the sequential organ failure assessment score: current status and challenges?, Frontiers in Medicine, 10.3389/fmed.2025.1733090, 12, (2026).
- Hannah F. Wang, Beena Cheriyan, Marianne Huebner, Sichao Wang, David M. Sudekum, Comparing Vasopressin and Hydrocortisone as Adjunctive Measures in Septic Shock, Annals of Pharmacotherapy, 10.1177/10600280251406750, (2026).
- Otavio T. Ranzani, Mervyn Singer, Jorge I. F. Salluh, Manu Shankar-Hari, David Pilcher, Joana Berger-Estilita, Craig M. Coopersmith, Nicole P. Juffermans, John Laffey, Matti Reinikainen, Ary Serpa Neto, Miguel Tavares, Jean-François Timsit, Maria Del Pilar Arias Lopez, Nish Arulkumaran, Diptesh Aryal, Elie Azoulay, Leo Anthony Celi, Dipayan Chaudhuri, Dylan De Lange, Jan De Waele, Claudia C. Dos Santos, Bin Du, Sharon Einav, Teresa Engelbrecht, Fathima Fazla, Ricard Ferrer, Stefano Finazzi, Tomoko Fujii, Hayley B. Gershengorn, John D. Greene, Rashan Haniffa, Sicheng Hao, Mohd Shahnaz Hasan, Steve Hollenberg, Mariachiara Ippolito, Christian Jung, Mikhail Kirov, Shigetaka Kobari, Inès Lakbar, Jeffrey Lipman, Vincent Liu, Xiaoli Liu, Suzana M. Lobo, Demetrio Magatti, Greg S. Martin, Barbara Metnitz, Philipp Metnitz, Sheila N. Myatra, Simon Oczkowski, José-Artur Paiva, Fathima Paruk, Pirkka T. Pekkarinen, Lise Piquilloud, Anssi Pölkki, Hallie C. Prescott, Annika Reintam Blaser, Ederlon Rezende, Chiara Robba, Bram Rochwerg, Stephane Ruckly, Rasoul Samei, Edward J. Schenck, Paul Secombe, Cornelius Sendagire, Moses Siaw-Frimpong, Andrew J. Simpkin, Márcio Soares, Charlotte Summers, Wojciech Szczeklik, Jukka Takala, Shiro Tanaka, Giovanni Tricella, Jean-Louis Vincent, Julia Wendon, Fernando G. Zampieri, Andrew Rhodes, Rui Moreno, Development and Validation of the Sequential Organ Failure Assessment (SOFA)-2 Score, JAMA, 10.1001/jama.2025.20516, 334, 23, (2090), (2025).
- Jianan Zhu, Deepak Pradhan, I. Obi Emeruwa, B. Corbett Walsh, The Hidden Bias of Missing Data in Crisis Standards of Care Simulation Studies: Not So Random, Rethinking Missing Data in Crisis Standards of Care Simulation Studies, Disaster Medicine and Public Health Preparedness, 10.1017/dmp.2025.10239, 19, (2025).
- Renée A.M. Tuinte, Luuk P.J. Smolenaers, Bram T. Knoop, Konstantin Föhse, Tamar J. van der Aart, Hjalmar R. Bouma, Mihai G. Netea, Katrijn Van Deun, Jaap ten Oever, Jacobien J. Hoogerwerf, Development and validation of an interpretable machine learning model for retrospective identification of suspected infection for sepsis surveillance: a multicentre cohort study, eClinicalMedicine, 10.1016/j.eclinm.2025.103401, 87, (103401), (2025).
- Emily E. Moin, Nicholas J. Seewald, Scott D. Halpern, Use of Life Support and Outcomes Among Patients Admitted to Intensive Care Units, JAMA, 10.1001/jama.2025.2163, 333, 20, (1793), (2025).
- Tara M. Westover, Marta B. Fernandes, M. Brandon Westover, Sahar F. Zafar, An Immediate Mortality Prediction Score That is Robust to Missing Data, Open Journal of Statistics, 10.4236/ojs.2025.151005, 15, 01, (73-80), (2025).
- Denise Molinnus, Michael Beulertz, Johannes Bickenbach, Gernot Marx, Carina Benstoem, Observational study of missing SOFA score data frequency in RCTs relative to ICU length of stay, Scientific Reports, 10.1038/s41598-024-67089-4, 14, 1, (2024).
- Mohammad Alrawashdeh, Michael Klompas, Chanu Rhee, The Impact of Common Variations in Sequential Organ Failure Assessment Score Calculation on Sepsis Measurement Using Sepsis-3 Criteria: A Retrospective Analysis Using Electronic Health Record Data, Critical Care Medicine, 10.1097/CCM.0000000000006338, 52, 9, (1380-1390), (2024).
