Optimising the management of severe infections in intensive care with reinforcement learning.

Abstract Winner: Decision Support & Hospital Monitoring

Author: Matthieu Komorowski

BACKGROUND Sepsis (severe infection) is a major healthcare problem. It is the third highest cause of mortality worldwide and the most expensive condition treated in hospitals. Machine learning has been suggested as a new approach to support decision-making in healthcare. A clear parallel can be drawn between the framework of reinforcement learning (RL) and medical decision-making: RL deals with the problem of an agent acting in an environment so as to maximise its reward, while a physician's goal is to find a policy (a sequence of therapeutic choices) that improves the patient's recovery. A cornerstone of the early management of sepsis is to normalise tissue perfusion through the administration of intravenous fluids and/or vasopressors. We hypothesised that the prescription of these drugs could be modelled as a Markov Decision Process (MDP) and optimised using RL.

METHODS We identified a cohort of 19,449 patients with sepsis from MIMIC-III, a large intensive care database. An extensive set of parameters was extracted, including demographics, vital signs and lab values. In a purely data-driven approach, the state space of the MDP was generated by clustering the raw time series into 2,000 states. The action space was defined by the doses of the two drugs of interest. Reward and penalty signals were assigned to survival and death, respectively. We applied offline policy evaluation to actual patients' trajectories to estimate the policy of the physicians. This process produced an offline Q function, which represents the relationship between the dose of drugs administered and the risk of mortality. The optimal policy was estimated using dynamic programming. We used 5-fold cross-validation, and all results were averaged across 500 different models built from the dataset.

RESULTS The median Q value of the physicians' policy was 9 (interquartile range [IQR] 9.1), which corresponds to a mortality risk of 16.2% (standard error of the mean [SEM] 0.03%). The optimal policy recommended actions with a median Q value of 40.2 (IQR 24.1), consistent with a mortality risk of 9.25% (SEM 0.11%). Following the optimal policy could therefore lead to an absolute reduction in mortality of about 7% (16.2% versus 9.25%), which could translate into up to 90,000 lives saved annually in Europe. On average, the optimal policy recommended more vasopressors (mean difference 0.058 mcg/kg/min of norepinephrine-equivalent, SD 0.15, p<0.001) and less intravenous fluid (mean difference -249.68 mL per 4 h, SD 1,144, p<0.001) than was actually administered. The observed mortality of test subjects was plotted against the difference between the dose actually received and the dose recommended. Mortality was lowest when the dose given was close to the dose recommended. Giving more or less of either drug led to worse outcomes, in a dose-dependent fashion.

CONCLUSIONS AI can be applied to complex medical decision problems and can suggest therapeutic decisions at the patient level that are, on average, better than those of physicians. Such research helps us envision what the future of healthcare might look like, with intelligent tools embedded at the bedside and providing physicians with real-time guidance on optimal patient management.
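
The modelling steps described in METHODS (discrete states, dose-based actions, a terminal reward for survival and penalty for death, evaluation of the clinicians' policy, and dynamic programming for the optimal policy) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the reward magnitudes, the discount factor or the estimation method, so the values of +100 and -100, the discount of 0.99, the count-based transition estimates, the function names `estimate_mdp` and `q_function`, and the four-state toy data are all assumptions made for illustration.

```python
import numpy as np

def estimate_mdp(transitions, n_states, n_actions):
    """Count-based estimates of T[s, a, s'] and the behaviour policy pi_b[s, a]."""
    counts = np.zeros((n_states, n_actions, n_states))
    for s, a, s_next in transitions:
        counts[s, a, s_next] += 1.0
    sa_counts = counts.sum(axis=2)        # visits to each (state, action) pair
    observed = sa_counts > 0              # actions actually seen in the data
    # Unobserved (state, action) pairs keep an all-zero transition row.
    T = np.divide(counts, sa_counts[:, :, None],
                  out=np.zeros_like(counts), where=observed[:, :, None])
    s_counts = sa_counts.sum(axis=1, keepdims=True)
    pi_b = np.divide(sa_counts, s_counts,
                     out=np.zeros_like(sa_counts), where=s_counts > 0)
    return T, pi_b, observed

def q_function(T, reward, terminal, gamma, observed, policy=None, n_iter=200):
    """Iterate the Bellman backup: evaluate `policy`, or optimise if it is None."""
    V = np.zeros(T.shape[0])
    for _ in range(n_iter):
        # Q[s, a] = sum over s' of T[s, a, s'] * (reward[s'] + gamma * V[s'])
        Q = T @ (reward + gamma * V)
        if policy is None:
            # Optimal control, restricted to actions observed in the data.
            best = np.where(observed, Q, -np.inf).max(axis=1)
            V = np.where(observed.any(axis=1), best, 0.0)
        else:
            # Policy evaluation: expected Q under the given policy.
            V = (policy * Q).sum(axis=1)
        V[terminal] = 0.0                 # nothing happens after discharge or death
    return Q

# Toy data (hypothetical): states 0 and 1 are patient states,
# state 2 = survived and state 3 = died (both absorbing).
transitions = [(0, 0, 1), (0, 0, 1), (0, 1, 3), (1, 0, 2), (1, 1, 3)]
reward = np.array([0.0, 0.0, 100.0, -100.0])   # assumed reward and penalty
terminal = np.array([False, False, True, True])

T, pi_b, observed = estimate_mdp(transitions, n_states=4, n_actions=2)
q_clinicians = q_function(T, reward, terminal, 0.99, observed, policy=pi_b)
q_optimal = q_function(T, reward, terminal, 0.99, observed)
recommended = np.where(observed, q_optimal, -np.inf).argmax(axis=1)
```

In this sketch `q_clinicians` plays the role of the Q function of the physicians' policy and `q_optimal` that of the optimal policy, while `recommended` holds the greedy action for each state. Restricting the maximisation to observed actions is a design choice of the sketch, made so that the recommended action is always one that appears in the data for that state.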

Co-Author/Co-Investigator Names/Professional Title: Dr Matthieu Komorowski 1,2,3; Prof Leo A. Celi 3; Prof Anthony C. Gordon 1; Dr A. Aldo Faisal 2,4,5
1 Section of Anaesthetics, Pain Medicine and Intensive Care, Department of Surgery and Cancer, Imperial College London, UK
2 Department of Bioengineering, Imperial College London, UK
3 Laboratory of Computational Physiology, Harvard-MIT Division of Health Sciences and Technology, USA
4 Department of Computing, Imperial College London, UK
5 Medical Research Council Clinical Sciences Centre, UK

Funding Acknowledgement (If Applicable): Matthieu Komorowski is funded by the UK Engineering and Physical Sciences Research Council via an Imperial College President’s PhD Scholarship.