Neural Document Embeddings for Intensive Care Patient Mortality Prediction
Abstract Winner: Decision Support & Hospital Monitoring
Author: Stephanie Hyland
The growing adoption of electronic health records (EHRs) creates significant opportunities for automatic inference and data-mining techniques. Beyond addressing a wide range of clinical research questions, data-driven methods can be applied in daily clinical practice for key tasks such as decision support or patient mortality prediction. The latter task is especially important when prioritizing the allocation of scarce resources or determining the frequency and intensity of post-discharge care. A valuable part of the EHR comes in the form of text in clinical notes. Such unstructured natural-language data is more challenging to work with, yet contains important information about patient state. In this work, we focus on using such information for mortality prediction.
Using the MIMIC-III intensive care database, we show significant performance gains using a convolutional neural network model compared to a pre-existing topic modelling approach and a novel neural baseline with generic doc2vec embeddings. These improvements are especially pronounced for the difficult problem of post-discharge mortality prediction.
We focus on three partially-overlapping mortality prediction tasks: a) in the ICU, b) after discharge within 30 days, c) after discharge within 1 year. We restrict the cohort to adults with text data, and exclude notes from the 'Discharge summary' category as well as anything recorded after the patient was discharged. We compare to two baselines: the LDA-based Retrospective Topic Model from Ghassemi et al. (SIGKDD 2014), and a baseline using the distributed bag of words model (doc2vec) from Le and Mikolov (ICML 2014). In both cases, the derived text features are used in separate linear SVMs for each task. A drawback of these approaches is that they cannot recognise multi-word or multi-sentence patterns. Such complex constructions are common in medical text and contain important information which should not be discarded.
Following recent work in document classification, we adopt a two-layer architecture for our model. The first layer independently maps sentences to sentence vectors. The second layer combines these sentence vectors into a patient representation. Both layers use convolutional neural networks with max-pooling. The output of the model is the estimated mortality probability (for the given task), and the objective is the cross-entropy with the ground truth label. Because each patient has many sentences, it is necessary to replicate the loss at intermediate steps (target replication), which amounts to adding a term to our objective representing the average prediction error at the sentence level. Our sentence representation can easily incorporate additional side information, allowing us to use the note's category via a low-dimensional embedding.
Using a random 80/10/10 train/validation/test split, our model achieves an AUC of 0.963, 0.800, and 0.796 on the three tasks respectively. The doc2vec baseline achieves 0.930, 0.773, and 0.770. The LDA model, which we reproduced from Ghassemi et al. (2014) on MIMIC-III, achieves 0.930, 0.745, and 0.730. The superior performance of the CNN-based model leads us to conclude that accounting for word and phrase compositionality is crucial for identifying important text patterns. These results suggest promising directions for the use of machine learning to assist in medical practice.
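The two-layer architecture and the target-replication objective described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the parameter dictionary, the ReLU non-linearity, the per-sentence logistic output, and the `weight` on the sentence-level term are all assumptions made for illustration, and the note-category embedding is omitted.

```python
import numpy as np


def conv_max_pool(seq, filters, bias):
    """1-D convolution over a sequence, then max-over-time pooling.

    seq:     (length, dim) input vectors (words or sentences)
    filters: (n_filters, width, dim) convolution filters
    bias:    (n_filters,) bias terms
    returns: (n_filters,) pooled feature vector
    """
    n_filters, width, dim = filters.shape
    if seq.shape[0] < width:  # zero-pad so at least one window exists
        seq = np.vstack([seq, np.zeros((width - seq.shape[0], dim))])
    n_windows = seq.shape[0] - width + 1
    windows = np.stack([seq[i:i + width] for i in range(n_windows)])
    # Contract over (width, dim): result has shape (n_windows, n_filters).
    feature_map = np.tensordot(windows, filters, axes=([1, 2], [1, 2])) + bias
    return np.maximum(feature_map, 0.0).max(axis=0)  # ReLU (assumed), max-pool


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def predict(sentences, params):
    """Return (patient-level probability, per-sentence probabilities)."""
    # Layer 1: each sentence (a matrix of word vectors) -> sentence vector.
    sent_vecs = np.stack([
        conv_max_pool(s, params["word_filters"], params["word_bias"])
        for s in sentences
    ])
    # Layer 2: sequence of sentence vectors -> patient representation.
    patient_vec = conv_max_pool(sent_vecs, params["sent_filters"], params["sent_bias"])
    patient_prob = sigmoid(patient_vec @ params["w_out"] + params["b_out"])
    # Assumed form of the intermediate predictions used for target replication.
    sentence_probs = sigmoid(sent_vecs @ params["w_sent"] + params["b_sent"])
    return patient_prob, sentence_probs


def binary_cross_entropy(label, prob, eps=1e-12):
    prob = np.clip(prob, eps, 1.0 - eps)
    return -(label * np.log(prob) + (1 - label) * np.log(1 - prob))


def target_replication_loss(label, patient_prob, sentence_probs, weight=0.5):
    """Final cross-entropy plus the weighted average sentence-level error."""
    final_loss = binary_cross_entropy(label, patient_prob)
    sentence_loss = np.mean(binary_cross_entropy(label, np.asarray(sentence_probs)))
    return float(final_loss + weight * sentence_loss)


def init_params(rng, word_dim=8, n_word_filters=6, n_sent_filters=4, width=3):
    """Random parameters for the sketch (hypothetical sizes)."""
    return {
        "word_filters": rng.normal(0, 0.1, (n_word_filters, width, word_dim)),
        "word_bias": np.zeros(n_word_filters),
        "sent_filters": rng.normal(0, 0.1, (n_sent_filters, width, n_word_filters)),
        "sent_bias": np.zeros(n_sent_filters),
        "w_out": rng.normal(0, 0.1, n_sent_filters),
        "b_out": 0.0,
        "w_sent": rng.normal(0, 0.1, n_word_filters),
        "b_sent": 0.0,
    }


# Usage: one synthetic patient with five sentences of varying length.
rng = np.random.default_rng(0)
params = init_params(rng)
sentences = [rng.normal(size=(n, 8)) for n in (4, 7, 2, 9, 5)]
patient_prob, sentence_probs = predict(sentences, params)
loss = target_replication_loss(1, patient_prob, sentence_probs)
```

In this sketch the same loss would be computed separately for each of the three mortality tasks. Setting `weight=0` recovers the plain cross-entropy objective on the patient-level prediction.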
Co-Author/Co-Investigator Names/Professional Title: Paulina Grnarova & Florian Schmidt (equal contribution), Stephanie L. Hyland, Carsten Eickhoff
Funding Acknowledgement (If Applicable): ETH Zurich