PhD Defense, 9/30: Kevin "KJ" Krause | Department of Biomedical Informatics

September 28, 2022

Details of this week's PhD student defense are below:

FRIDAY, SEPTEMBER 30, 2022 at 9:00 AM CT


Title: "Applying Network Analysis and Supervised Learning to Electronic Clinical Notes to Improve Operational Suicide Risk Prevention at an Academic Medical Center"

Location: 2525 West End Avenue, 8th Floor, Room 8110
ZOOM LINK HERE

Abstract: Suicide is a significant and rising threat to public health. In the United States, 47,500 people died from suicide in 2019, a 10-year increase of 30%. Informaticians studying suicide are working on improved ways to analyze suicide risk factors and predict the probability of future suicidal behavior.

Vanderbilt University Medical Center currently identifies at-risk patients with the Vanderbilt Suicide Attempt and Ideation Likelihood risk model (VSAIL), an operational suicide prevention tool using structured electronic health record data to predict suicide attempts. VSAIL does not consider free-text clinical notes, which have proven effective in many other clinical prediction models. In these studies, we use clinical notes to identify suicide risk factors and improve the VSAIL risk model. We ascertain suicide attempt events with billing codes and use natural language processing to extract bag-of-words representations of clinical notes from the Health Data Repository Initiative at VUMC.

We generate cooccurrence-based risk networks for suicidal ideation, suicide attempt, and suicidal ideation evolving to suicide attempt. We train machine learning models (logistic regression, random forest, and gradient boosting machine) to predict suicide attempts within 30 days of a hospital visit using the clinical terms present in clinical notes from the prior 90 days. We also train an early-fusion model on the combined features from the structured and clinical note models, and a late-fusion model on the predictions made by the structured and clinical note models.
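As a rough illustration of the cooccurrence idea described above, the sketch below counts how often pairs of terms appear in the same note and keeps the pairs that meet a threshold as network edges. The function name `cooccurrence_edges`, the `min_count` parameter, the whitespace tokenization, and the toy notes are all assumptions made for this example. They are not taken from the dissertation, whose actual pipeline may differ substantially.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(notes, min_count=1):
    """Count, for each unordered term pair, the number of notes containing both terms.

    Hypothetical helper for illustration only; not the author's method.
    """
    pair_counts = Counter()
    for note in notes:
        # Bag-of-words view: lowercase, split on whitespace, drop duplicates.
        terms = sorted(set(note.lower().split()))
        # Every unordered pair of distinct terms in the note is one cooccurrence.
        pair_counts.update(combinations(terms, 2))
    # Keep only pairs seen in at least `min_count` notes as network edges.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Made-up toy notes, not real clinical text.
toy_notes = [
    "insomnia depression",
    "depression anxiety insomnia",
    "anxiety fracture",
]
edges = cooccurrence_edges(toy_notes, min_count=2)
print(edges)
```

In this toy input only one pair of terms appears together in two or more notes, so the resulting network has a single weighted edge.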

Last, we compare the performance of the four models (structured, clinical note, early-fusion, and late-fusion) on a common validation test set to select the optimal model. Our networks replicate existing risk factor findings and provide additional insight into the degree to which risk factors behave as independent morbidities or as interacting comorbidities with other risk factors. On a 240k-patient validation cohort, our clinical note model outperforms the previously developed structured model (average precision = 0.33 and 0.19, respectively; p < 0.001), and the late-fusion model outperforms all other models (average precision = 0.41, p < 0.001). These results suggest that clinical notes alone contain rich information absent from the structured record and that the structured and clinical note data complement one another as inputs to clinical prediction models.
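For readers unfamiliar with the average precision metric reported above, here is a minimal sketch of one common way to compute it: rank patients by predicted risk and average the precision at each rank where a true positive appears. The function name `average_precision` and the toy labels and scores are assumptions for this example, and the sketch assumes no tied scores. It is not the evaluation code used in the dissertation.

```python
def average_precision(y_true, y_score):
    """Average the precision at each rank where a positive case appears.

    Illustrative sketch; assumes binary labels (0/1) and no tied scores.
    """
    # Rank cases from highest to lowest predicted risk.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision among the top `rank` cases at this positive.
            precisions.append(hits / rank)
    # With no positives the metric is undefined; return 0.0 by convention here.
    return sum(precisions) / hits if hits else 0.0


# Made-up toy data: 1 = event occurred, 0 = no event.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
ap = average_precision(toy_labels, toy_scores)
print(round(ap, 4))
```

Because the metric rewards placing true positives near the top of the ranking, it suits rare outcomes, where accuracy alone can be misleading.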