Modeling drug exposure data in electronic medical records: an application to warfarin.

Abstract

Identifying patients' drug exposure is critical to drug-related research based on electronic medical records (EMRs). Drug information is often embedded in clinical narratives, and drug regimens change frequently for reasons such as intolerance or insurance issues, which makes accurate modeling challenging. Here, we developed an informatics framework that determines patient drug exposure histories from EMRs by combining natural language processing (NLP) and machine learning (ML) technologies. The framework consists of three phases: 1) drug entity recognition: identifying drug mentions; 2) drug event detection: labeling each drug mention with a status (e.g., "on" or "stop"); and 3) drug exposure modeling: predicting whether a patient is taking a drug at a given time from the status and temporal information associated with the mentions. We applied the framework to determine patient warfarin exposure at hospital admission and achieved 87% precision, 79% recall, and an area under the receiver operating characteristic curve of 0.93.
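To make the three phases concrete, the following is a minimal illustrative sketch and not the implementation evaluated here. It substitutes simple rules for the NLP and ML components: a regular-expression lexicon for entity recognition, a keyword window for status labeling, and a "most recent mention wins" rule for exposure modeling. All names (`DRUG_NAMES`, `STOP_CUES`, `DrugEvent`, `on_drug_at`) and the parameters `window` and `max_gap_days` are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

# Hypothetical lexicons (assumptions); a real system would use a drug
# dictionary and a trained status classifier instead of keyword lists.
DRUG_NAMES = ("warfarin", "coumadin")
STOP_CUES = {"stop", "stopped", "discontinue", "discontinued", "hold", "held"}


@dataclass
class DrugEvent:
    note_date: date
    drug: str
    status: str  # "on" or "stop"


def recognize_mentions(text: str) -> List[Tuple[str, int]]:
    """Phase 1: return (drug name, character offset) for each drug mention."""
    pattern = re.compile(r"\b(" + "|".join(DRUG_NAMES) + r")\b", re.IGNORECASE)
    return [(m.group(1).lower(), m.start()) for m in pattern.finditer(text)]


def label_status(text: str, offset: int, window: int = 40) -> str:
    """Phase 2: label a mention "stop" if a stop cue occurs nearby, else "on"."""
    context = text[max(0, offset - window): offset + window].lower()
    words = set(re.findall(r"[a-z]+", context))
    return "stop" if words & STOP_CUES else "on"


def extract_events(note_date: date, text: str) -> List[DrugEvent]:
    """Run phases 1 and 2 on a single dated note."""
    return [
        DrugEvent(note_date, drug, label_status(text, offset))
        for drug, offset in recognize_mentions(text)
    ]


def on_drug_at(events: List[DrugEvent], when: date, max_gap_days: int = 180) -> bool:
    """Phase 3 (rule-based stand-in for the ML model): the most recent
    mention on or before `when` decides, unless it is older than
    `max_gap_days`, in which case the patient is assumed not exposed."""
    prior = [e for e in events if e.note_date <= when]
    if not prior:
        return False
    latest = max(prior, key=lambda e: e.note_date)
    if (when - latest.note_date).days > max_gap_days:
        return False
    return latest.status == "on"


# Example with two made-up notes for one patient.
notes = [
    (date(2010, 1, 5), "Patient started on warfarin 5 mg daily."),
    (date(2010, 3, 1), "Coumadin was discontinued due to bleeding."),
]
events = [e for d, t in notes for e in extract_events(d, t)]
admitted_on_drug = on_drug_at(events, date(2010, 2, 1))
```

A learned model would replace the last function by combining the statuses and time gaps of all prior mentions into features rather than relying on the single most recent mention.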