NLP | Center for Precision Medicine

Naïve Electronic Health Record phenotype identification for Rheumatoid arthritis.

Carroll RJ, Eyler AE, Denny JC. Naïve Electronic Health Record phenotype identification for Rheumatoid arthritis. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium. 2011(2011). 189-96. PMID: 22195070 [PubMed] PMCID: PMC3243261

Electronic Health Records (EHRs) provide a real-world patient cohort for clinical and genomic research. Phenotype identification using informatics algorithms has been shown to replicate known genetic associations found in clinical trials and observational cohorts. However, development of accurate phenotype identification methods can be challenging, requiring significant time and effort.

Detecting abbreviations in discharge summaries using machine learning methods.

Wu Y, Rosenbloom ST, Denny JC, Miller RA, Mani S, Giuse DA, Xu H. Detecting abbreviations in discharge summaries using machine learning methods. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium. 2011(2011). 1541-9. PMID: 22195219 [PubMed] PMCID: PMC3243185

Recognition and identification of abbreviations is an important, challenging task in clinical natural language processing (NLP). A comprehensive lexical resource comprising all common, useful clinical abbreviations would have great applicability. The authors present a corpus-based method to create a lexical resource of clinical abbreviations using machine-learning (ML) methods, and test its ability to automatically detect abbreviations in hospital discharge summaries.

Predicting warfarin dosage in European-Americans and African-Americans using DNA samples linked to an electronic health record.

Ramirez AH, Shi Y, Schildcrout JS, Delaney JT, Xu H, Oetjens MT, Zuvich RL, Basford MA, Bowton E, Jiang M, Speltz P, Zink R, Cowan J, Pulley JM, Ritchie MD, Masys DR, Roden DM, Crawford DC, Denny JC. Predicting warfarin dosage in European-Americans and African-Americans using DNA samples linked to an electronic health record. Pharmacogenomics. 2012 Mar;13(13). 407-18. PMID: 22329724 [PubMed] PMCID: PMC3361510 NIHMSID: NIHMS364371.

Warfarin pharmacogenomic algorithms reduce dosing error, but perform poorly in non-European-Americans. Electronic health record (EHR) systems linked to biobanks may allow for pharmacogenomic analysis, but they have not yet been used for this purpose.

Detecting temporal expressions in medical narratives.

Reeves RM, Ong FR, Matheny ME, Denny JC, Aronsky D, Gobbel GT, Montella D, Speroff T, Brown SH. Detecting temporal expressions in medical narratives. International journal of medical informatics. 2013 Feb;82(82). 118-27. PMID: 22595284 [PubMed]

Clinical practice and epidemiological information aggregation require knowing when, how long, and in what sequence medically relevant events occur. The Temporal Awareness and Reasoning Systems for Question Interpretation (TARSQI) Toolkit (TTK) is a complete, open source software package for the temporal ordering of events within narrative text documents. TTK was developed on newspaper articles. We extended TTK to support medical notes using Veterans Affairs (VA) clinical notes and compared the extended system to the original TTK.

A study of transportability of an existing smoking status detection module across institutions.

Liu M, Shah A, Jiang M, Peterson NB, Dai Q, Aldrich MC, Chen Q, Bowton EA, Liu H, Denny JC, Xu H. A study of transportability of an existing smoking status detection module across institutions. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium. 2012(2012). 577-86. PMID: 23304330 [PubMed] PMCID: PMC3540509

Electronic Medical Records (EMRs) are valuable resources for clinical observational studies. A patient's smoking status is a key factor in many diseases, but it is often embedded in narrative text. Natural language processing (NLP) systems have been developed for this specific task, such as the smoking status detection module in the clinical Text Analysis and Knowledge Extraction System (cTAKES). This study examined the transportability of the cTAKES smoking module to Vanderbilt University Hospital's EMR data.

A comparative study of current Clinical Natural Language Processing systems on handling abbreviations in discharge summaries.

Wu Y, Denny JC, Rosenbloom ST, Miller RA, Giuse DA, Xu H. A comparative study of current Clinical Natural Language Processing systems on handling abbreviations in discharge summaries. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium. 2012(2012). 997-1003. PMID: 23304375 [PubMed] PMCID: PMC3540461

Clinical Natural Language Processing (NLP) systems extract clinical information from narrative clinical texts in many settings. Previous research mentions the challenges of handling abbreviations in clinical texts, but provides little insight into how well current NLP systems recognize and interpret abbreviations. In this paper, we compared the performance of three existing clinical NLP systems in handling abbreviations: MetaMap, MedLEE, and cTAKES.

A hybrid system for temporal information extraction from clinical text.

Tang B, Wu Y, Jiang M, Chen Y, Denny JC, Xu H. A hybrid system for temporal information extraction from clinical text. Journal of the American Medical Informatics Association : JAMIA. 20(20). 828-35. PMID: 23571849 [PubMed] PMCID: PMC3756274

The objective was to develop a comprehensive temporal information extraction system that can identify events, temporal expressions, and their temporal relations in clinical text. This project was part of the 2012 i2b2 clinical natural language processing (NLP) challenge on temporal information extraction.

Automated identification of drug and food allergies entered using non-standard terminology.

Epstein RH, St Jacques P, Stockin M, Rothman B, Ehrenfeld JM, Denny JC. Automated identification of drug and food allergies entered using non-standard terminology. Journal of the American Medical Informatics Association : JAMIA. 20(20). 962-8. PMID: 23748627 [PubMed] PMCID: PMC3756276

An accurate computable representation of food and drug allergy is essential for safe healthcare. Our goal was to develop a high-performance, easily maintained algorithm to identify medication and food allergies and sensitivities from unstructured allergy entries in electronic health record (EHR) systems.

Applying active learning to high-throughput phenotyping algorithms for electronic health records data.

Chen Y, Carroll RJ, Hinz ER, Shah A, Eyler AE, Denny JC, Xu H. Applying active learning to high-throughput phenotyping algorithms for electronic health records data. Journal of the American Medical Informatics Association : JAMIA. 2013 Dec;20(20). e253-9. PMID: 23851443 [PubMed] PMCID: PMC3861916

Generalizable, high-throughput phenotyping methods based on supervised machine learning (ML) algorithms could significantly accelerate the use of electronic health records data for clinical and translational research. However, they often require large numbers of annotated samples, which are costly and time-consuming to review. We investigated the use of active learning (AL) in ML-based phenotyping algorithms.

Response to 'Use of an algorithm for identifying hidden drug-drug interactions in adverse event reports' by Gooden et al.

Tatonetti NP, Denny JC, Altman RB. Response to 'Use of an algorithm for identifying hidden drug-drug interactions in adverse event reports' by Gooden et al. Journal of the American Medical Informatics Association : JAMIA. 2013 May;20(20). 591 p. PMID: 23876381 [PubMed] PMCID: PMC3628071
