Development of an ensemble resource linking MEDications to their Indications (MEDI).

Understanding medication-disease relationships is critical for distinguishing indications from adverse effects, and medication exposures serve as important markers of disease and severity in electronic medical records (EMRs). We created a computable medication-indication (MEDI) resource by applying natural language processing and ontology relationships to four public medication resources. Physicians evaluated the accuracy of the medication-indication relationships.

A natural language processing algorithm to define a venous thromboembolism phenotype.

Deep venous thrombosis and pulmonary embolism are diseases associated with significant morbidity and mortality. Known risk factors account for only a slight majority of venous thromboembolic disease (VTE), with the remainder of the risk presumably related to unidentified genetic factors. We designed a general-purpose natural language processing (NLP) algorithm to retrospectively capture both acute and historical cases of thromboembolic disease in a de-identified electronic health record.

Validation and enhancement of a computable medication indication resource (MEDI) using a large practice-based dataset.

Linking medications with their indications is important for clinical care and research. We recently developed a freely available, computable medication-indication resource, called MEDI, which links RxNorm medications to indications mapped to ICD9 codes. In this paper, we identified the medications and diagnoses for 1.3 million individuals at Vanderbilt University Medical Center to evaluate the medication coverage of MEDI and then to calculate the prevalence of each indication for each medication. Our results demonstrated that MEDI covered 97.3% of medications recorded in medical records.
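
The two quantities named in this abstract, medication coverage and per-medication indication prevalence, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the toy `medi` mapping, the `patients` records, the function names, and the choice to measure coverage over distinct medication names are hypothetical and are not taken from the paper or from the actual MEDI resource.

```python
from collections import defaultdict

# Hypothetical toy stand-in for a medication-indication resource:
# medication name -> set of indication codes (ICD9-style strings).
medi = {
    "metformin": {"250.00"},
    "lisinopril": {"401.9", "428.0"},
}

# Hypothetical patient records: (patient id, medications, diagnosis codes).
patients = [
    ("p1", {"metformin", "lisinopril"}, {"250.00", "401.9"}),
    ("p2", {"lisinopril"}, {"428.0"}),
    ("p3", {"lisinopril", "drug_x"}, {"401.9"}),
]


def medication_coverage(patients, resource):
    """Fraction of distinct recorded medications present in the resource.

    Assumption: coverage is counted over distinct medication names,
    not over individual prescriptions or mentions.
    """
    recorded = set()
    for _, meds, _ in patients:
        recorded |= meds
    if not recorded:
        return 0.0
    covered = {m for m in recorded if m in resource}
    return len(covered) / len(recorded)


def indication_prevalence(patients, resource):
    """For each (medication, indication) pair, the fraction of patients
    on that medication who also carry that indication as a diagnosis."""
    exposed = defaultdict(int)
    with_indication = defaultdict(int)
    for _, meds, diagnoses in patients:
        for med in meds:
            if med not in resource:
                continue
            exposed[med] += 1
            # Count only indications that appear among the patient's diagnoses.
            for indication in resource[med] & diagnoses:
                with_indication[(med, indication)] += 1
    return {
        pair: count / exposed[pair[0]]
        for pair, count in with_indication.items()
    }


coverage = medication_coverage(patients, medi)
prevalence = indication_prevalence(patients, medi)
```

On this toy data, two of the three distinct medications appear in the mapping, and the prevalence dictionary holds one entry per medication-indication pair observed in at least one patient.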

Syntactic parsing of clinical text: guideline and corpus development with handling ill-formed sentences.

To develop, evaluate, and share: (1) syntactic parsing guidelines for clinical text, with a new approach to handling ill-formed sentences; and (2) a clinical Treebank annotated according to the guidelines. To document the process and findings for readers with similar interest.

Parsing clinical text: how good are the state-of-the-art parsers?

Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there have been no formal evaluations or comparisons of their performance in the medical domain.

Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies.

We repurposed existing genotypes in DNA biobanks across the Electronic Medical Records and Genomics network to perform a genome-wide association study for primary hypothyroidism, the most common thyroid disease. Electronic selection algorithms incorporating billing codes, laboratory values, text queries, and medication records identified 1317 cases and 5053 controls of European ancestry within five electronic medical records (EMRs); the algorithms' positive predictive values were 92.4% and 98.5% for cases and controls, respectively.

Associations of autoantibodies, autoimmune risk alleles, and clinical diagnoses from the electronic medical records in rheumatoid arthritis cases and non-rheumatoid arthritis controls.

The significance of non-rheumatoid arthritis (RA) autoantibodies in patients with RA is unclear. The aim of this study was to assess associations of autoantibodies with autoimmune risk alleles and with clinical diagnoses from the electronic medical records (EMRs) among RA cases and non-RA controls.

Using natural language processing to provide personalized learning opportunities from trainee clinical notes.

Assessment of medical trainee learning through pre-defined competencies is now commonplace in schools of medicine. We describe a novel electronic advisor system using natural language processing (NLP) to identify two geriatric medicine competencies from medical student clinical notes in the electronic medical record: advance directives (AD) and altered mental status (AMS).

A genome-wide association study identifies variants in KCNIP4 associated with ACE inhibitor-induced cough.

  • Mosley JD, Shaffer CM, Van Driest SL, Weeke PE, Wells QS, Karnes JH, Velez Edwards DR, Wei WQ, Teixeira PL, Bastarache L, Crawford DC, Li R, Manolio TA, Bottinger EP, McCarty CA, Linneman JG, Brilliant MH, Pacheco JA, Thompson W, Chisholm RL, Jarvik GP, Crosslin DR, Carrell DS, Baldwin E, Ralston J, Larson EB, Grafton J, Scrol A, Jouni H, Kullo IJ, Tromp G, Borthwick KM, Kuivaniemi H, Carey DJ, Ritchie MD, Bradford Y, Verma SS, Chute CG, Veluchamy A, Siddiqui MK, Palmer CN, Doney A, MahmoudPour SH, Maitland-van der Zee AH, Morris AD, Denny JC, Roden DM. A genome-wide association study identifies variants in KCNIP4 associated with ACE inhibitor-induced cough. The pharmacogenomics journal. 2015 Jul 14. PMID: 26169577 [PubMed]

The most common side effect of angiotensin-converting enzyme inhibitor (ACEi) drugs is cough. We conducted a genome-wide association study (GWAS) of ACEi-induced cough among 7080 subjects of diverse ancestries in the Electronic Medical Records and Genomics (eMERGE) network. Cases were subjects diagnosed with ACEi-induced cough. Controls were subjects with at least 6 months of ACEi use and no cough. A GWAS (1595 cases and 5485 controls) identified associations on chromosome 4 in an intron of KCNIP4.

A Preliminary Study of Clinical Abbreviation Disambiguation in Real Time.

To save time, healthcare providers frequently use abbreviations while authoring clinical documents. Nevertheless, abbreviations that authors deem unambiguous often confuse other readers, including clinicians, patients, and natural language processing (NLP) systems. Most current clinical NLP systems "post-process" notes long after clinicians enter them into electronic health record systems (EHRs). Such post-processing cannot guarantee 100% accuracy in abbreviation identification and disambiguation, since multiple alternative interpretations exist.