Informatics | Center for Precision Medicine

Natural language processing improves identification of colorectal cancer testing in the electronic medical record.

Denny JC, Choma NN, Peterson JF, Miller RA, Bastarache L, Li M, Peterson NB. Natural language processing improves identification of colorectal cancer testing in the electronic medical record. Medical decision making : an international journal of the Society for Medical Decision Making. 32. 188-97. PMID: 21393557 [PubMed]

Difficulty identifying patients in need of colorectal cancer (CRC) screening contributes to low screening rates.

Development and evaluation of an ensemble resource linking medications to their indications.

Wei WQ, Cronin RM, Xu H, Lasko TA, Bastarache L, Denny JC. Development and evaluation of an ensemble resource linking medications to their indications. Journal of the American Medical Informatics Association : JAMIA. 20. 954-61. PMID: 23576672 [PubMed] PMCID: PMC3756263

To create a computable MEDication Indication resource (MEDI) to support primary and secondary use of electronic medical records (EMRs).

The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future.

Gottesman O, Kuivaniemi H, Tromp G, Faucett WA, Li R, Manolio TA, Sanderson SC, Kannry J, Zinberg R, Basford MA, Brilliant M, Carey DJ, Chisholm RL, Chute CG, Connolly JJ, Crosslin D, Denny JC, Gallego CJ, Haines JL, Hakonarson H, Harley J, Jarvik GP, Kohane I, Kullo IJ, Larson EB, McCarty C, Ritchie MD, Roden DM, Smith ME, Böttinger EP, Williams MS. The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future. Genetics in medicine : official journal of the American College of Medical Genetics. 2013 Oct;15. 761-71. PMID: 23743551 [PubMed] PMCID: PMC3795928 NIHMSID: NIHMS495335.

The Electronic Medical Records and Genomics Network is a National Human Genome Research Institute-funded consortium engaged in the development of methods and best practices for using the electronic medical record as a tool for genomic research. Now in its sixth year and second funding cycle, and comprising nine research groups and a coordinating center, the network has played a major role in validating the concept that clinical data derived from electronic medical records can be used successfully for genomic research.

Development of an ensemble resource linking MEDications to their Indications (MEDI).

Wei WQ, Cronin RM, Xu H, Lasko TA, Bastarache L, Denny JC. Development of an ensemble resource linking MEDications to their Indications (MEDI). AMIA Joint Summits on Translational Science proceedings AMIA Summit on Translational Science. 2013. 172 p. PMID: 24303333 [PubMed]

Understanding medication-disease relationships is critical to distinguishing indications from adverse effects, and medication exposures serve as important markers of disease and severity in electronic medical records (EMRs). We created a computable medication-indication (MEDI) resource by applying natural language processing and ontology relationships to four public medication resources. Physicians evaluated the accuracy of medication-indication relationships.

Validation and enhancement of a computable medication indication resource (MEDI) using a large practice-based dataset.

Wei WQ, Mosley JD, Bastarache L, Denny JC. Validation and enhancement of a computable medication indication resource (MEDI) using a large practice-based dataset. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium. 2013. 1448-56. PMID: 24551419 [PubMed] PMCID: PMC3900157

Linking medications with their indications is important for clinical care and research. We have recently developed a freely available, computable medication-indication resource, called MEDI, which links RxNorm medications to indications mapped to ICD9 codes. In this paper, we identified the medications and diagnoses for 1.3 million individuals at Vanderbilt University Medical Center to evaluate the medication coverage of MEDI and then to calculate the prevalence of each indication for each medication. Our results demonstrated that MEDI covered 97.3% of medications recorded in medical records.

Electronic health record design and implementation for pharmacogenomics: a local perspective.

Peterson JF, Bowton E, Field JR, Beller M, Mitchell J, Schildcrout J, Gregg W, Johnson K, Jirjis JN, Roden DM, Pulley JM, Denny JC. Electronic health record design and implementation for pharmacogenomics: a local perspective. Genetics in medicine : official journal of the American College of Medical Genetics. 2013 Oct;15. 833-41. PMID: 24009000 [PubMed] PMCID: PMC3925979 NIHMSID: NIHMS546673.

The design of electronic health records to translate genomic medicine into clinical care is crucial to successful introduction of new genomic services, yet there are few published guides to implementation.

Integrating EMR-linked and in vivo functional genetic data to identify new genotype-phenotype associations.

Mosley JD, Van Driest SL, Weeke PE, Delaney JT, Wells QS, Bastarache L, Roden DM, Denny JC. Integrating EMR-linked and in vivo functional genetic data to identify new genotype-phenotype associations. PloS one. 9. e100322. PMID: 24949630 [PubMed] PMCID: PMC4065041

The coupling of electronic medical records (EMR) with genetic data has created the potential for implementing reverse genetic approaches in humans, whereby the function of a gene is inferred from the shared pattern of morbidity among homozygotes of a genetic variant. We explored the feasibility of this approach to identify phenotypes associated with low-frequency variants using Vanderbilt's EMR-based BioVU resource. We analyzed 1,658 low-frequency non-synonymous SNPs (nsSNPs) with a minor allele frequency (MAF) < 10% collected on 8,546 subjects.

Parsing clinical text: how good are the state-of-the-art parsers?

Jiang M, Huang Y, Fan JW, Tang B, Denny J, Xu H. Parsing clinical text: how good are the state-of-the-art parsers? BMC medical informatics and decision making. 15 Suppl 1. S2. PMID: 26045009 [PubMed] PMCID: PMC4460747

Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain.

Size matters: how population size influences genotype-phenotype association studies in anonymized data.

Heatherly R, Denny JC, Haines JL, Roden DM, Malin BA. Size matters: how population size influences genotype-phenotype association studies in anonymized data. Journal of biomedical informatics. 2014 Dec;52. 243-50. PMID: 25038554 [PubMed] PMCID: PMC4260994 NIHMSID: NIHMS616937.

Electronic medical record (EMR) data are increasingly incorporated into genome-phenome association studies. Investigators hope to share data, but there are concerns that the data may be "re-identified" through the exploitation of various features, such as combinations of standardized clinical codes. Formal anonymization algorithms (e.g., k-anonymization) can prevent such violations, but prior studies suggest that the size of the population available for anonymization may influence the utility of the resulting data.

Limestone: high-throughput candidate phenotype generation via tensor factorization.

Ho JC, Ghosh J, Steinhubl SR, Stewart WF, Denny JC, Malin BA, Sun J. Limestone: high-throughput candidate phenotype generation via tensor factorization. Journal of biomedical informatics. 2014 Dec;52. 199-211. PMID: 25038555 [PubMed]

The rapidly increasing availability of electronic health records (EHRs) from multiple heterogeneous sources has spearheaded the adoption of data-driven approaches for improved clinical research, decision making, prognosis, and patient management. Unfortunately, EHR data do not always directly and reliably map to medical concepts that clinical researchers need or use.
