Assessing the accuracy of observer-reported ancestry in a biorepository linked to electronic medical records.

The Vanderbilt DNA Databank (BioVU) is a biorepository that currently contains >80,000 DNA samples linked to electronic medical records. Although BioVU is a valuable source of samples and phenotypes for genetic association studies, it is unclear whether the administratively assigned race/ethnicity in BioVU can accurately describe and be used as a proxy for genetic ancestry.

Generating Clinical Notes for Electronic Health Record Systems.

Clinical notes summarize interactions that occur between patients and healthcare providers. With the adoption of electronic health record (EHR) and computer-based documentation (CBD) systems, there is a growing emphasis on structuring clinical notes to support reusing data for subsequent tasks. However, clinical documentation remains one of the most challenging areas for EHR system development and adoption. The current manuscript describes the Vanderbilt experience with implementing clinical documentation with an EHR system.

Identification of genomic predictors of atrioventricular conduction: using electronic medical records as a tool for genome science.

Recent genome-wide association studies of selected community populations have identified genomic signals in SCN10A that influence PR duration. The extent to which this can be demonstrated in cohorts derived from electronic medical records is unknown.

Modulators of normal electrocardiographic intervals identified in a large electronic medical record.

Traditional electrocardiographic (ECG) reference ranges were derived from studies in communities or clinical trial populations. The distribution of ECG parameters in a large population presenting to a healthcare system has not been studied.

Data from clinical notes: a perspective on the tension between structure and flexible documentation.

Clinical documentation is central to patient care. The success of electronic health record system adoption may depend on how well such systems support clinical documentation. A major goal of integrating clinical documentation into electronic health record systems is to generate reusable data. As a result, there has been an emphasis on deploying computer-based documentation systems that prioritize direct structured documentation.

The emerging role of electronic medical records in pharmacogenomics.

Health-care information technology and genotyping technology are both advancing rapidly, creating new opportunities for medical and scientific discovery. The convergence of these two technologies is now facilitating genetic association studies of unprecedented size within the context of routine clinical care. As a result, the medical community will soon be presented with a number of novel opportunities to bring functional genomics to the bedside in the area of pharmacotherapy.

The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies.

The eMERGE (electronic MEdical Records and GEnomics) Network is an NHGRI-supported consortium of five institutions to explore the utility of DNA repositories coupled to Electronic Medical Record (EMR) systems for advancing discovery in genome science. eMERGE also includes a special emphasis on the ethical, legal and social issues related to these endeavors.

Comparing content coverage in medical curriculum to trainee-authored clinical notes.

Accurate assessment and evaluation of medical curricula have long been goals of medical educators. Current methods rely on manually entered keywords and trainee-recorded logs of case exposure. In this study, we used natural language processing to compare the clinical content coverage in a four-year medical curriculum to the electronic medical record notes written by clinical trainees. The content coverage was compared for each of 25 agreed-upon core clinical problems (CCPs) and seven categories of infectious diseases. Most CCPs were covered in both corpora.

Anonymization of administrative billing codes with repeated diagnoses through censoring.

Patient-specific data from electronic medical records (EMRs) is increasingly shared in a de-identified form to support research. However, EMRs are susceptible to noise, error, and variation, which can limit their utility for reuse. One way to enhance the utility of EMRs is to record the number of times diagnosis codes are assigned to a patient when this data is shared. This is challenging, however, because such data, once released, may be leveraged to compromise patients' identities.

Mining Biomedical Literature for Terms related to Epidemiologic Exposures.

Epidemiologic studies contribute greatly to evidence-based medicine by identifying risk factors for diseases and determining optimal treatments for clinical practice. However, there has been very little effort toward automatically extracting knowledge, such as exposures, outcomes, and their relations, from epidemiologic articles. In this initial study, we developed a system consisting of a natural language processing (NLP) engine and a rule-based classifier to automatically extract exposure-related terms from the titles of epidemiologic articles.