Development of a natural language processing system to identify timing and status of colonoscopy testing in electronic medical records.

Abstract

Colorectal cancer (CRC) screening rates are low despite proven benefits. We developed natural language processing (NLP) algorithms to identify temporal expressions and status indicators, such as "patient refused" or "test scheduled." We incorporated the algorithms into the KnowledgeMap Concept Identifier system to detect references to completed colonoscopies within electronic text. The modified NLP system was evaluated using 200 randomly selected electronic medical records (EMRs) from a primary care population aged ≥50 years. The system detected completed colonoscopies with recall and precision of 0.93 and 0.92, respectively. The system outperformed a query of colonoscopy billing codes for determining screening status.
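
As an illustration of what status and timing detection involves, the following is a minimal Python sketch. It is not the KnowledgeMap Concept Identifier implementation, which the abstract does not describe. The cue phrases, the `classify_mention` function, and the year-only date pattern are all hypothetical and chosen for illustration. The sketch labels a sentence that mentions colonoscopy as refused, scheduled, or (by default) completed, and extracts a four-digit year if one is present.

```python
import re

# Hypothetical cue phrases; the actual lexicon used by the system is not
# given in the abstract.
STATUS_CUES = {
    "refused": re.compile(r"\b(refused|declined|does not want)\b", re.I),
    "scheduled": re.compile(r"\b(scheduled|will schedule|referred for)\b", re.I),
}

# Simplified temporal expression: a four-digit year in the 1900s or 2000s.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")

COLONOSCOPY = re.compile(r"\bcolonoscopy\b", re.I)


def classify_mention(sentence):
    """Return (status, year) for a sentence mentioning colonoscopy, else None.

    status is "refused", "scheduled", or "completed" (the default when no
    cue phrase matches); year is an int, or None if no year is found.
    """
    if not COLONOSCOPY.search(sentence):
        return None

    status = "completed"
    for label, pattern in STATUS_CUES.items():
        if pattern.search(sentence):
            status = label
            break

    match = YEAR.search(sentence)
    year = int(match.group(0)) if match else None
    return status, year
```

For example, `classify_mention("Colonoscopy in 2005 was normal.")` yields a completed test dated 2005, while `classify_mention("Patient refused colonoscopy.")` yields a refused status with no date. A real system would also need to handle negation, relative dates such as "3 years ago," and cues that span sentence boundaries.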