Automated extraction of features from free text radiology reports for use in developing appropriateness criteria for radiologic examinations

Author: Daniel Behar

Problem: Since the introduction of newer CT scanners in the early 1980s, the number of X-ray examinations performed annually in the United States has risen sharply. With that increase comes an inevitable increase in inappropriate studies and unnecessary exposure of the general population to ionizing radiation. Clinical decision-making tools, such as the American College of Radiology’s Appropriateness Criteria®, are one way of reducing inappropriate studies. However, creating appropriateness criteria is a labor-intensive, time-consuming process involving a multidisciplinary team of topic experts.

Idea: Medical data mining and machine learning can be used to develop a more extensive range of appropriateness criteria by leveraging the expertise inherent in the vast amounts of data available in radiology reports. Natural language processing can extract features related to clinical information, study type, findings, and diagnosis from a large corpus of free text radiology reports. The "usefulness" of each report can be evaluated from the certainty of the terms in its findings and diagnosis, and from how closely the report relates to the clinical question. After the data set is subset by usefulness, each subset can be clustered by clinical findings, and the studies performed in each cluster can then be sorted by frequency. This approach leverages the expertise of the physicians who ordered the tests and the radiology gatekeepers who authorized them, on the assumption that the tests performed most often are the most appropriate, especially for the most useful tests. Results can be evaluated by comparing cases against a known standard, such as the ACR Appropriateness Criteria®, where one exists. The process is then repeated until convergence, altering the binning of the clinical clusters, the method of computing the usefulness score, and the function that combines usefulness and frequency into an appropriateness score. The resulting data can be used to develop a neural network for evaluating study appropriateness.
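
The scoring loop described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the author's method: the reports are taken to be already parsed into dictionaries with hypothetical `indication`, `study`, and `findings` fields, the hedging vocabulary in `HEDGE_TERMS` is a toy stand-in for a curated certainty lexicon, exact matching on the indication stands in for clustering by clinical findings, and the product of frequency and mean usefulness is only one possible combining function.

```python
from collections import defaultdict

# Hypothetical hedging vocabulary (an assumption); a real system would
# need a curated lexicon of uncertainty terms.
HEDGE_TERMS = ("possible", "cannot exclude", "may represent", "questionable")


def usefulness(findings: str) -> float:
    """Toy certainty score in (0, 1]; 1.0 means no hedging term was found."""
    text = findings.lower()
    hits = sum(1 for term in HEDGE_TERMS if term in text)
    return 1.0 / (1.0 + hits)


def rank_studies(reports, min_usefulness=0.5):
    """Rank studies per clinical indication by frequency times mean usefulness."""
    # indication -> study -> usefulness scores of the reports that were kept
    groups = defaultdict(lambda: defaultdict(list))
    for report in reports:
        score = usefulness(report["findings"])
        if score >= min_usefulness:  # subset the data by usefulness
            groups[report["indication"]][report["study"]].append(score)

    ranking = {}
    for indication, studies in groups.items():
        total = sum(len(scores) for scores in studies.values())
        scored = []
        for study, scores in studies.items():
            frequency = len(scores) / total          # share of kept reports
            mean_usefulness = sum(scores) / len(scores)
            scored.append((study, frequency * mean_usefulness))
        ranking[indication] = sorted(scored, key=lambda item: item[1], reverse=True)
    return ranking


# Invented example reports, for illustration only.
reports = [
    {"indication": "flank pain", "study": "CT abdomen",
     "findings": "Obstructing 4 mm stone in the left ureter."},
    {"indication": "flank pain", "study": "CT abdomen",
     "findings": "No hydronephrosis. No stone."},
    {"indication": "flank pain", "study": "abdominal radiograph",
     "findings": "Possible calcification, cannot exclude stone."},
    {"indication": "flank pain", "study": "ultrasound",
     "findings": "Mild left hydronephrosis."},
]

ranking = rank_studies(reports)
for study, score in ranking["flank pain"]:
    print(f"{study}: {score:.2f}")
```

In this toy data the heavily hedged radiograph report falls below the usefulness threshold and is dropped, so only the remaining studies are ranked. In the iterative process described above, the threshold, the grouping, and the combining function would be the parts adjusted between rounds and checked against a known standard.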

Summary: Artificial intelligence can be used to leverage the inherent expertise in the vast corpus of free text radiology reports to develop a just-in-time clinical decision tool to reduce inappropriate imaging studies.