
2.5: Clerical Applications


    Computational classification can also be applied to clerical work in the medical field. Currently, medical billers and coders review doctors’ notes and patient charts, annotate them for relevant information, and send that information to insurance companies for proper billing. Medical billers and coders are skilled professionals trained in the vocabularies of the Unified Medical Language System (UMLS), which includes the International Classification of Diseases (ICD) and Current Procedural Terminology (CPT), among others. These vocabularies unambiguously identify medical concepts, so that insurance companies know exactly why a patient visited the doctor and can bill accordingly. While medical billers and coders are necessary to the modern healthcare system, they can make mistakes in their annotations, and efficiency can always be improved.

    Researchers have been applying computational classification methods to medical billing and coding problems to create systems that automate these tasks. Karimi et al. experimented with such tasks in their paper “Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods”. Their system applies deep learning to the auto-coding of radiology reports using the ICD vocabulary and examines how deep learning fares with a smaller data set. Interestingly, they chose to use both domain-specific and out-of-domain data for their training sets: the in-domain set consisted of radiology reports annotated with ICD-9 codes, and the out-of-domain set was IMDB, a movie-review data set used for sentiment analysis. Their best deep learning neural network classification result was comparable to that of the SVM and logistic regression classifiers also used in the experiment. Automatic billing and coding classification systems have been attempted by other researchers, including Pestian et al. in their paper “A Shared Task Involving Multi-label Classification of Clinical Free Text”. In this experiment, Pestian et al. sought to use ICD codes to annotate clinical data, much as a medical biller and coder would manually; instead, they used classification and text extraction techniques to pull relevant information from the clinical data, label it with an ICD code, and store it in an XML schema. This approach was reasonably successful, but still not as accurate overall as a manual coder.
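To make the classification setup concrete, here is a minimal bag-of-words Naïve Bayes sketch that assigns a diagnosis code to a short report snippet. This is not the architecture from either paper; the training snippets are invented, and the labels are merely modeled on ICD-9-style codes for illustration.

```python
# Minimal sketch: bag-of-words Naive Bayes for assigning ICD-style codes
# to report snippets. Training texts and labels are illustrative only.
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (text, label) pairs. Returns word counts per label and label counts."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label maximizing log prior + smoothed log likelihood."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)  # add-one smoothing
        for w in text.lower().split():
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

examples = [
    ("chest xray shows pneumonia in right lung", "486"),
    ("consolidation consistent with pneumonia", "486"),
    ("fracture of the left femur on imaging", "821.00"),
    ("displaced femur fracture seen on xray", "821.00"),
]
wc, lc = train(examples)
print(classify("right lung pneumonia", wc, lc))
```

Real systems replace the word counts with learned features (TF-IDF vectors, embeddings) and far larger training corpora, but the pipeline shape, text in, code out, is the same.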

    Another popular medical vocabulary from the UMLS, often used in computational classification, is the Concept Unique Identifier, or CUI. As mentioned earlier in this paper, CUI codes map unique concepts to specific codes describing them. CUIs are meant to eliminate ambiguity among terms, especially abbreviations and acronyms. CUI codes are used in medical billing and coding, but they are more widely used in medical journal databases such as PubMed and Medline. Jimeno-Yepes et al. created a test data set from the medical subject headings (MeSH) of Medline articles, extracting 203 ambiguous terms, including abbreviations and acronyms. All 203 terms had at least two CUI codes they could be associated with, and the proper code depended on the context. For example, the ambiguous acronym “AA” had two associated CUI codes, one for “alcoholics anonymous” and another for “amino acid”. Jimeno-Yepes et al. then found approximately 200 Medline abstracts for each ambiguous term and tagged them with the proper CUI code. They called their final data set “MSH-WSD”, standing for MeSH word sense disambiguation. The MSH-WSD data set is now a popular test set in the biomedical text normalization community.
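A toy version of this disambiguation task can be sketched as follows: pick the sense whose context keywords best overlap the surrounding text. The sense names stand in for CUI codes (they are not real UMLS identifiers), and the keyword sets are hand-picked rather than learned, unlike the supervised approaches the MSH-WSD data set supports.

```python
# Sketch of context-based disambiguation of the acronym "AA", in the spirit
# of the MSH-WSD setup. Sense names are placeholders, not real CUI codes;
# keyword sets are hand-picked for illustration.
SENSES = {
    "CUI_ALCOHOLICS_ANONYMOUS": {"alcohol", "meeting", "sobriety", "support", "group"},
    "CUI_AMINO_ACID": {"protein", "sequence", "residue", "peptide", "chain"},
}

def disambiguate(context):
    """Return the sense whose keyword set overlaps the context words most."""
    words = set(context.lower().split())
    return max(SENSES, key=lambda sense: len(SENSES[sense] & words))

print(disambiguate("the protein sequence contains an AA residue"))
```

A trained classifier over the ~200 tagged abstracts per term would replace the hand-picked keyword sets with learned context features.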

    A similar application of classifying ambiguous acronyms in the UMLS was studied by Liu et al. in their paper “A Study of Abbreviations in the UMLS”. Liu et al. took advantage of the typical format in many papers where an abbreviation is introduced with its expanded form in parentheses next to it. Using this method, they were able to extract 163,666 unique abbreviations and their full forms from the UMLS with a precision of 97.5% and a recall of 96%. About 33% of the extracted abbreviations with six or fewer characters were ambiguous, having multiple possible meanings. This method for extracting abbreviations and their full forms has been applied by other researchers, including Jimeno-Yepes et al. in creating the MSH-WSD data set.
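The “full form (ABBR)” pattern can be sketched with a regular expression plus an initial-letter check. This is a simplified stand-in for the alignment strategy such systems actually use, and the heuristic here (match each abbreviation letter to a preceding word’s first letter) is an assumption for illustration.

```python
# Minimal sketch of parenthesized-abbreviation extraction: find "...words (ABBR)"
# and keep the pair only when the abbreviation's letters match the initials of
# the immediately preceding words. A simplification of alignment-based methods.
import re

def extract_abbreviations(text):
    pairs = []
    for match in re.finditer(r"((?:\w+[ -]){1,6})\((\w{2,6})\)", text):
        preceding, abbr = match.group(1).strip(), match.group(2)
        words = re.split(r"[ -]", preceding)
        candidate = words[-len(abbr):]  # last |ABBR| words before the parenthesis
        if len(candidate) == len(abbr) and all(
            w[0].lower() == a.lower() for w, a in zip(candidate, abbr)
        ):
            pairs.append((" ".join(candidate), abbr))
    return pairs

text = "Terms in the Unified Medical Language System (UMLS) include amino acid (AA)."
print(extract_abbreviations(text))
```

Initial-letter matching misses abbreviations that skip words or use mid-word letters, which is where the more careful alignment in the published work earns its high precision and recall.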

    Clerical tasks are not limited to medical billing and coding in the clinical field; patients often answer various questions regarding their chart and family history at a doctor’s visit. Llanos et al. proposed an automatic classification system for doctor-patient questions, focusing on questions whose answers would need to be looked up in a chart, such as “do you cough every day?” or “are your parents still alive?”. Questions were classified using rules; for example, “do you cough every day?” receives the semantic annotations “symptom” and “frequency”. To test question understanding, in the hope of eventually being able to generate responses, Llanos et al. used a linear SVM and two Naïve Bayes classifiers (Multinomial and Gaussian). Their results across all classifiers were comparable, ranging between 65% and 87%. Such question and answer classification can support future applications, such as remote question answering when a patient has a question but their physician is not available.
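The rule-based annotation step can be sketched as keyword rules that map question words to semantic labels. The label names and keyword lists below are illustrative assumptions, not the actual rules from Llanos et al.

```python
# Sketch of rule-based semantic annotation for doctor-patient questions.
# Labels and keyword lists are illustrative, not the published rule set.
RULES = {
    "symptom":   {"cough", "pain", "fever", "nausea"},
    "frequency": {"every", "often", "daily", "always"},
    "family":    {"parents", "mother", "father", "siblings"},
}

def annotate(question):
    """Return the set of semantic labels whose keywords appear in the question."""
    words = set(question.lower().rstrip("?").split())
    return {label for label, keywords in RULES.items() if keywords & words}

print(annotate("Do you cough every day?"))
```

In the published system, annotations like these become features for the SVM and Naïve Bayes classifiers rather than final outputs.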


    This page titled 2.5: Clerical Applications is shared under a not declared license and was authored, remixed, and/or curated by Matthew J. C. Crump via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.
