Social Sci LibreTexts

2.2: Introduction


    “Big data” is a popular buzzword that appears in seemingly every conversation about data mining technology. Simply put, “big data” refers to massive, sometimes unstructured, collections of data gathered from many sources, most typically the internet. Much time and effort go into finding effective and efficient ways to make sense of this data, specifically ways to classify it accurately at minimal computational expense. Computational classification techniques were around long before big data became a buzzword, and some existed even before the age of the internet. These techniques can be supervised, meaning that they learn from human-annotated data, or unsupervised, meaning that they learn on their own without annotation. While supervised learning tends to yield superior results, human annotation is extremely expensive in time and effort and is not always practical. Given the scale of big data, few researchers have the time to annotate it, and so unsupervised learning has become more popular in recent years. Slowly but surely, unsupervised learning is evolving and becoming more accurate.
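
    To make the supervised/unsupervised distinction concrete, the following is a minimal sketch in Python rather than a method taken from this text. It assumes a toy set of two-dimensional points, uses a nearest-centroid classifier as the supervised example, and uses plain k-means as the unsupervised one. The function names, the data, and the even-spacing choice of initial cluster centers are all illustrative assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Mean of a non-empty list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def train_supervised(points, labels):
    """Supervised: build one centroid per human-provided label."""
    by_label = {}
    for p, y in zip(points, labels):
        by_label.setdefault(y, []).append(p)
    return {y: centroid(ps) for y, ps in by_label.items()}


def predict(model, point):
    """Return the label whose centroid is closest to point."""
    names = list(model)
    return names[nearest(point, [model[y] for y in names])]


def cluster_unsupervised(points, k, iterations=10):
    """Unsupervised: plain k-means, which never sees any labels.

    Initial centers are taken at even spacing through the list,
    an arbitrary choice made only to keep this sketch deterministic.
    """
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the previous center if a cluster ends up empty.
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return [nearest(p, centers) for p in points]


# Hypothetical toy data: two well-separated groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]  # the human annotation

model = train_supervised(points, labels)
print(predict(model, (1.1, 1.0)))

clusters = cluster_unsupervised(points, k=2)
print(clusters)
```

    On this toy data the two clusters found without labels coincide with the two annotated groups, but the clusters carry only arbitrary indices. Attaching a meaningful name to each group is exactly what the human annotation supplies in the supervised case.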

    One field that would benefit greatly from computational classification techniques is the biomedical domain. Medical resources are rich in information and many of them are publicly available, such as the medical journal articles found through PubMed and Medline, two online databases for medical and clinical scholarly texts. Clinical data, however, is much more difficult to come by because of privacy and consent issues. Despite this, there are publicly available datasets that are large enough to build more complex systems, such as neural networks. There is a true need for research on biomedical and clinical texts, not only as a practical task for computational classification but also as a life-saving one. Artificial intelligence systems such as IBM Watson use classification in tasks that help physicians provide top care for their patients. It would be humanly impossible to read and retain everything published in any medical subfield, but Watson can analyze thousands of documents daily. Not only can Watson maintain and update a database, but it can also help provide patients with better care by keeping doctors updated on relevant findings that may help a given patient.

    Computational classification is also useful on the clerical side of the medical field, for example in improving the medical billing and coding system. Currently, medical billers and coders work with specialized vocabularies to annotate clinical charts properly so that insurance companies can be billed. While this system is necessary, medical billers and coders are only human and can make mistakes. Automated systems for medical billing and coding have been experimented with, but matching the performance of trained professionals remains a difficult task. This paper argues that the medical field would benefit from more applications of computational classification. It reviews relevant previous general work on these algorithms, as well as their application to the medical field to date.


    This page titled 2.2: Introduction is shared under a not declared license and was authored, remixed, and/or curated by Matthew J. C. Crump via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.
