8.1: Introduction

    The ability to identify visual information is important for the survival of humans and other animals. When you walk around downtown, you might identify buildings, traffic signs, an approaching car, or the faces of people on the street. We effortlessly recognize and classify visually presented objects and faces with high accuracy (Biederman, 1987), even though each object can vary tremendously in appearance (Logothetis & Sheinberg, 1996). This automatic processing in the visual recognition system lets us build conceptual representations both by generalizing a novel object to an existing category and by identifying characteristics shared across different kinds of things (Grill-Spector et al., 2001). For example, when you see a golden retriever on the street, you can categorize it as a dog, distinguish it from a poodle, and still notice the features the two share, such as having four legs and a tail. Although vision serves various functions (DiCarlo & Cox, 2007), in the present review recognition refers to a task that includes both identification and categorization: identification, in which one recognizes a specific object or face among others, and categorization, in which one recognizes an object as a member of a class, such as a dog among other object classes (Poggio & Ullman, 2013).

    Although computerized recognition systems do not completely duplicate human recognition performance, studying such artificial models contributes to our understanding of the processes underlying human visual recognition (Pinto et al., 2008). A large body of literature has examined whether computational approaches yield a realistic theory of human and animal object recognition. Because more than 50 percent of the neocortex is involved in visual processing (Felleman & Van Essen, 1991), it is not surprising that this ability is difficult to emulate computationally. Early computational approaches focused primarily on recognizing three-dimensional (3D) objects, including artifacts (e.g., buildings, tables, and automobiles), animals, and human faces. The main problem for such methods is that objects are first represented two-dimensionally on the retina, even though we perceive them as 3D, and the resulting image varies with pose and lighting (Ullman, 1996). More recently developed computational models, especially those inspired by the brain, are able to recognize meaningful patterns in objects (e.g., Lazebnik, Schmid, & Ponce, 2006; Mutch & Lowe, 2006; Wang, Zhang, & Fei-Fei, 2006; Zhang, Berg, & Malik, 2006).

    A natural way to approach this general theme is to first review the basic capacities of the primate recognition system. After a brief description of some general principles of object recognition, this review explores specific effects and phenomena reported in the object recognition literature. It then discusses whether each computational theory of pattern classification can explain these phenomena. Finally, it considers a possible solution, motivated by the ventral visual stream in the brain, to the challenges that object recognition poses for modern computational models.


    This page titled 8.1: Introduction is shared under a not declared license and was authored, remixed, and/or curated by Matthew J. C. Crump via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.