11.6: Intelligence Testing - The What, the Why, and the Who

  • Measuring Intelligence: Standardization and the Intelligence Quotient

    The goal of most intelligence tests is to measure “g”, the general intelligence factor. Good intelligence tests are reliable, meaning that they are consistent over time, and also demonstrate validity, meaning that they actually measure intelligence rather than something else. Because intelligence is such an important part of individual differences, psychologists have invested substantial effort in creating and improving measures of intelligence, and these tests are now considered the most accurate of all psychological tests.

    Intelligence changes with age. A 3-year-old who could accurately multiply 183 by 39 would certainly be intelligent, but a 25-year-old who could not do so would be seen as unintelligent. Thus understanding intelligence requires that we know the norms or standards in a given population of people at a given age. The standardization of a test involves giving it to a large number of people at different ages and computing the average score on the test at each age level.
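    The averaging step in standardization can be sketched in a few lines of Python. This is only an illustration: the scores below are invented, and the name `age_norms` is hypothetical rather than part of any real test's scoring procedure.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (age, raw score) pairs, invented purely for illustration.
samples = [(8, 20), (8, 24), (9, 27), (9, 29), (10, 30), (10, 34)]

def age_norms(samples):
    """Return the average raw score at each age level."""
    by_age = defaultdict(list)
    for age, score in samples:
        by_age[age].append(score)
    # Sort by age so the norms read from youngest to oldest.
    return {age: mean(scores) for age, scores in sorted(by_age.items())}

norms = age_norms(samples)
for age, average in norms.items():
    print(f"age {age}: average raw score {average}")
```

    A real standardization sample would include far more people at each age, but the computation has the same shape: group by age, then average.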

    Once the standardization has been accomplished, we have a picture of the average abilities of people at different ages and can calculate a person’s mental age, which is the age at which a person is performing intellectually. If we compare the mental age of a person to the person’s chronological age, the result is the Intelligence Quotient (IQ), a measure of intelligence that is adjusted for age. A simple way to calculate IQ is by using the following formula:

    IQ = mental age ÷ chronological age × 100.

    Thus a 10-year-old child who does as well as the average 10-year-old child has an IQ of 100 (10 ÷ 10 × 100), whereas an 8-year-old child who does as well as the average 10-year-old child would have an IQ of 125 (10 ÷ 8 × 100). Most modern intelligence tests are based on the relative position of a person’s score among people of the same age, rather than on this formula, but the idea of an intelligence “ratio” or “quotient” provides a good description of the score’s meaning.
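    The ratio formula and the two worked examples above can be checked with a short Python sketch. The function name `ratio_iq` is an illustrative choice, not a standard API.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return mental_age / chronological_age * 100

# A 10-year-old performing like the average 10-year-old.
print(ratio_iq(10, 10))  # 100.0

# An 8-year-old performing like the average 10-year-old.
print(ratio_iq(10, 8))   # 125.0
```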

    The Flynn Effect

    It is important that intelligence tests be standardized on a regular basis, because the overall level of intelligence in a population may change over time. The Flynn effect refers to the observation that scores on intelligence tests worldwide have increased substantially over the past decades (Flynn, 1999). Although the increase varies somewhat from country to country, the average increase is about 3 IQ points every 10 years. There are many explanations for the Flynn effect, including better nutrition, increased access to information, and more familiarity with multiple-choice tests (Neisser, 1998). But whether people are actually getting smarter is debatable (Neisser, 1997).
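    To see why outdated norms matter, the following Python sketch estimates how far scores might drift when a test has not been re-standardized. It assumes the average rate of about 3 points per decade applies as a steady, linear trend, which is a simplification; the function names are hypothetical.

```python
# Assumption: the average Flynn-effect rate quoted in the text,
# treated as a constant linear trend (a simplification).
POINTS_PER_DECADE = 3

def expected_drift(years_since_norming: float) -> float:
    """Approximate IQ-point inflation when scoring against old norms."""
    return POINTS_PER_DECADE * years_since_norming / 10

def score_on_old_norms(current_iq: float, years_since_norming: float) -> float:
    """Score a person would appear to earn on norms that are out of date."""
    return current_iq + expected_drift(years_since_norming)

# Someone who is exactly average today, scored against 20-year-old norms.
print(score_on_old_norms(100, 20))  # 106.0
```

    Under this assumption, an average person would appear above average on a test normed 20 years earlier, which is one reason periodic re-standardization is needed.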

    The Value of IQ Testing

    The value of IQ testing is most evident in educational or clinical settings. Children who seem to be experiencing learning difficulties or severe behavioral problems can be tested to ascertain whether their difficulties can be partly attributed to an IQ score that is significantly different from the mean for their age group. Without IQ testing, or another measure of intelligence, children and adults needing extra support might not be identified effectively. People also use IQ testing results to seek disability benefits from the Social Security Administration.

    IQ tests have sometimes been used in support of insidious purposes, such as the eugenics movement, which sought to improve a human population by controlled breeding to increase desirable heritable characteristics. Nevertheless, the tests remain valuable for helping those in need.

    Intelligence Tests and Those Who Created Them

    Alfred Binet & Théodore Simon - Stanford-Binet Intelligence Test

    Between 1904 and 1905, the French psychologist Alfred Binet (1857–1914) and his colleague Théodore Simon (1872–1961) worked on behalf of the French government to develop a measure that would identify children who would not be successful with the regular school curriculum. The goal was to help teachers better educate these students (Aiken, 1994).

    Binet and Simon developed what most psychologists today regard as the first intelligence test, which consisted of a wide variety of questions that included the ability to name objects, define words, draw pictures, complete sentences, compare items, and construct sentences. Binet and Simon (Binet, Simon, & Town, 1915; Siegler, 1992) believed that the questions they asked the children all assessed the basic abilities to understand, reason, and make judgments.

    Figure \(\PageIndex{1}\): (a) Alfred Binet. (b) This page is from a 1908 version of the Binet-Simon Intelligence Scale. Children being tested were asked which face, of each pair, was prettier. (Images are in the public domain)

    Soon after Binet and Simon introduced their test, the American psychologist Lewis Terman (1877–1956) at Stanford University developed an American version of Binet’s test that became known as the Stanford-Binet Intelligence Test. The Stanford-Binet is a measure of general intelligence made up of a wide variety of tasks including vocabulary, memory for pictures, naming of familiar objects, repeating sentences, and following commands.

    David Wechsler - Wechsler-Bellevue Intelligence Scale

    In 1939, David Wechsler, a psychologist who spent part of his career working with World War I veterans, developed a new IQ test in the United States. Wechsler combined several subtests from other intelligence tests used between 1880 and World War I. These subtests tapped into a variety of verbal and nonverbal skills, because Wechsler believed that intelligence encompassed “the global capacity of a person to act purposefully, to think rationally, and to deal effectively with his environment” (Wechsler, 1958, p. 7). He named the test the Wechsler-Bellevue Intelligence Scale (Wechsler, 1981). This combination of subtests became one of the most extensively used intelligence tests in the history of psychology.

    Figure \(\PageIndex{2}\): David Wechsler (Image by Comet Photo AG (Zürich) is licensed under CC BY-SA 4.0)

    Today, three intelligence tests are credited to Wechsler: the Wechsler Adult Intelligence Scale, fourth edition (WAIS-IV); the Wechsler Intelligence Scale for Children, fifth edition (WISC-V); and the Wechsler Preschool and Primary Scale of Intelligence, third edition (WPPSI-III) (Wechsler, 2002). These tests are used widely in schools and communities throughout the United States, and they are periodically normed and standardized as a means of recalibration.

    Bias of IQ Testing

    Intelligence tests and psychological definitions of intelligence have been heavily criticized since the 1970s for being biased in favor of Anglo-American, middle-class respondents and for being inadequate tools for measuring non-academic types of intelligence or talent. Intelligence changes with experience, and intelligence quotients or scores do not reflect that ability to change. What is considered smart varies culturally as well, and most intelligence tests do not take this variation into account. For example, in the West, being smart is associated with being quick: the person who answers a question the fastest is seen as the smartest. In some other cultures, being smart is associated with considering an idea thoroughly before giving an answer, so a well-thought-out, contemplative answer is the best answer.

    A Spectrum of Intellectual Development

    The results of studies assessing the measurement of intelligence show that IQ is distributed in the population in the form of a Normal Distribution (or bell curve), which is the pattern of scores usually observed in a variable that clusters around its average. In a normal distribution, the bulk of the scores fall toward the middle, with many fewer scores falling at the extremes. The normal distribution of intelligence shows that on IQ tests, as well as on most other measures, the majority of people cluster around the average (in this case, where IQ = 100), and fewer are either very smart or very dull (see below).

    Figure \(\PageIndex{3}\): The majority of people have an IQ score between 85 and 115. (Image by CNX Psychology is licensed under CC BY 4.0)

    Distribution of IQ Scores in the General Population

    This means that about 2% of people score above an IQ of 130, often considered the threshold for giftedness, and about the same percentage score below an IQ of 70, often considered the threshold for an intellectual disability.
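    The proportions quoted above can be reproduced from a normal distribution using Python's standard library. The sketch assumes a mean of 100 and a standard deviation of 15; the standard deviation is not stated in the text and is inferred here from the 85 to 115 range shown in the figure.

```python
from statistics import NormalDist

# Assumption: IQ ~ Normal(mean=100, sd=15). The sd of 15 is inferred
# from the figure's 85-115 band, not stated explicitly in the text.
iq = NormalDist(mu=100, sigma=15)

above_130 = 1 - iq.cdf(130)                 # share scoring above 130
below_70 = iq.cdf(70)                       # share scoring below 70
between_85_115 = iq.cdf(115) - iq.cdf(85)   # share within one sd of the mean

print(f"Above 130:      {above_130:.1%}")       # 2.3%
print(f"Below 70:       {below_70:.1%}")        # 2.3%
print(f"Between 85-115: {between_85_115:.1%}")  # 68.3%
```

    Under these assumptions, roughly 2% of scores fall in each tail and about two-thirds fall between 85 and 115, matching the description in the text.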

    Intellectual Disabilities

    One end of the distribution of intelligence scores is defined by people with very low IQ. Intellectual disability (or intellectual developmental disorder) is assessed based on cognitive capacity (IQ) and adaptive functioning. The severity of the disability is based on adaptive functioning, or how well the person handles everyday life tasks. About 1% of the United States population, most of them males, fulfill the criteria for intellectual developmental disorder, but some children who are given this diagnosis lose the classification as they get older and better learn to function in society. A particular vulnerability of people with low IQ is that they may be taken advantage of by others, and this is an important aspect of the definition of intellectual developmental disorder (Greenspan, Loughlin, & Black, 2001).

    One example of an intellectual developmental disorder is Down syndrome, a chromosomal disorder caused by the presence of all or part of an extra 21st chromosome. The incidence of Down syndrome is estimated at approximately 1 per 700 births, and the prevalence increases as the mother’s age increases (CDC, 2014a). People with Down syndrome typically exhibit a distinctive pattern of physical features, including a flat nose, upwardly slanted eyes, a protruding tongue, and a short neck.

    Figure \(\PageIndex{4}\): Down Syndrome is caused by the presence of all or part of an extra 21st chromosome. (Image by Vanellus Foto is licensed under CC BY-SA 3.0)

    Fortunately, societal attitudes toward individuals with intellectual disabilities have changed over the past decades. We no longer use terms such as “retarded,” “moron,” “idiot,” or “imbecile” to describe people with intellectual differences, although these were the official psychological terms used to describe degrees of what was referred to as mental retardation in the past. Laws such as the Americans with Disabilities Act (ADA) have made it illegal to discriminate on the basis of mental and physical disability.

    The normal distribution of IQ scores in the general population shows that most people have about average intelligence, while very few have extremely high or extremely low intelligence.


    At the other end of the distribution, giftedness is commonly defined as having an IQ of 130 or higher (Lally & Valentine-French, 2015). Having an extremely high IQ is clearly less of a problem than having an extremely low IQ, but there may also be challenges to being particularly smart. It is often assumed that schoolchildren who are labeled as “gifted” may have adjustment problems that make it more difficult for them to create and maintain social relationships.

    Figure \(\PageIndex{5}\): Children who get a score on an intelligence test showing an IQ of 130 or higher are labeled as gifted. (Image by Ben Mullins on Unsplash)

    As you might expect based on our discussion of intelligence, there are also different types and areas of intelligence and giftedness. Some children are particularly good at math or science, some at automobile repair or carpentry, some at music or art, some at sports or leadership, and so on. There is a lively debate among scholars about whether it is appropriate or beneficial to label some children as “gifted and talented” in school and to provide them with accelerated special classes and other programs that are not available to everyone. Although doing so may help the gifted kids (Colangelo & Assouline, 2009), it also may isolate them from their peers and make such provisions unavailable to those who are not classified as “gifted.” Testing for high IQ or for disabilities needs to be examined critically so that tests created for good ends are not used for undesirable purposes.

    How do we know so much about what children learn in schools? In the next section we’ll look at the different types of tests and what the schools are testing.

    Testing in Schools

    Children's academic performance is often measured with standardized tests. These tests include, but are not limited to, achievement tests and aptitude tests.

    Figure \(\PageIndex{6}\): Standardized tests are used to measure academic performance. (Image by Marine Corps Base Hawaii is in the public domain)

    Achievement tests are used to measure what a child has already learned. Achievement tests are often used as measures of teaching effectiveness within a school setting and as a method to make schools that receive tax dollars (such as public schools, charter schools, and private schools that receive vouchers) accountable to the government for their performance.

    Aptitude tests are designed to measure a student’s ability to learn or to determine if a person has potential in a particular program. These are often used at the beginning of a course of study or as part of college entrance requirements. The Scholastic Aptitude Test (SAT) and Preliminary Scholastic Aptitude Test (PSAT) are perhaps the most familiar aptitude tests to students in grades 6 and above. Learning test-taking skills and preparing for the SAT has become part of the training that some students in these grades receive as part of their pre-college preparation. Other aptitude tests include the MCAT (Medical College Admission Test), the LSAT (Law School Admission Test), and the GRE (Graduate Record Examination). Intelligence tests are also a form of aptitude test, because they are designed to measure a person’s ability to learn.

    What Happened to No Child Left Behind?

    In January 2002, President George W. Bush signed into law Public Law 107-110, better known as the No Child Left Behind Act of 2001, which mandated that schools administer achievement tests to students and publish the results so that parents have an idea of their children's performance. The results would also give the government information on the gaps in educational achievement between children from various social class, racial, and ethnic groups.

    Schools that showed significant gaps in these levels of performance were mandated to work toward narrowing these gaps. Educators criticized the policy for focusing too much on testing as the only indication of student performance. Target goals were considered unrealistic and set by the federal government rather than individual states. Because these requirements became increasingly unworkable for schools, changes to the law were requested.

    Figure \(\PageIndex{7}\): The No Child Left Behind Act of 2001 was signed into law in January 2002. (Image is in the public domain)
    Figure \(\PageIndex{8}\): The Every Student Succeeds Act was signed into effect in 2015. (Image is in the public domain)

    On December 10, 2015, President Obama signed into law the Every Student Succeeds Act (ESSA). This law is state-driven and focuses on expanding educational opportunities and improving student outcomes, including in the areas of high school graduation, dropout rates, and college attendance.

