
3.3: Why isn't there a bigger brouhaha about this problem in the developmental literature?


    The most interesting discussions come when we suggest that, to conduct this lifespan study, all we have to do is use the entire set of behavioral categories in our observational catalog, and then we will have the coding system we need to observe at any age. One or two students invariably point out that we can’t just do that. Behaviors that are prototypical aggression at one age, like biting at age 2, are markers of something else entirely if we see them in older children or adults. A 12-year-old who bites someone is not only showing aggression; they are showing signs of psychopathology. And the problem runs in both directions: do we think that preschoolers gossip? A 20-year-old who punches someone is not at all the same as a 3-year-old who hits; the law even codifies this distinction, since the 20-year-old can be arrested for the same behavior that would get the 3-year-old a 10-minute time out. So the point is that the same behaviors at different ages are likely to be markers of different constructs.

    So what are our options?

    We usually then consider the alternative: creating behavioral indicators that seem just right (i.e., valid) for our age ranges and then sticking to those. On the one hand, that seems like an excellent idea; who could be faulted for prioritizing validity? On the other hand, how do we then trace changes in aggression across age? How do we create estimates of those differential trajectories that developmentalists are always searching for? Every time we cross a developmental period, we leave our previous behavioral categories behind and fall over a cliff: we lose comparability across time. How can a researcher draw a meaningful line from hitting at age 3 to social exclusion at age 13? It just doesn’t make sense.

    So that’s the developmental measurement equivalence problem, right?

    That’s it in a nutshell. If we measure things in exactly the same way, it can mean that we are measuring something different for people of different ages, so we have a validity problem. However, if we measure things in different ways, we never know if we are measuring the same thing at different ages, so we have a comparability problem.

    So how do developmentalists typically solve this problem?

    Believe it or not, they typically respect the problem of measurement equivalence by staying within the age groups circumscribed by their key measures. They define their age group of interest based on the range of ages for which their key measure is valid. That is, they follow their participants up to the gap created by the new and different measure, and then they stop and turn around. By respecting the age-delimited validity of measures, researchers have successively cut up age into narrower and narrower bands.

    The downside of this strategy is also apparent. If each research group focuses only on a small age range (and you would probably be surprised by how small an age range can be sliced; witness researchers who specialize in studying the second half of the first year of life), then we have inadvertently converted all “developmental analyses” into cross-sectional ones, in which we have to compare findings across participants of different age groups. Taken together, this creates the equivalent of a cross-sectional study in which we have used different measures for each of our age-graded sub-samples, which makes knitting together our developmental story a very dicey proposition. It is true that, by each of us sticking to our age-valid measures, we are individually safe, but together we have no metric with which to bridge the age groups we have created. In fact, we would argue that, partly because of this strategy, so-called developmental research over the last twenty years has devolved into a greater and greater focus on individual differences within smaller and smaller windows. As you can imagine, cumulatively such a strategy prevents the accumulation of any truly developmental information, that is, information about age changes across developmental periods and about different pathways across those periods.

    Why isn't there a bigger brouhaha about this problem in the developmental literature?

    There are probably many reasons, but mostly because researchers have found ways around the problem, without always thinking carefully about the consequences of their solutions. One of the most common strategies, based on the current dominance of cognitivist meta-theories, is to reconceptualize our phenomena up to the level of appraisals once we enter the age ranges where appraisals can easily be measured via surveys. So many researchers who are interested in proximal processes, like social support, parent-child interactions, teacher-student interactions, or peer groups, have decided to study them using measures of children’s and adolescents’ appraisals of these proximal processes. These appraisals are everywhere (remember Sameroff’s ubiquitous “representations”?), and researchers’ reasons for relying on them are often pragmatic: surveys are much more easily administered than observations, and, the argument goes, it is not what parents or teachers or peers are actually doing that shapes children’s development, but what children and youth perceive or take away from those interactions, as captured by their appraisals, for example, of parental warmth (“My Mom let me know she loves me”) or teacher autonomy support (“My teacher is always telling me what to do”).

    Some of the earliest trends in this direction can be traced to the measurement of social support in adults. In research on stress and coping, it was assumed that social support was a good thing (as is apparent from its label), but it turned out that people who actually utilized social support were not better off; they tended to be worse off, in terms of emotional distress and other indicators of functioning. Greater utilization of social support is a marker for trouble—it crops up when a person is dealing with more severe stressful events, when a person has fewer psychological resources or is more fragile, and so on. It also turned out that not all social support is that supportive—it can be intrusive or controlling or make recipients feel incompetent (Ryan & Solky, 1996). In response to this pile of surprising findings, researchers turned to the construct of perceptions of the availability of social support. Now there was a measure that was well-behaved: it was positively correlated with well-being and high functioning.

    What’s wrong with studying appraisals instead of actual proximal processes?

    The goals are understandable. Researchers have indeed created developmental comparability by measuring appraisals, but in excluding the messy bits (the proximal processes) that are changing with age, researchers have also unintentionally excluded the part of our phenomena that is doing most of the heavy lifting. The easiest way to see the problem is to run a thought experiment in which we as researchers have completed our excellent program of study showing the crucial role of children’s appraisals of the supportiveness of their interactions with (pick one or more) teachers, parents, or peers in shaping, for example, their academic self-perceptions, motivation, engagement, coping, learning, and school success. Now we move from the “explanation” to the “optimization” phase of our research program, and we suddenly realize that we have no advice for teachers, parents, or peers, because we have learned nothing about the real day-to-day interactions that shape appraisals. In fact, as mentioned in earlier chapters, if we follow our train of research to its logical conclusion, our interventions would consist of working directly with children and youth themselves and trying to influence their appraisals (since that was our explanatory mechanism), perhaps by trying to persuade them that their parents and teachers really do like them, even if they don't think so. It is obvious that such interventions would offend our sensibilities: we want to improve the context itself so that children and students are having interactions with their teachers and parents that naturally result in their feeling loved and supported.

    However, once we get out of our participants’ heads and back to where we belong, which is amidst proximal processes, we can see that the pesky problem of measurement equivalence is rampant. The kinds of interactions with a teacher that make a first-grader feel cared about are very different from those that would make a fifth- or ninth-grader feel cared about, as any parent knows who has tried to kiss their fifth-grader good-bye when dropping them off at school. So the take-home message of this long story is that the strategies researchers use to solve the problem of measurement equivalence can inadvertently throw out the proverbial baby with the bath water.

    Are there any good options?

    Yes, let’s start with two technical solutions. The first is to conduct longitudinal studies and simply use the measure that is most valid for each age group as you go (Measure A for the youngest ages, followed by Measure B, and so on). Analyses then focus on inter-individual, cross-time connections between the two measures: the extent to which participants’ standing on Measure A at the earlier age predicts their standing on Measure B at the later age. These measures are valid but not directly comparable. The good news is that this strategy has allowed us to cross over into new developmental periods; that is why it is a favorite strategy in long-term longitudinal studies. At the same time, however, it has two downsides: (1) when correlations are relatively low or non-significant, it is not possible to determine whether this is due to discontinuity in the phenomena or to a lack of comparability in the measures; and (2) it produces no information about intra-individual change in the level of the phenomena (i.e., trajectories) or about different pathways.

    The second strategy is pictured in Figure 21.2: build a bridge by creating a series of overlapping age groups and measures. This strategy takes advantage of the fact that most measures are valid not for a single age point but for a range of ages. Researchers can use the measure that is valid for the younger ages while children are younger, and then begin the measure that is valid for the next age group while continuing to use the previous measure. As a result, researchers have information on both measures from the age groups at the “seam” between the two measures, where both should be valid. This allows researchers to directly and empirically examine the comparability of the two measures, as shown in the ovals in Figure 21.2. Of course, the downside of this strategy is the extra work for researchers, and especially for participants, in completing multiple measures of the same construct.

    Insert Figure 21.2