5.4: Descriptive statistics for cardinal data
Let us turn, finally, to a design with one nominal and one cardinal variable: a test of the third of the three hypotheses introduced at the beginning of this chapter. Again, it is restated here together with the background assumption from which it is derived:
(15) Assumption: Short items tend to occur toward the beginning of a constituent, while long items tend to occur at the end.
Hypothesis: The S-POSSESSIVE will be used with short modifiers, the OF-POSSESSIVE will be used with long modifiers.
The constructions are operationalized as before. The data are based on the same data set as before, except that cases with proper names and pronouns are excluded. The reason for this is that we already know from the first case study that pronouns, which we used as an operational definition of OLD INFORMATION, prefer the s-possessive. Since all pronouns are very short (regardless of whether we measure their length in terms of words, syllables or letters), including them would bias our data in favor of the hypothesis. This left 20 cases of the s-possessive and 154 cases of the of-possessive. To get samples of roughly equal size for expository clarity, let us select every sixth case of the of-possessive, giving us 25 cases. (Note that in a real study, there would be no good reason to create such roughly equal sample sizes; we would simply use all the data we have.)
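The selection of every sixth case can be sketched in a few lines of Python. This is only an illustration: the list `of_cases` is a hypothetical stand-in that represents each of the 154 of-possessive cases by its position in the data set, and the sketch assumes that the selection starts with the sixth case (the text does not say where it starts).

```python
# Hypothetical stand-in for the 154 of-possessive cases: only the count is
# known here, so each case is represented by its 1-based position.
of_cases = list(range(1, 155))

# Assumption: start with the sixth case, then take every sixth case after it.
sample = of_cases[5::6]

print(len(sample))              # 25
print(sample[:3], sample[-1])   # [6, 12, 18] 150
```

Under this assumption the sample contains the 25 cases mentioned above. Starting with the first case instead (`of_cases[0::6]`) would yield 26 cases.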
The variable LENGTH was defined operationally as “number of orthographic words”. We can now state the following prediction:
(16) Prediction: The mean length of modifiers of the S-POSSESSIVE should be smaller than that of the modifiers of the OF-POSSESSIVE.
Table 5.9 shows the length of head and modifier for all cases in our sample.
5.4.1 Means
How to calculate a mean (more precisely, an arithmetic mean) should be common knowledge, but for completeness’ sake, the formula is given in (17):
(17) $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n}$$
In other words, in order to calculate the mean of a set of values $x_1, x_2, \dots, x_n$ of size $n$, we add up all values and divide the sum by $n$ (or multiply it by $1/n$, which is the same thing).
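As a minimal sketch, the calculation in (17) could be written in Python as follows. The function name `arithmetic_mean` and the list `lengths` are illustrative assumptions; the values are made up and are not taken from Table 5.9. The standard library function `statistics.fmean` is shown alongside for comparison.

```python
from statistics import fmean


def arithmetic_mean(values):
    """Add up all values and divide the sum by the number of values."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty set of values is undefined")
    return sum(values) / n


# Hypothetical lengths in orthographic words (not from Table 5.9).
lengths = [1, 2, 2, 3, 7]

print(arithmetic_mean(lengths))  # 3.0
print(fmean(lengths))            # 3.0
```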
Since we have stated our hypothesis and the corresponding prediction only in terms of the modifier, we should first make sure that the heads of the two possessives do not differ greatly in length: if they did, any differences we find for the modifiers could simply be due to the fact that one of the constructions is longer in general than the other. Adding up all 20 values for the s-possessive heads gives us a total of 57, so the mean is 57 / 20 = 2.85. Adding up all 25 values of the of-possessive heads gives us a total of 59, so the mean is 59 / 25 = 2.36. We have, as yet, no way of telling whether this difference could be due to chance, but the two values are so close together that we will assume so for now. In fact, note that there is one obvious outlier (a value that is much bigger than the others): example (a 1) in Table 5.9 has a head that is 14 words long. If we assume that this is somehow exceptional and remove this value, we get a mean length of 43 / 19 = 2.26, which is almost identical to the mean length of the of-possessive's heads.
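The arithmetic for the heads can be checked with a short Python sketch. Because only the totals and sample sizes are given here, the sketch works from those rather than from the individual values; the helper name `mean_from_total` is an assumption made for illustration.

```python
def mean_from_total(total, n):
    """Mean of n values whose sum is total."""
    return total / n


s_heads = mean_from_total(57, 20)    # s-possessive heads
of_heads = mean_from_total(59, 25)   # of-possessive heads

# Removing the 14-word outlier lowers both the total and the sample size.
s_heads_no_outlier = mean_from_total(57 - 14, 20 - 1)

print(round(s_heads, 2))             # 2.85
print(round(of_heads, 2))            # 2.36
print(round(s_heads_no_outlier, 2))  # 2.26
```

Note that removing a single value changes both the numerator and the denominator, which is why one long head is enough to shift the mean of a sample of 20 by more than half a word.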
If we apply the same formula to the modifiers, however, we find that they differ substantially: the mean length of the s-possessive's modifiers is 38 / 20 = 1.9, while the mean length of the of-possessive's modifiers is more than twice as much, namely 112 / 25 = 4.48. Even if we remove the obvious outlier, example (b 18) in Table 5.9, the mean length of the of-possessive's modifiers is 92 / 24 = 3.83, which is still about twice that of the s-possessive's modifiers.
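To make the comparison of the modifiers concrete, the following Python sketch computes the two means and their ratio from the totals reported above. The variable names are illustrative, and the sketch takes the totals with and without the outlier (112 and 92) directly from the text rather than from the individual values in Table 5.9.

```python
s_mod = 38 / 20                # s-possessive modifiers
of_mod = 112 / 25              # of-possessive modifiers
of_mod_no_outlier = 92 / 24    # of-possessive modifiers without example (b 18)

print(round(s_mod, 2), round(of_mod, 2), round(of_mod_no_outlier, 2))
# 1.9 4.48 3.83

# How many times longer are the of-possessive's modifiers on average?
print(round(of_mod / s_mod, 2))             # 2.36
print(round(of_mod_no_outlier / s_mod, 2))  # 2.02
```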
Table 5.9: A sample of s- and of-possessives annotated for length of head and modifier (BROWN)