Comparison of four simple methods for estimating sexual dimorphism in fossils.
AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 94:465-476 (1994)

Comparison of Four Simple Methods for Estimating Sexual Dimorphism in Fossils

J. MICHAEL PLAVCAN
Department of Biological Sciences, University of Cincinnati, Cincinnati, Ohio 45221

KEY WORDS Sexual dimorphism, Coefficient of variation, Finite mixture analysis, Primates

ABSTRACT Estimating sexual dimorphism in skeletal and dental features of fossil species is difficult when the sex of individuals cannot be reliably determined. Several different methods of estimating dimorphism in this situation have been suggested: extrapolation from coefficients of variation, division of a sample about the mean or median into two subsamples which are then treated as males and females, and finite mixture analysis (specifically for estimating the maximum dimorphism that could be present in a unimodal distribution). The accuracy of none of these methods has been thoroughly investigated and compared in a controlled manner. Such analysis is necessary because the accuracy of all methods is potentially affected by fluctuations in sample size, sex ratio, or the magnitude of intrasexual variability. Computer modeling experiments show that the mean method is the least sensitive to fluctuations in these parameters and generally provides the best estimates of dimorphism. However, no method can accurately estimate low to moderate levels of dimorphism, particularly if intrasexual variability is high and sex ratios are skewed. © 1994 Wiley-Liss, Inc.

Sexual dimorphism can be a substantial component of intraspecific morphological variability, and its recognition is important in understanding morphological variation and species recognition in the fossil record (for recent discussions, see Cope, 1993; Kelley, 1993; Martin and Andrews, 1993; Plavcan, 1993; Teaford et al., 1993).
Furthermore, among primates, sexual dimorphism in canine tooth size and body weight is associated with variation in mating system and intrasexual competition (Clutton-Brock et al., 1977; Harvey et al., 1978; Leutenegger and Kelly, 1977; Kay et al., 1988; Plavcan and van Schaik, 1992; Greenfield, 1992). Following this, it has been suggested that the behavior of extinct species can be inferred on the basis of sexual dimorphism (Fleagle et al., 1980; Kay, 1982a,b). For example, in the primate fossil record polygyny has been inferred for Oligocene anthropoids (Fleagle et al., 1980), Sivapithecus (Andrews, 1983), Chinese hominoids (Wu and Oxnard, 1983), and Eocene omomyids (Krishtalka et al., 1990), and either monogamy or low-competition polygyny in australopithecines (Leutenegger and Shell, 1987; McHenry, 1991).

Such inferences of the behavior of extinct species are only as good as the estimate of dimorphism. Estimating dimorphism is a trivial task if the sex of individuals can be identified by discrete, sex-specific morphological characters or if the sample frequency distribution is clearly bimodal (in all cases, though, one must have good evidence that the "dimorphism" is not a product of the mixing of two similar species or geographic variants of a single species). However, accurately estimating dimorphism is no simple task if males and females cannot be reliably identified, either because sex-specific morphological characters are not preserved or were not present (many skeletal and dental characters, even canine teeth, are often dimorphic only in size) or because male and female distributions overlap too much.

Received June 14, 1993; accepted February 28, 1994.
J. Michael Plavcan's current address is Department of Anatomy, New York College of Osteopathic Medicine, Old Westbury, NY 11568. Address reprint requests there.

The latter problem is exacerbated by small sample sizes, which confound the recognition of bimodal distributions in dimorphic characters (Cope, 1989). Dimorphism of up to 28% (one sex larger than the other) can be hidden within unimodal distributions for sample sizes of up to 100 individuals (Godfrey et al., 1993) (Fig. 1). While the canine teeth of many anthropoid primates and their ancestors are considerably more dimorphic than this (allowing the sexing of individual specimens as long as the canine teeth are preserved), cranial and skeletal features of dimorphic primates often fall below this threshold (Godfrey et al., 1993; Godfrey, personal communication). Where male and female distributions overlap, some specimens may be identifiable as belonging to one or the other sex, but these will tend to lie at the extremes, producing overestimates of the degree of dimorphism in the population (Kelley, 1993). Such problems with estimating sexual dimorphism in extinct taxa due to uncertain sex assignment have received particular attention for hominoids and hominids (Kay, 1982a,b; Kelley, 1993; Oxnard, 1987; Leutenegger and Shell, 1987) and most recently subfossil lemurs (Godfrey et al., 1993).

Fig. 1. Frequency histograms of maxillary canine buccolingual breadth for combined-sex samples of Cercopithecus pogonias, C. cephus, C. nictitans, and Cercocebus torquatus illustrating approximate degrees of bimodality associated with different degrees of sexual dimorphism (M/F). Units are the same in each histogram. Data from Plavcan (1990).
Panel labels: C. torquatus (M/F = 1.41); C. nictitans; C. cephus; C. pogonias (M/F = 1.16). Horizontal axis: Maxillary Canine Buccolingual (mm).

Since most extinct species are represented by small samples of fragmentary remains, a method of estimating dimorphism that does not rely on visual assessments of frequency distributions and/or the prior knowledge of the sex of individuals is clearly desirable.
This would both avoid the problems of uncertain sex assignment and allow estimation of dimorphism using cranial and skeletal remains that are not easily sexed.

To get around the problem of estimating sexual dimorphism in fossil samples where the sex of individuals cannot be reliably determined, several studies have suggested that it is possible to estimate sexual dimorphism using techniques that correlate sample variability with sexual dimorphism (Fleagle et al., 1980; Kay, 1982a,b; Godfrey et al., 1993). Such techniques rely on the fact that, as dimorphism increases, combined-sex sample variability increases as a function of the separation between male and female means. Division of a sample into hypothetical male and female subsamples about either the mean or median of the sample (Godfrey et al., 1993) is not often used but represents the simplest way of estimating dimorphism in a fossil sample. Two more sophisticated methods are extrapolation of dimorphism from coefficients of variation (Fleagle et al., 1980; Kay, 1982a,b), and estimation of maximum dimorphism in unimodal samples with finite mixture analysis (described in Godfrey et al., 1993). These methods have been employed to estimate sexual dimorphism in hominoids (Kay, 1982a,b), australopithecines (Leutenegger and Shell, 1987), and subfossil lemurs (Godfrey et al., 1993), all of which present difficulties for estimating sexual dimorphism.

Division of a sample into two subsamples by the mean or median (referred to hereafter as the "mean" and "median" techniques following Godfrey et al., 1993) is the simplest approach to estimating dimorphism. The combined-sex sample is simply divided into two subsamples either about the sample mean or the median, and the ratio of the means of the newly created subsamples represents the estimate of dimorphism.
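The mean and median techniques described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function name `split_method_dimorphism` is hypothetical, and assigning values exactly equal to the cut point to the lower ("female") subsample is an assumption, since the text does not say how ties are handled.

```python
import numpy as np

def split_method_dimorphism(values, about="mean"):
    """Estimate dimorphism by dividing a pooled-sex sample about its
    mean or median and taking the ratio of the two subsample means."""
    x = np.asarray(values, dtype=float)
    cut = x.mean() if about == "mean" else np.median(x)
    upper = x[x > cut]    # treated as the larger sex
    lower = x[x <= cut]   # treated as the smaller sex (ties go here: an assumption)
    if upper.size == 0 or lower.size == 0:
        return 1.0        # all values identical, so no split is possible
    return upper.mean() / lower.mean()

# Hypothetical measurements with one large outlier.
sample = [8, 10, 12, 14, 30]
estimate_mean = split_method_dimorphism(sample, about="mean")
estimate_median = split_method_dimorphism(sample, about="median")
```

With a skewed sample such as this one, the two cut points differ, so the two techniques assign specimens to different subsamples and return different ratios.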
These methods assume that male and female distributions do not overlap, a situation that rarely occurs except when dimorphism is extreme. Criticism also may be leveled at these techniques for arbitrarily creating male and female means, thereby creating the impression of sexual dimorphism when in fact none may be present. Nevertheless, Godfrey et al. (1993) find that the mean method appears to be quite accurate, especially when dimorphism is substantial.

Using data from extant species, several studies report that coefficients of variation (CV) from pooled-sex samples are very highly correlated with sexual dimorphism as measured by a ratio of male to female means (Fleagle et al., 1980; Kay, 1982a,b; Leutenegger and Shell, 1987). These studies suggest that dimorphism in extinct species can be easily extrapolated from a simple regression equation between CVs and dimorphism. This relation makes intuitive sense. The CV is nothing more than the sample standard deviation divided by the sample mean (usually multiplied by 100 to express the ratio as a percentage). With increasing sexual dimorphism, the difference between male and female means increases, causing a proportional increase in the pooled-sex sample standard deviation. However, this technique has been strongly criticized on the grounds that, for several extant anthropoid species, CVs of dental dimensions from combined-sex samples are not necessarily higher than those of single-sex samples (Martin, 1983; Martin and Andrews, 1984; Vitzthum, 1990). While most of these observations are based on non-dimorphic measurements of teeth, they severely undermine confidence in the technique.

Finite mixture analysis (FMA) is described in Godfrey et al. (1993). Unlike the other three techniques, this technique is designed to demonstrate either the lack of dimorphism in a sample or the maximum dimorphism that could be present within a unimodal distribution.
Because dimorphism of up to 28% can be hidden within a single unimodal distribution, even for sample sizes as large as 100 (Godfrey et al., 1993), this method could prove very useful. FMA is based on the observation that unimodal distributions can be generated from the mixture of two normal distributions, but only within certain limits. The method quantifies the maximum separation of male and female means that can be contained within a unimodal distribution on the basis of the combined-sex sample range. To briefly summarize the method presented in Godfrey et al. (1993), within a given whole-sample range, a certain number of standard deviations, k, will on average occur. This value will vary depending on sample size and can be looked up in a table provided by Pearson (1932) (also reproduced in Godfrey et al., 1993). The maximum number of subsample (male and female) standard deviations that can be contained in the total sample's observed range and still produce a unimodal distribution is derived from k. The inverse of this value is equivalent to the percentage of the observed whole-sample range comprising the difference between the mean of the whole sample and either of the subsample means. Therefore, multiplying this result by the observed sample range yields the distance of the whole-sample mean from either subsample mean. This value is added or subtracted from the whole-sample mean to yield the means of the two subsamples, which are then used to calculate the maximum dimorphism that could be contained in the sample.

All of these methods rely to some extent on the assumption that as dimorphism increases, the pooled-sex sample variability increases in proportion to the difference between the male and female means. Because of this, each method is potentially confounded by fluctuations in sex ratio, small sample sizes, and fluctuations in intrasexual variability.
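How the pooled-sex CV depends on sex ratio and intrasexual variability can be illustrated with the standard variance formula for a mixture of two populations. The sketch below is not part of the original analysis: the function name `pooled_cv` and its parameters are hypothetical, it describes infinite populations rather than finite samples, and it assumes that both sexes share the same within-sex CV.

```python
import math

def pooled_cv(dimorphism, within_cv, prop_male=0.5, female_mean=1.0):
    """Population CV (percent) of a pooled-sex mixture.

    dimorphism : ratio of male to female means
    within_cv  : within-sex CV in percent, assumed equal for both sexes
    prop_male  : proportion of males in the mixture
    """
    male_mean = dimorphism * female_mean
    sd_m = within_cv / 100.0 * male_mean
    sd_f = within_cv / 100.0 * female_mean
    p = prop_male
    mean = p * male_mean + (1 - p) * female_mean
    # Mixture variance = weighted within-sex variance + between-sex component.
    var = (p * sd_m ** 2 + (1 - p) * sd_f ** 2
           + p * (1 - p) * (male_mean - female_mean) ** 2)
    return 100.0 * math.sqrt(var) / mean

balanced = pooled_cv(1.5, 5.5, prop_male=0.5)
mostly_males = pooled_cv(1.5, 5.5, prop_male=0.8)
mostly_females = pooled_cv(1.5, 5.5, prop_male=0.2)
```

Under these assumptions, the between-sex term is weighted by p(1 - p), which is largest at a balanced sex ratio. Because the pooled mean also shifts toward the more common sex, a male-biased mixture and a female-biased mixture of the same dimorphism need not give the same CV.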
Deviations from a balanced sex ratio should lower pooled-sex sample variability, since the pooled-sex sample mean will be closer to the mean of one sex, and therefore more individuals will be closer to the pooled-sex mean. Small sample sizes increase the likelihood that pooled-sex variation will be influenced by sampling error and increase the likelihood that the sex ratio will be imbalanced (or even that a sample is composed of only one sex!). Finally, since the pooled-sex sample standard deviation is a function of both intrasexual variation and the difference between male and female means, increased intrasexual variability can potentially swamp out variation due to a difference between the male and female means, especially when dimorphism is relatively slight.

While some analysis of the influence of sample size, imbalance in the sample sex ratio, and high intrasexual variability on each method has been provided (Kay, 1982a; Godfrey et al., 1993), these studies have used data from a limited number of extant species and have not actually quantified and compared the error of each method with variation in each of these three factors. Without a direct comparison of the methods, it is difficult to decide under what conditions which, if any, works best.

Computer modeling offers an easy way to examine the influence of confounding factors on the accuracy of each method. By demonstrating the advantages and disadvantages of each technique using artificial data, the method likely to yield the best results can then be selected depending on the actual nature of the data that one is likely to encounter in the fossil record. For example, one method may work better than others at small sample sizes, or may be relatively robust to changes in the ratio of males to females in a sample.

This analysis presents the results of a computer modeling experiment which compares the accuracy of these four methods under the influence of variation in sample size, intrasexual variability, and the ratio of males to females in a sample.

TABLE 1. Procedure for generating simulations
Step 1: Set the initial population parameters: 1) male and female CVs; 2) female mean
Step 2: Set the male mean, based on the female mean, so that dimorphism equals a set value
Step 3: Randomly sample a specific number of males and females from the (infinite) population with parameters set in steps 1 and 2
Step 4: Calculate the actual sample dimorphism
Step 5: Calculate estimates of dimorphism based on the CV, FMA, mean, and median methods
Step 6: Repeat steps 3-5 100 times and then calculate the averages and standard deviations of the dimorphism estimates
Step 7: Go back to step 2, recalculating the male mean to achieve a new level of dimorphism, and repeat steps 3-6; do this for as many levels of dimorphism as desired

METHODS

The computer model and experimental design

Table 1 provides an outline of the analytical procedure. A computer model was used to randomly generate gaussian samples of males and females using the algorithm of Box and Mueller (1958), modified so that sample means and variances could be selected by the user. Generation of these samples was analogous to randomly selecting a subsample from an infinite sample of known mean and variance. Therefore, while each sample generated had a unique mean and variance, the means and variances of a large number of such samples were distributed about the user-defined mean and variance.

For each experiment, a series of 100 samples was generated for 10 levels of dimorphism, ranging from 1.0 to 1.9 in increments of 0.1 (expressing the ratio of male to female means, a convention used throughout the text).
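The procedure in Table 1 can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the original program: it draws normal deviates with NumPy's generator instead of the Box and Mueller algorithm, the names `simulate_level` and `run_experiment` are hypothetical, the female mean of 10.0 is an arbitrary choice, and only the mean method is computed in step 5 to keep the example short.

```python
import numpy as np

def simulate_level(true_dimorphism, n_males=5, n_females=5, within_cv=5.5,
                   female_mean=10.0, n_samples=100, seed=0):
    """Run steps 2-6 of Table 1 for one level of true dimorphism."""
    rng = np.random.default_rng(seed)
    male_mean = true_dimorphism * female_mean            # step 2
    male_sd = within_cv / 100.0 * male_mean              # step 1: equal CVs
    female_sd = within_cv / 100.0 * female_mean
    observed, mean_method = [], []
    for _ in range(n_samples):                           # step 6: repeat
        males = rng.normal(male_mean, male_sd, n_males)  # step 3
        females = rng.normal(female_mean, female_sd, n_females)
        observed.append(males.mean() / females.mean())   # step 4
        pooled = np.concatenate([males, females])
        cut = pooled.mean()                              # step 5 (mean method only)
        mean_method.append(pooled[pooled > cut].mean()
                           / pooled[pooled <= cut].mean())
    return {
        "observed": (np.mean(observed), np.std(observed, ddof=1)),
        "mean_method": (np.mean(mean_method), np.std(mean_method, ddof=1)),
    }

def run_experiment(**kwargs):
    """Step 7: repeat for true dimorphism of 1.0 to 1.9 in steps of 0.1."""
    levels = [round(1.0 + 0.1 * i, 1) for i in range(10)]
    return {level: simulate_level(level, **kwargs) for level in levels}

results = run_experiment()
```

Each entry of `results` holds the average and standard deviation of the observed dimorphism and of the mean-method estimate for one level, which is the layout used for the tables of results. Exact values depend on the random seed and will not reproduce the published tables digit for digit.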
Thus, for an experiment, 100 samples would be selected from an infinite population with sexual dimorphism of 1.0; then 100 more samples would be selected from an infinite population with sexual dimorphism of 1.1, and so on until a total of 1,000 samples was generated. Because the samples were randomly generated, dimorphism was not exactly 1.0, 1.1, etc., but instead the dimorphism of the samples was normally distributed around the "true" dimorphism (that is, the dimorphism specified by the user).

While dimorphism of 1.9 will usually produce bimodal distributions with little or no overlap between male and female means (obviating the use of any of the techniques examined here), this range of dimorphism was necessary to accurately characterize the behavior of all four methods, especially the CV method. In fact, interspecific variation in dimorphism in morphological characters is highly variable. Thus, canine dimorphism often exceeds 1.9 in magnitude, while skull length rarely exceeds about 1.4 (Godfrey, personal communication). For most applications to data in the fossil record, the techniques analyzed here will be useful when dimorphism is less than about 1.4 in magnitude, though this upper boundary depends on sample size, sex ratio, and the level of intrasexual variability present in a particular sample.

For each sample the program calculated the actual sexual dimorphism (X̄male/X̄female) and the sexual dimorphism estimated using each of the four methods (CV, FMA, mean, and median). The means and standard deviations of the dimorphism estimates were then calculated and tabulated for each set of 100 samples at each level of true dimorphism.

A number of different experiments were performed for different sample sizes, sex ratios, and intrasexual variation. For each of these three factors, a series of experiments was performed holding the other two factors constant. For example, four sets of 1,000 samples corresponding to sample sizes of 10, 20, 30, and 40 individuals were generated, all with balanced sex ratios and intrasexual CVs of 5.5. Next, four more such sets were generated, but this time with intrasexual CVs of 7.0. This procedure was repeated for various sex ratios and levels of intrasexual variability. In all experiments, the variability of males and females was kept equal.

Finally, the program was modified so that the sex ratio of each population was randomly selected, mimicking the situation in the fossil record where the sex of individuals is unknown. A series of experiments was run with this program at different sample sizes and levels of intrasexual variation. Sample size was not allowed to randomly fluctuate because this is a known value in the fossil samples. Intrasexual variation was not allowed to randomly fluctuate because this factor is strongly dependent on the variable being measured and is thus highly sample-dependent. For example, teeth are commonly known to be much less variable than other skeletal measurements (Yablokov, 1974; Simpson et al., 1960), while variability in the canine teeth fluctuates substantially among species (Gingerich, 1974; Plavcan, 1990).

Extrapolating dimorphism with CVs

Before performing the simulations, it was necessary to reformulate the way CVs are used to estimate sexual dimorphism. In the past, dimorphism in a fossil sample was estimated by extrapolation from a simple linear regression between dimorphism and CVs calculated from a limited number of species (Kay, 1982a,b; Leutenegger and Shell, 1987). Output from initial experiments clearly indicated that the relation between the CV and dimorphism is not linear. Ln transformation of the estimates of dimorphism produced a linear relation with an extremely high correlation (Fig. 2a). However, variation in sex ratio and intrasexual variability produced differences in the relation between the CV and dimorphism. Imbalances in the sex ratio of the sample produce different slopes in the relation between the CV and ln-transformed dimorphism (Fig. 2b). The magnitude of the change in slopes depends both on the degree of imbalance of the sex ratio and on whether more males or more females are present in the sample. With increasing intrasexual variability, the intercept of the relation between the CV and ln-transformed dimorphism is reduced (shifting the scatter of points to the right), and the correlation between the CV and ln-transformed dimorphism is reduced (increasing the scatter of points) (Fig. 2c).

The equation used to extrapolate dimorphism from CVs obviously depends on the degree of intrasexual variability and the exact sex ratio of the sample. Unfortunately, neither of these parameters can be known for fossil samples requiring the use of this technique to estimate dimorphism. The only practical approach is to calculate the equation assuming a balanced sex ratio and the lowest level of intrasexual variability that can be reasonably expected in the sample. Since most fossil remains are teeth, this analysis used the average variability in the postcanine teeth of primates, which corresponds to a CV of roughly 5.5. With these assumptions, a regression between CVs and ln-transformed dimorphism from a computer experiment produces the following equation:

Y = -0.047 + 0.0214 * X

where Y is ln-transformed dimorphism and X is the combined-sex CV.
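As a worked illustration, the regression equation above can be applied to a pooled-sex sample in Python. The coefficients are those given in the text; the function names `dimorphism_from_cv` and `cv_method_dimorphism` are hypothetical, and the use of the n - 1 sample standard deviation is an assumption, since the text does not specify the estimator.

```python
import math
import statistics

def dimorphism_from_cv(cv, intercept=-0.047, slope=0.0214):
    """Back-transform Y = intercept + slope * X, where Y is ln(dimorphism)
    and X is the combined-sex CV in percent."""
    return math.exp(intercept + slope * cv)

def cv_method_dimorphism(values):
    """Estimate dimorphism from the CV of a pooled-sex sample."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)   # n - 1 denominator: an assumption
    cv = 100.0 * sd / mean
    return dimorphism_from_cv(cv)

# A pooled CV equal to the assumed intrasexual CV of 5.5 still maps to a
# ratio slightly above 1, so the method cannot return exactly 1.0 here.
baseline = dimorphism_from_cv(5.5)
estimate = cv_method_dimorphism([9, 10, 11, 13, 14, 15])
```

Because the equation was fitted under a balanced sex ratio and an intrasexual CV of 5.5, estimates for samples that depart from those conditions inherit the biases described in the text.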
This equation was used to calculate CV-based estimates of sexual dimorphism in the experiments.

Fig. 2. Bivariate plots of ln-transformed dimorphism vs. combined-sex coefficients of variation (CVs) from random sampling experiments. All experiments were for sample sizes of ten. Bottom panel (a) demonstrates the high correlation and linear relation when sex ratios are balanced and intrasexual variability is low (male and female CVs = 5.5). Note that the cluster of points at the bottom left of each plot is the random scatter generated when dimorphism is 1.00. Middle panel (b) compares output from samples with eight males and two females (higher slope) against samples with two males and eight females (lower slope). As for panel a, male and female CVs were both 5.5. Top panel (c) shows the effect of increased sample variability (male and female CVs = 10.0). Note the increased scatter of points and the larger scatter of the "bulb" reflecting greater intrasexual variability. Horizontal axis: Combined-sex CV.

RESULTS

Basic comparison (balanced sex ratio, low intrasexual variability, constant sample size)

With balanced sex ratios and low intrasexual variability, the mean and median methods provide the most accurate estimates of sexual dimorphism overall (Table 2). While all four methods tend to overestimate dimorphism when the true dimorphism is low (<1.2), the CV and FMA methods overestimate dimorphism less than either the mean or median methods. When true dimorphism is 1.2 or greater, both the mean and median methods are quite accurate, while the CV method tends to slightly overestimate dimorphism. Importantly, the FMA method begins to substantially underestimate dimorphism when the true dimorphism is greater than 1.1 (under these conditions).
TABLE 2. Comparison of the averages and standard deviations of the actual dimorphism vs. estimated dimorphism using the four proposed techniques for populations with balanced sex ratios, male and female CVs of 5.5, and combined-sex sample size of ten

True(1)    Observed      CV            FMA           Mean          Median
dimorphism Avg   sd      Avg   sd      Avg   sd      Avg   sd      Avg   sd
1.0        1.00  0.035   1.07  0.026   1.08  0.021   1.09  0.021   1.09  0.020
1.1        1.10  0.039   1.11  0.039   1.11  0.027   1.13  0.032   1.12  0.031
1.2        1.20  0.043   1.20  0.045   1.15  0.028   1.21  0.040   1.20  0.041
1.3        1.30  0.044   1.31  0.048   1.20  0.031   1.30  0.043   1.30  0.044
1.4        1.40  0.053   1.41  0.057   1.23  0.029   1.40  0.053   1.40  0.053
1.5        1.50  0.046   1.51  0.048   1.27  0.028   1.50  0.046   1.50  0.046
1.6        1.61  0.055   1.63  0.061   1.32  0.035   1.61  0.055   1.61  0.055
1.7        1.70  0.054   1.73  0.057   1.34  0.029   1.70  0.054   1.70  0.054
1.8        1.81  0.064   1.85  0.068   1.39  0.037   1.81  0.064   1.81  0.064
1.9        1.90  0.058   1.94  0.061   1.41  0.033   1.90  0.058   1.90  0.058

(1) "True" is the level of dimorphism used to generate the populations and is thus not a mean. "Observed" is the actual dimorphism of the randomly generated populations. All means are based on 100 populations per level of true dimorphism.

At low levels of true dimorphism (less than 1.4), standard deviations of dimorphism estimates from the mean, median, and FMA methods are lower than those of the observed dimorphism, indicating that the estimates of each method are actually less variable than the observed dimorphism (Table 2). At higher levels of true dimorphism (1.4 or greater), the standard deviations of the mean and median methods are identical to those of the observed dimorphism. This is because, under the parameters defined in this experiment, there is virtually no overlap in male and female distributions.
Finally, standard deviations of the dimorphism estimates from the CV method are slightly higher than those of the observed dimorphism, except when true dimorphism is less than or equal to 1.1.

Sample size

Standard deviations of dimorphism estimates from all methods decrease substantially with increasing sample sizes. However, no method offers any apparent advantage over the others with increasing sample sizes.

Intrasexual variation

The magnitude of intrasexual variability is critical to the accuracy of all methods, though overall the mean and median methods are least affected by high intrasexual variability. With low intrasexual variability (male and female CV = 5.5), standard deviations of dimorphism estimates are relatively low at all levels of dimorphism, and both the mean and median methods tend to overestimate dimorphism at low levels of true dimorphism (Table 2). As intrasexual variation increases, each method not only strongly overestimates dimorphism when the true dimorphism is low, but also overestimates dimorphism at progressively higher levels of true dimorphism. For example, when male and female population CVs equal 14.0, the mean method overestimates dimorphism even when true dimorphism equals 1.5 (Table 3).

With increased intrasexual variability, the CV method also consistently overestimates dimorphism at all levels. This follows from the particular equation used to estimate dimorphism (see above). Were the magnitude of intrasexual variation known, the equation could be adjusted to provide more accurate estimates.

When intrasexual variability is high, the FMA method overestimates low levels of dimorphism less than the other methods. At the level of intrasexual variability presented in Table 3, the FMA method begins to actually underestimate dimorphism when the true dimorphism exceeds 1.3.
Comparing this to Table 2, where the FMA method begins to underestimate dimorphism when the true dimorphism exceeds 1.1, it is clear that the exact level of dimorphism at which the FMA method begins to underestimate rather than overestimate dimorphism is proportional to the degree of intrasexual variability.

As for the basic comparison (Table 2), standard deviations of dimorphism estimates from the mean, median, and FMA methods are lower than those of the observed dimorphism. However, at the higher level of intrasexual variability, standard deviations of the mean and median methods remain lower than those of the observed dimorphism at higher levels of true dimorphism. This is because, with increased intrasexual variability, overlap between male and female distributions persists at higher levels of dimorphism. Finally, the standard deviations of dimorphism estimates from the CV method are consistently higher than those of the observed dimorphism at all levels of true dimorphism.

TABLE 3. Comparison of the averages and standard deviations of the actual dimorphism vs. estimated dimorphism for populations with increased intrasexual variation (male and female CVs = 14.0, balanced sex ratios, and combined-sex sample size of ten)

True(1)    Observed      CV            FMA           Mean          Median
dimorphism Avg   sd      Avg   sd      Avg   sd      Avg   sd      Avg   sd
1.0        1.00  0.093   1.28  0.094   1.22  0.067   1.25  0.073   1.24  0.070
1.1        1.10  0.093   1.28  0.094   1.22  0.067   1.25  0.076   1.23  0.069
1.2        1.20  0.095   1.36  0.106   1.27  0.074   1.30  0.081   1.29  0.081
1.3        1.29  0.130   1.43  0.127   1.31  0.075   1.37  0.103   1.35  0.102
1.4        1.42  0.128   1.55  0.148   1.36  0.091   1.47  0.118   1.44  0.115
1.5        1.51  0.139   1.64  0.148   1.40  0.084   1.53  0.129   1.52  0.133
1.6        1.60  0.134   1.73  0.153   1.44  0.104   1.62  0.122   1.60  0.129
1.7        1.71  0.157   1.85  0.181   1.49  0.106   1.73  0.152   1.72  0.155
1.8        1.80  0.168   1.93  0.179   1.52  0.108   1.80  0.158   1.80  0.164
1.9        1.90  0.167   2.05  0.188   1.56  0.108   1.91  0.167   1.90  0.167

(1) "True" is the level of dimorphism used to generate the populations and is thus not a mean. "Observed" is the actual dimorphism of the randomly generated populations. All means are based on 100 populations per level of true dimorphism.

Sex ratio

Bias away from a balanced sex ratio, like increased intrasexual variation, greatly affects estimates of dimorphism using any of the methods. However, the mean method is much less affected than any other. For example, with eight males and two females present in a sample of ten individuals, the mean method slightly overestimates dimorphism when true dimorphism is low (≤ 1.1) and slightly underestimates dimorphism when true dimorphism is higher (Table 4). The CV, FMA, and median methods slightly overestimate dimorphism when true dimorphism is low and dramatically underestimate dimorphism with increasing true dimorphism, with the FMA and median methods performing worse than the CV method.

Interestingly, the CV-based estimates of dimorphism are much more accurate when more females are present in a sample than when more males are present (Table 5). With more females in a sample, the CV tends to overestimate dimorphism at all levels of true dimorphism, and the deviations of the estimates from the true dimorphism are not as great as when more males are in the sample.

With unbalanced sex ratios, the standard deviations of dimorphism estimates from the CV, FMA, and median methods are consistently lower than those of the observed dimorphism (Table 4). Standard deviations of dimorphism estimates from the mean method are less than those of the observed dimorphism at low levels of true dimorphism (less than 1.3) and greater at higher levels of true dimorphism.

TABLE 4. Comparison of the averages and standard deviations of the actual dimorphism vs. estimated dimorphism for populations with unbalanced sex ratios (eight males and two females per population, male and female CVs = 5.5)

True(1)    Observed      CV            FMA           Mean          Median
dimorphism Avg   sd      Avg   sd      Avg   sd      Avg   sd      Avg   sd
1.0        1.01  0.043   1.07  0.030   1.08  0.022   1.09  0.025   1.09  0.024
1.1        1.10  0.051   1.09  0.034   1.10  0.025   1.11  0.028   1.10  0.025
1.2        1.21  0.060   1.16  0.041   1.14  0.029   1.17  0.043   1.14  0.028
1.3        1.30  0.060   1.22  0.040   1.18  0.027   1.23  0.062   1.17  0.027
1.4        1.40  0.059   1.28  0.038   1.21  0.025   1.31  0.072   1.20  0.025
1.5        1.50  0.065   1.34  0.038   1.23  0.027   1.41  0.099   1.23  0.027
1.6        1.60  0.062   1.40  0.039   1.27  0.030   1.50  0.114   1.26  0.031
1.7        1.69  0.072   1.45  0.038   1.29  0.027   1.59  0.143   1.28  0.027
1.8        1.80  0.078   1.50  0.040   1.31  0.028   1.72  0.154   1.30  0.029
1.9        1.92  0.092   1.56  0.043   1.34  0.028   1.82  0.176   1.32  0.033

(1) "True" is the level of dimorphism used to generate the populations and is thus not a mean. "Observed" is the actual dimorphism of the randomly generated populations. All means are based on 100 populations per level of true dimorphism.

TABLE 5. Comparison of the averages of estimates of sexual dimorphism from the CV method when sex ratio is biased toward more males (eight males, two females) and toward more females (eight females, two males)

True       Eight males   Eight females
dimorphism Avg   sd      Avg   sd
1.0        1.07  0.030   1.07  0.030
1.1        1.09  0.034   1.11  0.039
1.2        1.16  0.041   1.17  0.049
1.3        1.22  0.040   1.26  0.058
1.4        1.28  0.038   1.35  0.053
1.5        1.34  0.038   1.46  0.070
1.6        1.40  0.039   1.57  0.075
1.7        1.45  0.038   1.68  0.080
1.8        1.50  0.040   1.78  0.103
1.9        1.56  0.043   1.91  0.099

Random fluctuation in sex ratio

Results from experiments where the sex ratio of each sample was randomly selected demonstrate that, overall, the mean method provides the best estimates of dimorphism (Table 6). This result is repeated at all levels of intrasexual variability and at all sample sizes.
Surprisingly, average CV-based estimates of dimorphism are nearly as good as those from the mean method, though standard deviations of dimorphism estimates are consistently higher than those from the mean method. Neither the FMA nor the median method provides better estimates of dimorphism than the CV or mean methods at any level of true dimorphism.

DISCUSSION

Two features of the various methods need to be evaluated to determine which is best for assessing dimorphism in fossils. First, which method generally provides estimates of dimorphism that are close to the actual values of dimorphism present in a sample? Second, where dimorphism is slight, which method provides a reliable upper limit to dimorphism for samples where dimorphism is relatively low?

The results of this study demonstrate that, among the techniques examined here, the best way of estimating dimorphism in a sample where the sex of individuals is unknown is the simplest: division of the sample about the mean into two subsamples. Compared to other methods, the mean method is the most accurate even when intrasexual variability is high and when sex ratios are strongly biased. At low levels of dimorphism (less than about 1.2), the mean method usually overestimates dimorphism a bit more than the other three methods and seems most appropriate for setting a conservative upper limit on the amount of dimorphism that could be present in a sample. Because such estimates represent upper limits, care should be taken not to interpret these estimates as evidence for any particular degree of dimorphism.

The CV method provides estimates of sexual dimorphism that are almost as accurate as those of the mean method, although it is more sensitive to departures from a balanced sex ratio and increasing intrasexual variability.
The accuracy of the CV method is contingent upon deriving a regression equation that appropriately accounts for variation in sex ratio and intrasexual variability (assuming that some characters are more variable than others). Potentially, if the limits of intrasexual variability in a character can be established in extant species, approximate empirical "confidence intervals" could be derived by estimating dimorphism from regressions derived for the upper and lower limits of expected variability. The utility of such a procedure is limited, though, by the decreasing correlation between the CV and dimorphism at progressively higher levels of sexual dimorphism with increasing intrasexual variability. This is clear in Figure 2, where the scatter of points about the lower right portion of the plots increases dramatically with increasing intrasexual variability. Furthermore, uncertainty about whether levels of variability in extant species truly reflect the situation in the fossil record precludes assigning meaningful probabilities to such confidence intervals.

TABLE 6. Comparison of the averages and standard deviations of the actual dimorphism vs. estimated dimorphism for populations with randomly selected sex ratios (male and female CVs = 5.5, combined-sex sample size = 10)

True¹         Observed         CV               FMA              Mean             Median
dimorphism    Average  sd      Average  sd      Average  sd      Average  sd      Average  sd
1.0           0.99     0.034   1.07     0.030   1.08     0.023   1.09     0.024   1.09     0.023
1.1           1.10     0.037   1.11     0.038   1.11     0.027   1.12     0.031   1.12     0.030
1.2           1.19     0.049   1.19     0.047   1.15     0.030   1.19     0.045   1.18     0.040
1.3           1.31     0.044   1.30     0.052   1.20     0.031   1.30     0.056   1.26     0.056
1.4           1.40     0.049   1.39     0.066   1.24     0.027   1.39     0.070   1.33     0.072
1.5           1.50     0.053   1.48     0.075   1.27     0.035   1.48     0.074   1.39     0.090
1.6           1.61     0.067   1.58     0.106   1.31     0.043   1.57     0.105   1.44     0.123
1.7           1.70     0.066   1.68     0.121   1.34     0.046   1.68     0.094   1.53     0.144
1.8           1.80     0.065   1.80     0.119   1.39     0.050   1.80     0.069   1.61     0.151
1.9           1.90     0.066   1.89     0.145   1.43     0.066   1.90     0.072   1.65     0.169

¹ "True" is the level of dimorphism used to generate the populations, and is thus not a mean. "Observed" is the actual dimorphism of the randomly generated populations. All means are based on 100 populations per level of true dimorphism.

At first glance, the results for the FMA method suggest that it is of little utility for estimating dimorphism in fossil samples. The results for the FMA method are not surprising, though, since with increasing dimorphism the assumptions of the method are violated. The FMA method is potentially useful under the circumstances for which it was designed: estimating the maximum dimorphism that could be present in a unimodally distributed population. Unimodal distributions can be generated from populations showing dimorphism of approximately 1.3 (Godfrey et al., 1993) or more when sample sizes are small. Under all conditions, the mean method provides, on average, slightly higher estimates of dimorphism when true dimorphism is less than 1.3 and so seems to be better suited to the purpose of setting an upper limit on dimorphism in a sample. However, all methods overestimate dimorphism in these cases, and the lower estimates of the FMA method in fact are more accurate. The problem is that with decreasing intrasexual variability, the FMA method actually underestimates dimorphism at progressively lower levels of true dimorphism. Thus, for characters that typically show relatively low variability, such as the teeth, it must be kept in mind that the FMA method can potentially underestimate rather than overestimate dimorphism. Since intrasexual variability cannot be known where the sex of individuals cannot be reliably determined, this technique should not be used alone.

As for the CV method, the FMA method can be adjusted to fit different assumptions about intrasexual variability and sex ratios (Godfrey et al., 1993). Potentially, this could allow one to set approximate confidence intervals on the estimate of maximum dimorphism in a population, though such a practice is potentially subject to the same problems as the CV technique.

The median method has little to recommend it in comparison to the other three methods. Under the best of circumstances (low intrasexual variability, balanced sex ratio) it performs nearly as well as the mean method, but it is much more sensitive to departures from a balanced sex ratio and increased intrasexual variability.

It must be noted that with increasing sample sizes, an unbalanced sex ratio becomes less of a problem. The likelihood of drawing any particular sex ratio in a sample is easily calculated from a binomial probability distribution. Assuming an equal sex ratio in the original population and an equal probability of males and females being preserved (neither assumption is necessarily true [Oxnard, 1987]), one can calculate that at a sample size of ten, there is a 95% probability that the sex ratio should not exceed about 4:1 in favor of either sex. However, at a sample size of 20, this ratio falls to about 2.3:1. Conversely, for samples of fewer than six individuals, the chances are better than 5% that a sample will be composed of only one sex. This latter point should not be forgotten when dealing with small samples, where estimates of dimorphism could potentially be calculated on small samples composed of only one sex!

It is interesting to note that for the mean, median, and FMA methods, standard deviations of dimorphism estimates tend to be lower than those of observed dimorphism, especially when the true dimorphism is low. This means that, even though these methods tend to overestimate the true dimorphism (at low levels of true dimorphism), the estimates themselves are actually less variable than estimates generated when the sex of individual specimens is known. The only exception to this occurs when the mean method is used to estimate dimorphism when the sex ratio is unbalanced, but this is only likely to be a substantial problem with small sample sizes (less than about ten). Thus, even when a sample of fossils contains a few individuals that can be sexed, these techniques are still useful in providing supplemental upper estimates of dimorphism if a large number of specimens that cannot be identified by sex can be used. For example, if a collection consists of 20 jaws containing the first molar tooth, but only a few of these jaws can be sexed, variability-based estimates could be useful for setting an upper limit on the amount of sexual dimorphism that could be present in the molar teeth using the entire sample rather than just a few individuals. Thus, a variability-based estimate of dimorphism can help control for sampling error in the estimate of dimorphism based only on the sexed individuals.

Finally, all of the methods examined here strongly overestimate sexual dimorphism when intrasexual variability is large and the actual population dimorphism is low. Under these conditions, it is unlikely that any method that uses sample variation can accurately estimate low to moderate levels of dimorphism (from 1.0 to roughly 1.3), since the total sample variation due to the separation between male and female means is quickly overwhelmed by that from intrasexual variability. With more variable characters such as limb bone dimensions, the problem of estimating dimorphism with any of these methods becomes worse. Therefore, before estimating dimorphism in any character in a fossil sample, the potential intraspecific variability must be considered, presumably by examining variability in the trait within a series of extant species. For many skeletal traits in samples where low to moderate levels of dimorphism are suspected, intrasexual variability may preclude the use of any technique to provide anything other than an estimate of the maximum amount of sexual dimorphism that could be present in a sample.

CONCLUSIONS

The problem of estimating sexual dimorphism in small fossil samples where the sex of individuals cannot be reliably determined has no easy solution. Even though the mean method is overall the best of the four techniques investigated here, in fact none of the methods provides reliable estimates of dimorphism when true sexual dimorphism is low. All four methods are confounded by fluctuation in sex ratio and intrasexual variability. While it is possible to adjust the FMA and CV techniques to account for fluctuation in these parameters, the very fact that the sex of individuals is not known means that neither the sex ratio nor intrasexual variability can be known a priori. When sexual dimorphism is low such that sample distributions appear to be unimodal, the best that can be hoped for is to estimate the maximum amount of dimorphism that could be present in the sample (Godfrey et al., 1993).

ACKNOWLEDGMENTS

I thank Richard F. Kay for advice and help. Gene Albrecht, Dana Cope, Rebecca German, Laurie Godfrey, Bill Hylander, Jay Kelley, Rick Madden, Mike Sutherland, and two anonymous reviewers provided helpful comments and discussion. Dave Hertwick provided assistance with computers. This research was supported by NSF dissertation improvement grant BNS 8814060 and NIDR postdoctoral fellowship 5 F32 DE05605-02.

LITERATURE CITED

Andrews P (1983) The natural history of Sivapithecus. In RL Ciochon and RS Corruccini (eds.): New Interpretations of Ape and Human Ancestry. New York: Plenum, pp. 441-463.
Box GEP, and Mueller ME (1958) A note on the generation of random normal deviates. Ann. Math. Statist. 29:610-611.
Clutton-Brock TH, Harvey PH, and Rudder B (1977) Sexual dimorphism, socionomic sex ratio, and body weight in primates. Nature 269:191-195.
Cope DA (1989) Systematic Variation in Cercopithecus Dental Samples. PhD dissertation, The University of Texas at Austin.
Cope DA (1993) Measures of dental variation as indicators of multiple taxa in samples of sympatric Cercopithecus species. In WH Kimbel and LB Martin (eds.): Species, Species Concepts, and Primate Evolution. New York: Plenum, pp. 211-237.
Fleagle JG, Kay RF, and Simons EL (1980) Sexual dimorphism in early anthropoids. Nature 287:328-330.
Gingerich PD (1974) Size variability in the teeth of living mammals and the diagnosis of closely related sympatric fossil species. J. Paleontol. 48:895-903.
Godfrey LR, Lyon SK, and Sutherland MR (1993) Sexual dimorphism in large-bodied primates: The case of the subfossil lemurs. Am. J. Phys. Anthropol. 90:315-334.
Greenfield LO (1992) Relative canine size, behavior, and diet in male ceboids. J. Hum. Evol. 23:469-480.
Harvey PH, Kavanagh M, and Clutton-Brock TH (1978) Sexual dimorphism in primate teeth. J. Zool. Lond. 186:475-485.
Kay RF (1982a) Sexual dimorphism in the Ramapithecinae. Proc. Natl. Acad. Sci. U.S.A. 79:209-212.
Kay RF (1982b) Sivapithecus simonsi, a new species of Miocene hominoid, with comments on the phylogenetic status of Ramapithecinae. Int. J. Primatol. 3:113-173.
Kay RF, Plavcan JM, Glander KE, and Wright PC (1988) Sexual selection and canine dimorphism in New World monkeys. Am. J.
Phys. Anthropol. 77:385-397.
Kelley J (1993) Taxonomic implications of sexual dimorphism in Lufengpithecus. In WH Kimbel and LB Martin (eds.): Species, Species Concepts, and Primate Evolution. New York: Plenum, pp. 429-458.
Krishtalka L, Stucky RK, and Beard KC (1990) The earliest fossil evidence for sexual dimorphism in Primates. Proc. Natl. Acad. Sci. U.S.A. 87:5223-5226.
Leutenegger W, and Kelley JT (1977) Relationship of sexual dimorphism in canine size and body size to social, behavioral, and ecological correlates in anthropoid primates. Primates 18:117-136.
Leutenegger W, and Shell B (1987) Variability and sexual dimorphism in canine size of Australopithecus and extant hominoids. J. Hum. Evol. 16:359-367.
Martin LB (1983) The Relationships of Later Miocene Hominoidea. PhD thesis, University College London.
Martin LB, and Andrews P (1984) The phylogenetic position of Graecopithecus freybergi Koenigswald. Cour. Forsch. Senken. 69:25-40.
Martin LB, and Andrews P (1993) Species recognition in Middle Miocene hominoids. In WH Kimbel and LB Martin (eds.): Species, Species Concepts, and Primate Evolution. New York: Plenum, pp. 393-427.
McHenry HM (1991) Sexual dimorphism in Australopithecus afarensis. J. Hum. Evol. 20:21-32.
Oxnard CE (1987) Fossils, Teeth and Sex. Seattle: University of Washington Press.
Pearson E (1932) The percentage limits for the distribution of range in samples from a normal population. Biometrika 24:404-417.
Plavcan JM (1990) Sexual dimorphism in the dentition of extant anthropoid primates. PhD dissertation, Duke University.
Plavcan JM (1993) Catarrhine dental variability and species recognition in the fossil record. In WH Kimbel and LB Martin (eds.): Species, Species Concepts, and Primate Evolution. New York: Plenum, pp. 239-263.
Plavcan JM, and van Schaik CP (1992) Intrasexual competition and canine dimorphism in anthropoid primates. Am. J. Phys. Anthropol. 87:461-477.
Simpson GG, Roe A, and Lewontin RC (1960) Quantitative Zoology. New York: Harcourt Brace and Co.
Teaford MF, Walker A, and Mugaisi GS (1993) Species discrimination in Proconsul from Rusinga and Mfangano islands, Kenya. In WH Kimbel and LB Martin (eds.): Species, Species Concepts, and Primate Evolution. New York: Plenum, pp. 373-392.
Vitzthum VJ (1990) Odontometric variation within and between taxonomic levels of Cercopithecidae: Implications for interpretations of fossil samples. Hum. Evol. 5:359-374.
Wu R, and Oxnard C (1983) Ramapithecus and Sivapithecus from China and some implications for higher primate evolution. Am. J. Primatol. 5:303-344.
Yablakov AV (1974) Variability in Mammals. New Delhi: Amerind Publishing Co.
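
The binomial argument in the Discussion can be checked directly. That argument gives a 95% probability that the sex ratio in a sample of ten does not exceed about 4:1, a limit that falls to about 2.3:1 at a sample size of 20, and a better than 5% chance of a single-sex sample when fewer than six individuals are available. The Python sketch below is illustrative only. It assumes an equal probability of drawing either sex, and it interprets the "95%" statement as the narrowest symmetric range of male counts whose total probability is at least 0.95. That reading, and the function names, are assumptions rather than anything specified in the text.

```python
from math import comb


def count_probability(n, k, p=0.5):
    """Binomial probability of drawing exactly k males in a sample of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def widest_ratio_within(n, coverage=0.95):
    """Most skewed split inside the narrowest symmetric range of counts
    whose total probability reaches `coverage`.

    Returns (major, minor, probability), where major:minor is the most
    extreme sex ratio in that range (in favor of either sex).
    """
    # Start from the most balanced split and widen the range symmetrically.
    for minor in range(n // 2, -1, -1):
        prob = sum(count_probability(n, k) for k in range(minor, n - minor + 1))
        if prob >= coverage:
            return n - minor, minor, prob
    return n, 0, 1.0  # fallback; the full range always has probability 1


def single_sex_probability(n):
    """Probability that all n specimens are of one sex (either sex)."""
    return 2 * 0.5**n


if __name__ == "__main__":
    for n in (10, 20):
        major, minor, prob = widest_ratio_within(n)
        print(f"n={n}: ratio up to {major}:{minor}, probability {prob:.3f}")
    for n in (4, 5, 6):
        print(f"n={n}: single-sex probability {single_sex_probability(n):.4f}")
```

Under these assumptions the most skewed splits are 8:2 for ten specimens and 14:6 for twenty, which correspond to the ratios of about 4:1 and 2.3:1 quoted in the Discussion. The single-sex probability is 0.0625 for five specimens and drops below 0.05 at six.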