AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 52:341-349 (1980)

Aspects of the Distribution of HbS in the United States

BRAXTON M. ALFRED
Department of Anthropology and Sociology, and Laboratory of Physical Anthropology, University of British Columbia, Vancouver, British Columbia, V6T 1W5

KEY WORDS HbS, Age, Sex, Stable populations, Fitness, Log-likelihood, Fertility

ABSTRACT Frequencies of HbS obtained by several screening clinics are analyzed for age, sex, and location effects. All seem to be present in some form, though age and sex effects may be conditional on location. An attempt is made to elaborate the common observation of increasing frequency with age. This is shown to be the result of differences in fertility favoring the normal. A simulation which includes 25% admixture was done. The results indicate a genetically relevant New World experience for the population to be about 9-12 generations with the heterozygote having fitness of 0.96-0.99.

Classical genetic theory predicts a decline over time in the frequency of an allele which is maintained solely by heterozygote advantage when the selective agent is no longer present. The evidence for P. falciparum malaria being central in maintaining the sickle cell polymorphism in West Africa is taken as convincing (Serjeant, '74). There the normal, AA, genotype has estimated fitness 0.89 relative to the heterozygote, AS (Cavalli-Sforza and Bodmer, '71).

In the sequel the odds for AS, defined as the ratio of the observed frequency of AS to that of AA, will be analyzed. The reasons for using odds rather than relative frequencies are that a simpler description and statistical analysis result, and that this ratio is conceptually close to a common definition of relative fitness. In West Africa the odds for AS are estimated as 5482/25374 = 0.2160 (Cavalli-Sforza and Bodmer, '71). The odds for AS in the United States are currently estimated as 2710/31037 = 0.0873.

The south-east coastal areas of the U.S.
are sub-tropical, becoming temperate inland. So P. vivax would be expected to predominate, but P. falciparum would not be uncommon on the coast (Russell, '52). Removal of West Africans to this area should alter the relative fitness of AA such that it is at least as great as that of AS for all but possibly the coastal areas.

In the West African context the superiority of AS is usually attributed to its ameliorating effect on mortality by protecting children against malaria until natural immunity is acquired. It has been recognized for some time that this effect alone may not be sufficient to account for contemporary observations of the frequency of AS; and now some evidence suggests an effect on fertility as well (Eaton and Mucha, '71). The magnitude of this effect, however, is thought to be small (Cavalli-Sforza and Bodmer, '71), and, I add, its direction unclear.

0002-9483/80/5203-0341$02.00 © 1980 ALAN R. LISS, INC.

By 1970 there were about [figure illegible] million Negroes in the U.S. of whom about 34 [sic] were native born (Thompson and Lewis, '65). Assuming the 1968 estimated generation time of 25.3 years (Keyfitz and Flieger, '68) to have been constant, then the gene pool of U.S. born Negroes has experienced at least seven generations in an environment relatively free of falciparum. It is commonly assumed that this experience is of the order of 10-12 generations.

Reed ('69) estimates the amount of white admixture currently present in U.S. Negroes to be about 25%. Assuming a constant rate of admixture, $\gamma$, the total amount at generation $k$ is proportional to $(\gamma + 1)^k = 1 + 0.25$, which may be solved for $\gamma = \exp[(1/k)\ln 1.25] - 1$. For example, when $k = 10$, $\gamma = 0.0226$ is the amount of admixture per generation, which over ten generations results in 25%. This effect is largely independent of fitness and the result is to accelerate the decrease in odds for AS.
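The per-generation admixture rate quoted above can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic in the text; the function name `admixture_rate` is introduced here for illustration and is not from the paper.

```python
import math


def admixture_rate(total_admixture, generations):
    """Per-generation rate gamma solving (1 + gamma)**k = 1 + total_admixture."""
    return math.exp(math.log(1.0 + total_admixture) / generations) - 1.0


# Rates for the range of generation counts considered in the paper (7-12).
for k in range(7, 13):
    gamma = admixture_rate(0.25, k)
    # The last column compounds gamma over k generations and should return 1.25.
    print(f"k = {k:2d}  gamma = {gamma:.4f}  compounded = {(1 + gamma) ** k:.4f}")
```

For k = 10 the rate rounds to the 0.0226 given in the text.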
Considering the sickling allele as a lethal recessive leads to the expectation for its frequency $p_t = p_0/(1 + t p_0)$ at time $t$, where $p_0$ is the starting frequency (Crow and Kimura, '70). So the genotypic odds for type AS at time $t$ are given by

$$\frac{2p_t}{1 - p_t} = \frac{2p_0}{1 + p_0(t - 1)}$$

Taking $p_0 = 0.09$ and $t = 10$ generations results in odds 0.10. However, when the 25% admixture is included the expected odds are near 0.05. Thus there is a discrepancy between observed and expected odds. One of the central concerns of this paper is the resolution of this difference. Before considering this, I address the logically prior topic of the fitness of the heterozygote. A common observation in the U.S. is that the trait frequency increases with age. If the major part of a fitness differential is due to mortality the frequency should decline with age. The analysis of these features of the current distribution leads to a new interpretation of the history of the trait as well as advancing understanding of its role in fitness under relaxed selection.

Received July 31, 1978; accepted June 20, 1979.

THE AGE EFFECT

In this section the data from screening projects are analyzed for age and other effects. The data from Florida are due to Wienker ('74); those from New York are due to Janerich et al. ('73); and those from Stockton are from a clinic conducted by the University of California, San Francisco (Alfred et al., '78). Table 1 presents the observed frequencies of AA and AS from these three sources. The Tampa data were generated by the screening program of the Hillsborough County Health Department which included both stationary and mobile clinics. The New York data were produced by the New York State Department of Health and exclude only New York City. The Stockton observations were made at a stationary clinic. Attention is restricted to the two genotypes AA and AS to reduce bias due to prior knowledge of genotype by the patient.
The age categories are those used by Janerich et al. ('73) to effect comparability. The minimum age considered is one year. In Figure 1 will be found the odds for AS by age category, sex, and location. As may be observed, there is considerable apparent variability around the overall expected odds. It should be noted that the mid-point of age category 1 is 5.5 years, of category 2 it is 15 years, of category 3 it is 25 years. The mid-point of category 4 is greater than 45 years.

TABLE 1. Observed frequencies of AA and AS at Tampa, New York, and Stockton by age and sex

Location (L):               Tampa            New York          Stockton
Genotype (G):             AA     AS        AA     AS        AA     AS
Sex (S), Age (A)
Male
  1 to 10 years         2714    274      3129    229        73     14
  10 to 20 years        1312    153      3874    293        45      2
  20 to 30 years         357     47       664     56        18      3
  >= 30 years            464     48       546     52        23      2
Female
  1 to 10 years         2915    256      3339    250        74   [missing]
  10 to 20 years        2168    251      4711    311        88   [missing]
  20 to 30 years        1190    134      1002     83        35   [missing]
  >= 30 years           1045    134      1202     97        49   [missing]

Fig. 1. Odds for AS by age category, sex, and location. (Panels: female, male; horizontal axis: age category, 1 to 4; vertical axis: odds, 0.000 to 0.200.)

METHODS

For details and extensions of the techniques used here refer to Goodman ('70, '71, '72), and Bishop et al. ('75). The four way frequency distribution in Table 1 can be represented by the general element $f_{ijkl}$, where $i = 1, 2, 3, 4$ indexes the 4 age categories (variable A), $j = 1, 2$ indexes male and female (S), $k = 1, 2, 3$ indexes the 3 locations (L), and $l = 1, 2$ indexes normal or sickler genotype (G). The variable G will be considered the dependent variable.

Various models, hypotheses, for the observations may be constructed by different combinations of the marginal frequencies. For example, the hypothesis that genotype is independent of age-sex-location jointly will be represented by the notation "ASL,G" and asserts that the observations can be adequately reproduced by knowing only the genotype frequencies and, separately, the joint frequencies of ASL. The hypothesis that genotype and age are related independently of ASL is represented by "ASL,GA". The assertion now is that one must know the genotype-age frequencies in addition to the ASL frequencies. And so on. The object of the procedure is to locate the simplest combination of marginal frequencies required to reproduce the data satisfactorily.

The procedures that have been developed for treating problems of this sort are directly analogous to analysis of variance without the assumptions of normality and homoscedasticity. When a term such as "GAS" appears in a model, the hierarchical principle implies that all the following terms are present: GAS, GA, GS, AS, G, A, S. The descriptive notation for the models will include only the highest order terms, e.g., GAS, required. To further simplify the notation the term "ASL" will be dropped from model descriptions even though it is present in all. The reason for holding ASL constant is to remove from further consideration the peculiarities due to the sampling frame. That is, the uniqueness of the age-sex distribution at each location is of interest only with regard to its effect on G, so that source of variability is statistically removed and subsequent tests are performed on the residual variability.

The test statistic used is the log-likelihood ratio chi-square (LLR) which has considerable advantages over the Pearson version. In Bishop et al. ('75) the statistic is named $G^2$, which, due to the obvious conflict in notation, will not be used.

RESULTS

An hypothesis is specified by dropping one or more of the possible terms. The parameter which weights the term is set to zero. The hypothesis is then tested by determining the log-likelihood ratio chi-square for goodness of fit. When a given term is assumed present the marginal frequencies specified by the descriptive notation are fixed.
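As an illustration of the test statistic, the following sketch computes the log-likelihood ratio chi-square for the simplest case, independence in a two-way table, where fitted frequencies have a closed form. It is not the multi-way fitting procedure used in the paper (that requires iterative fitting of the marginals named by each model). The function name `llr_chi_square` is hypothetical, and the counts are the Tampa male frequencies as read from Table 1.

```python
import math


def llr_chi_square(table):
    """Log-likelihood ratio chi-square, 2 * sum(obs * ln(obs / fitted)),
    for independence of rows and columns in a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            fitted = row_totals[i] * col_totals[j] / n
            if observed > 0:  # empty cells contribute nothing to the sum
                statistic += 2.0 * observed * math.log(observed / fitted)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Rows are the four age categories, columns are (AA, AS); Tampa males, Table 1.
tampa_males = [[2714, 274], [1312, 153], [357, 47], [464, 48]]
statistic, df = llr_chi_square(tampa_males)
print(f"LLR chi-square = {statistic:.2f} with {df} degrees of freedom")
```

A table whose rows are exactly proportional gives a statistic of zero, which is a convenient sanity check on the implementation.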
In Table 2, in the column headed "Model" are found only the highest order terms which are assumed to be present by hypothesis. A model, e.g., no. 6 in Table 2, specified as GA, GL (with ASL understood) may be read as the 3-way interaction of age-sex-location plus the 2-way interaction of genotype-age plus the 2-way interaction of genotype-location. Model no. 1 is the hypothesis of the independence of genotype and all other variables jointly; clearly it does not fit the data. And model no. 18 tests the effect of dropping the highest order interaction, i.e., GASL; as the probability of the fit without it is greater than 0.10, it is assumed to be unnecessary.

In a routine search of a table such as this, beginning with the simplest model and adding "one step" terms to improve the fit, i.e., reduce the chi-square value, is called the forward procedure; beginning with the most complex and dropping terms which are unnecessary in order to simplify the model is called the backward procedure.

Working forward through Table 2 it is observed that adding the GL interaction results in a significant improvement in fit over the independence model: chi-square has been reduced by 119.16 - 36.99 = 82.17 which with 23 - 21 = 2 degrees of freedom is highly significant. So, while the GL interaction alone may not be sufficient (chi-square = 36.99 with 21 degrees of freedom and probability 0.017), it should be included in some form in the final model. Adding the GA term reduces chi-square by 12.42 which, with 3 degrees of freedom, is significant at the 0.01 level. If now the GS term should be included, chi-square is reduced by 2.72 which, with 1 degree of freedom, is just significant at the 0.10 level. It is questionable whether the GS term should be included, but the forward procedure terminates either at model no. 6 or at model no. 8; i.e., there is no other term that will effect a significant reduction in chi-square.

Working backward from model no.
18, if the GSL term is dropped, model no. 15 results and chi-square is not significantly changed. Dropping A from the GAS term produces model no. 12 and again no significant increase. And dropping the GAL interaction from the GAL term (i.e., to model no. 8) causes a change of 9.33 in chi-square which, with 6 degrees of freedom, is not significant. And once more the problem of whether the GS term should be included or dropped (i.e., to model no. 6) arises. Simplicity would require that it be dropped. And the backward procedure, then, terminates at either model no. 8 or at model no. 6; i.e., no other term can be dropped which will not significantly increase the value of chi-square. (It should be mentioned that the backward path is not unique. The branching from model no. 15 could have gone to model no. 11 and thence to model no. 6 with equal validity, thereby obviating the need to consider the inclusion of the GS term.)

TABLE 2. Tests of all possible hypotheses regarding the odds for the AS genotype

                          Degrees of    Log-likelihood
      Model               freedom       ratio chi-square    Probability
  1   G                      23             119.16             0.000
  2   GA                     20             102.93             0.000
  3   GS                     22             118.65             0.000
  4   GL                     21              36.99             0.017
  5   GA, GS                 19             101.43             0.000
  6   GA, GL                 18              24.57             0.137
  7   GS, GL                 20              35.66             0.017
  8   GA, GS, GL             17              21.85             0.191
  9   GAS                    16             100.72             0.000
 10   GAS, GL                14              21.04             0.101
 11   GAL                    12              15.50             0.215
 12   GAL, GS                11              12.52             0.326
 13   GSL                    18              33.85             0.013
 14   GSL, GA                15              20.17             0.166
 15   GAS, GAL                8              11.70             0.165
 16   GAS, GSL               12              19.34             0.081
 17   GAL, GSL                9              11.36             0.252
 18   GAS, GAL, GSL           6              10.49             0.106

TABLE 3. Tests of all possible hypotheses regarding the odds for the AS genotype at each location separately

                 Degrees of       Stockton               Tampa               New York
     Model       freedom      LLR chi-square   P     LLR chi-square   P     LLR chi-square   P
  1. G              7              8.31      0.306       18.93      0.008        9.76      0.203
  2. GA             4              7.35      0.118        4.59      0.331        3.56      0.469
  3. GS             6              6.40      0.380       18.85      0.005        8.62      0.196
  4. GA, GS         3              5.72      0.126        3.68      0.296        1.96      0.581
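The forward and backward steps described above each compare two nested models by the difference in their LLR chi-squares, referred to a chi-square distribution on the difference in degrees of freedom. The sketch below reproduces those comparisons from the Table 2 entries using only the standard library. The helper names `chi2_sf` and `step` are introduced here for illustration; the tail probability is built from the standard recurrence for the regularized upper incomplete gamma function at integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)             # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))  # Q(1/2, h) = erfc(sqrt(h))
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


# (degrees of freedom, LLR chi-square) for selected models in Table 2.
table2 = {1: (23, 119.16), 4: (21, 36.99), 6: (18, 24.57),
          8: (17, 21.85), 12: (11, 12.52)}


def step(simpler, fuller):
    """Change in chi-square and df between two nested models, with its P value."""
    df_s, chi_s = table2[simpler]
    df_f, chi_f = table2[fuller]
    d_chi, d_df = chi_s - chi_f, df_s - df_f
    return d_chi, d_df, chi2_sf(d_chi, d_df)


for simpler, fuller in [(1, 4), (4, 6), (6, 8), (8, 12)]:
    d_chi, d_df, p = step(simpler, fuller)
    print(f"model {simpler} vs {fuller}: change = {d_chi:.2f}, df = {d_df}, P = {p:.3g}")
```

The four comparisons correspond to adding GL, adding GA, adding GS, and dropping the GAL interaction, in that order.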
So, to this point in the analysis, the parsimonious model is ASL, GA, GL. Note that an age effect is evident even at this level. The GS term, potentially of considerable interest, will not be retained at this level, but will reappear at lower levels as will be seen shortly.

If model no. 6, which is a "non-elementary" hypothesis (Goodman, '70) in the sense that it cannot be expressed in simple conditional probabilities, is accepted then we should consider each of the locations separately. Table 3 contains tests of all possible hypotheses at each of the locations. (When considering the data by location, the age-sex, AS, term is included in all models.) Only in the Tampa data is there reason to accept an hypothesis other than the independence of genotype and all other variables jointly, though it is noted that the GA term improves the fit significantly for New York and GS slightly improves the fit at Stockton. Thus the Tampa and New York data appear to include an age-genotype interaction. Consequently we may conclude that the presence of an age-genotype interaction in Table 1 is due to those sources, and the possible sex-genotype interaction is due mainly to the Stockton data.

Fortunately Dr. Wienker's data include information on birthplace. These frequencies are presented in Table 4, I. None of the possible unsaturated models provides an adequate fit of the observed odds for the AS genotype; thus the highest order interaction is required for these data. However, considering each of the three birthplaces separately (Table 4, II), note that for those born in Florida but not in Tampa, the model of independence fits: chi-square is 6.49 with 7 degrees of freedom and probability 0.484. For those born outside Florida the model which includes the age-genotype interaction is required: chi-square is 3.74 with 4 degrees of freedom and probability 0.442. Only for those born in Tampa is an age-sex-genotype interaction required.
Now we may conclude that the age-genotype interaction included in model no. 6 (Table 2) is due primarily to the "Tampa born," those "born outside of Florida" and New York subsets of the total; and the possible sex-genotype interaction in model no. 8 is due to the Tampa born and Stockton populations.

It should be pointed out that the age effect is present in the total data set. Detailed analysis of the recognizable subsets indicates the possible source of the effect, but in no way do these results invalidate concluding that the effect characterizes the U.S. distribution. The sex independent odds for the population born outside Florida are, for each of the four age categories: 0.10, 0.13, 0.12, 0.11. For New York the same odds are: 0.07, 0.07, 0.08, 0.09. For the Tampa born males the odds are: 0.09, 0.08, 0.14, 0.05; and for females: 0.07, 0.11, 0.08, 0.20. (Here there seems to be a tendency for the male odds to decrease and the female odds to increase with age.) Overall, the odds are 0.08, 0.08, 0.10, 0.10. It is clear by inspection that the effect is not a monotonic decline with age.

DISCUSSION

The locales producing a statistically significant age trend are New York, the "Tampa born" and "born outside Florida" subsets. In none of the data sets is there a suggestion of a decline other than the sex-conditional trend for Tampa born males. This will be discussed further.

Historically the black migration from the South to the Northeast was concentrated in the period before World War II. On the strength of the similarity of odds between "Tampa born" and New York, this migration would appear to have been from urban areas. The main westward movement occurred during and after the war. The similarity of odds between "Florida but not Tampa" and Stockton suggests that this was a movement of predominantly rural populations. Thus it appears unnecessary to invoke micro-environmental explanations for the observed differences among locales, except for the rural-urban difference in the Southeast. This difference has been adequately discussed and documented by many (Pollitzer et al., '70).

TABLE 4. Frequencies, I, of AA and AS observed, and tests of hypotheses, II, at Tampa by age, sex, and birthplace

I
Birthplace (L):              Tampa            Florida            Other
Genotype (G):             AA     AS         AA     AS         AA     AS
Sex (S), Age (A)
Male
  1 to < 10 years        717     65         89      8       1908    201
  10 to < 20 years       575     45         86     12        651     96
  20 to < 30 years       118     17         36      7        209     23
  >= 30 years             74      4         47      9        343     35
Female
  1 to < 10 years        837     61         72      5       2006    190
  10 to < 20 years       770     87        137     12       1261    152
  20 to < 30 years       236     19         94     11        860    104
  >= 30 years            125     25        121     13        799     96

II
Birthplace:                  Tampa                 Florida                 Other
Model*                LLR chi-square   P     LLR chi-square   P     LLR chi-square   P
  1. G       (7)          22.42      0.002        6.49      0.484       13.81      0.054
  2. GA      (4)          14.30      0.006        3.98      0.408        3.74      0.442
  3. GS      (6)          21.76      0.002        4.05      0.670       13.52      0.035
  4. GA, GS  (3)          13.94      0.003        0.26      0.967        2.47      0.480

* Figures in parentheses are degrees of freedom.

All locales are characterized by apparent sex effects but only in the "Tampa born" does this reach statistically detectable levels. There the marginal odds are 0.088 for males and 0.098 for females. If this were an X-chromosome mediated effect, one might expect the difference to be constant over all ages. It is not. So either the effect is more complex or it is a sampling artifact. If the effect were real and the same among locales one would expect similar results from all sites. This is not observed. Given the magnitude of the Tampa project one must invent complex sampling scenarios to reproduce the observations by sampling bias. These data cannot elaborate further. Janerich was able to exclude bias due to factors related to hematocrit level.

An observation which is not fully understood but commonly made in the U.S.
involves an increase in the frequency of type AS with age. This topic will now be considered. The regression of ln odds on age, ln(odds) = a + b(age), was done and the results are below:

parameter      a        b       r^2
             -2.57     0.01     0.71

In a demographically stable population, the proportion of people in the age range $a$ to $a + da$, $c(a)$, is given by

$$c(a) = b \exp(-ar)\, p(a)$$

where $b$ is the birthrate, $r$ the intrinsic rate of increase and $p(a)$ the probability of surviving from birth to age $a$ (Keyfitz, '68); $r$ is defined by $r = b - d$ where $d$ is the death rate. Writing $c_s(a)$ and $c_n(a)$ for the proportions of sicklers and normals respectively in the age range, the odds for AS, $\omega_s(a)$, are

$$\omega_s(a) = \frac{c_s(a)}{c_n(a)} = \frac{b_s \exp(-a r_s)\, p_s(a)}{b_n \exp(-a r_n)\, p_n(a)} \qquad (1)$$

Taking natural logs and rearranging

$$\ln \omega_s(a) = \ln\frac{b_s}{b_n} + [\ln p_s(a) - a r_s] - [\ln p_n(a) - a r_n] \qquad (2)$$

As a crude first approximation, assume that mortality, $q$, is a constant pressure acting equally at all ages. Then survivorship can be represented in the form $p(a) = (1 - q)^a$. The terms involving $p(a)$ above can now be written $\ln p(a) = a \ln(1 - q)$; then substituting $b - q = r$ into (2) the expression for ln odds becomes

$$\ln \omega_s(a) = \ln\frac{b_s}{b_n} + a\,\{[\ln(1 - q_s) + q_s] - [\ln(1 - q_n) + q_n] + (b_n - b_s)\} \qquad (3)$$

which is linear in $a$. Note that terms of the form $\ln(1 - q) + q$ are asymptotically zero as $0 \le q \le 1$ goes to zero. For $q < 0.01$, $\ln(1 - q) + q$ is of magnitude about $5 \times 10^{-5}$ or less and, for present purposes, may be set to zero. Keyfitz and Flieger ('68) estimate the U.S. nonwhite stable population from the 1960 observed population. There the death rate is estimated as 6.84 per thousand; with this estimate the terms under consideration become

$$\ln(1 - 0.00684) + 0.00684 \approx -2 \times 10^{-5}$$

This means that any slope in the $\ln \omega$ may be considered to be due to differences in birth rates; i.e.,

$$\frac{d \ln \omega_s(a)}{da} \approx b_n - b_s$$

So when $b_n > b_s$, i.e., when the birth rate of normals is greater than that for heterozygotes, the slope of the odds regressed on age is positive, which is somewhat counterintuitive.
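The claim that the slope of ln odds on age reduces to the birth rate difference can be checked numerically from the stable-population expression directly. The sketch below is only an illustration: the death rate 0.00684 is the figure quoted in the text, while the two birth rates passed in are arbitrary illustrative values chosen here, not estimates from the paper, and the function names are hypothetical.

```python
import math


def mortality_term(q):
    """The term ln(1 - q) + q, which is roughly -q**2 / 2 for small q."""
    return math.log1p(-q) + q


def log_odds(age, b_s, b_n, q_s, q_n):
    """ln of c_s(a) / c_n(a) with c(a) = b * exp(-r * a) * (1 - q)**a and r = b - q."""
    def log_c(b, q):
        r = b - q
        return math.log(b) - r * age + age * math.log1p(-q)
    return log_c(b_s, q_s) - log_c(b_n, q_n)


q = 0.00684                      # death rate quoted from Keyfitz and Flieger
b_n, b_s = 0.025, 0.015          # illustrative birth rates only (assumed)
slope = log_odds(1.0, b_s, b_n, q, q) - log_odds(0.0, b_s, b_n, q, q)

print(f"ln(1 - q) + q     = {mortality_term(q):.2e}")
print(f"slope per year    = {slope:.6f}")
print(f"b_n - b_s         = {b_n - b_s:.6f}")
```

With equal mortality in the two groups the mortality terms cancel exactly and the slope equals the birth rate difference; with unequal but small death rates the discrepancy is of the order of the mortality terms printed on the first line.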
The parameters estimated earlier may now be displayed as $\ln(b_s/b_n) = -2.57$, or $b_s/b_n = 0.08$, and $b_n - b_s = 0.01$. Keyfitz and Flieger ('68) estimate the total birth rate about 0.035. Now there is a system of three equations in two unknowns:

$$0.08\, b_n - b_s = 0$$
$$b_n - b_s = 0.01$$
$$b_n + b_s = 0.035$$

An approximate solution of this system exists: $b_n = 0.023$ and $b_s = 0.009$. The implication is that the birth rate among normals is about 2.5 times higher than among heterozygotes. This should not be taken as an exact estimate, however, due to the sources of error which were included as a result of simplification at three places.

First was the survivorship function. It seems quite reasonable to assume an exponential decay function to represent the force of mortality. Reality is not so simple unfortunately. The magnitude of the error introduced by this assumption is unknown but should not be unacceptably large relative to other sources.

Second was the assumption of demographic stability. The U.S. nonwhite population has not achieved its stable distribution yet. But again, making the assumption allows analysis to proceed without, it is hoped, distorting the outcome too greatly.

And third was the assumption that mortality is insignificant relative to fertility. This is probably the least serious. Consider the quantity in brackets in equation 3. Each of the first two terms is of the order of $10^{-5}$ in magnitude, as computed above, and their difference is smaller still. When combined with the last term, the birth rate difference, which is in the neighborhood of $10^{-2}$ or greater, it is seen that the effect of mortality can legitimately be assumed negligible.

The results obtained here strongly indicate that compensatory fertility favoring AS (Cavalli-Sforza and Bodmer, '71) is not occurring; quite the contrary. In the next section I shall investigate the historical dynamic of the change in frequency of the heterozygote.
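The paper does not say how the approximate solution of the three-equation system was obtained. One plausible reading, offered here only as an assumption, is an ordinary least-squares solution; the sketch below solves the normal equations for the two unknowns with the standard library and gives values that round to the figures reported above. The function name `least_squares_2` is hypothetical.

```python
def least_squares_2(rows, rhs):
    """Least-squares solution of an overdetermined system in two unknowns,
    obtained by solving the 2 x 2 normal equations with Cramer's rule."""
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    t1 = sum(a * y for (a, _), y in zip(rows, rhs))
    t2 = sum(b * y for (_, b), y in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


# Coefficients of (b_n, b_s) in the three equations, and their right-hand sides.
rows = [(0.08, -1.0),   # 0.08 b_n - b_s = 0
        (1.0, -1.0),    # b_n - b_s      = 0.01
        (1.0, 1.0)]     # b_n + b_s      = 0.035
rhs = [0.0, 0.01, 0.035]

b_n, b_s = least_squares_2(rows, rhs)
print(f"b_n = {b_n:.4f}, b_s = {b_s:.4f}, ratio b_n / b_s = {b_n / b_s:.2f}")
```

Note that the three equations are mutually inconsistent (the fitted ratio $b_s/b_n$ is far from 0.08), so any solution is a compromise among them and depends on how the equations are weighted.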
THE EVOLUTIONARY DYNAMIC WITH ADMIXTURE AND DIFFERENTIAL FERTILITY

In this section I will consider the effect of heterozygote fitness on the odds in the presence of admixture. A set of experiments with a deterministic simulator using discrete time (generations) was performed. The logic will be discussed below. The following constraints apply: 1) initial odds are 0.216 and terminal odds 0.087; 2) admixture occurs at a constant rate such that the total is 25%; 3) only the time periods beginning 7-12 generations ago are studied; 4) the homozygote for S has fitness 0.1. The object is to determine the functional relationship between heterozygote fitness and time within these bounds.

The assumption that initial odds for this process are identical to current odds in Africa is acknowledged to be questionable. The arguments against it are cogent but not compelling. Some starting point is required, however, and this particular one can at least be rationalized.

Admixture was assumed to be structured as one-way selective gene flow by equally increasing the frequencies of the matings AA x AA, AA x AS, AA x SS over that expected under a random mating regimen in accordance with the expression obtained earlier. It has been suggested that temporal variability in admixture would noticeably modify predictions. Consider the extreme case of all admixture occurring in generation 1. The equilibrium odds for AS change from $2q/p$ to

$$\frac{q(2 + \alpha p)}{p[1 + \alpha p(1 + q)]}$$

where $q$ is the frequency of the allele for type S, initially estimated as 0.09, $p = 1 - q$, and $\alpha$ is the admixture rate structured as described above and estimated as 1.25. Numerically the odds change from 0.20 to 0.14, about 30%. This result serves as a caution in interpreting predictions based on the assumption of a constant admixture rate. Likewise spatial variability in allele frequency with low interdemic flow, and selective admixture could generate equally severe disturbances.
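The numerical change quoted for the extreme case can be checked by direct evaluation. The expression coded below is the one stated in the text, read with $q = 0.09$ and admixture parameter 1.25; the function names are introduced here for illustration.

```python
def odds_random_mating(q):
    """Odds for AS under random mating, 2q / p with p = 1 - q."""
    return 2.0 * q / (1.0 - q)


def odds_single_pulse(q, alpha):
    """Odds for AS when all admixture falls in generation 1, per the text:
    q(2 + alpha p) / (p [1 + alpha p (1 + q)])."""
    p = 1.0 - q
    return q * (2.0 + alpha * p) / (p * (1.0 + alpha * p * (1.0 + q)))


before = odds_random_mating(0.09)
after = odds_single_pulse(0.09, 1.25)
print(f"odds before = {before:.3f}, after = {after:.3f}, "
      f"relative change = {(before - after) / before:.0%}")
```

Setting the admixture parameter to zero recovers $2q/p$, as it should.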
The only selective force presumed to be operating is differential fertility. Thus the product of the fitness of the partners in a mating is taken as the fertility of the pair. The random mating frequencies are obtained and modified by admixture. These matings then produce the offspring for the next generation.

The results of the experiments are presented here as ordered pairs, the first member of which, g, is the number of generations, and the second, f, is the fitness of the heterozygote: (7, 0.925), (8, 0.945), (9, 0.96), (10, 0.97), (11, 0.98), (12, 0.986). These pairs are to be read as "if the number of generations was g the heterozygote must have fitness f in order to change the odds from 0.216 to 0.087." There is no way to select among these alternatives with confidence at this time. Intuitively, however, motivated by previous results, a fitness for the heterozygote less than 0.925 would almost certainly have been recognized. So the conclusion is reached that the genetically relevant New World experience of this gene pool is about 9-12 generations with heterozygote fitness about 0.96-0.99.

A second set of experiments was conducted to investigate the effect of "compensatory fertility" (Cavalli-Sforza and Bodmer, '71). This factor, when included, was assumed to increase the fertility of AS only and so affected the fertility of matings involving this type by 4%. (Double heterozygote mating fertility was also increased by 4%.) For the period being investigated, the effect of admixture on the predicted odds was minor overall, being of the approximate magnitude of 0.004 to 0.001 lower than with no admixture. The effect of compensatory fertility with no admixture was about an order of magnitude larger, i.e., about 0.03 to 0.01 higher than without the differential. The combined effect of admixture and compensatory fertility was intermediate but expectably nearer the value obtained with the fertility differential alone.
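The general shape of such a simulator can be sketched as follows. This is not the author's program and makes no attempt to reproduce the reported pairs. It assumes Hardy-Weinberg starting proportions implied by the initial odds, takes pair fertility as the product of the partners' fitnesses as described above, and, as a simplification of the paper's mating-based admixture scheme, models admixture as an influx of AA individuals at a constant per-generation rate. All function and parameter names are hypothetical.

```python
import math

S_GAMETE = {"AA": 0.0, "AS": 0.5, "SS": 1.0}  # probability a parent transmits S


def next_generation(freqs, w_as, w_ss=0.1, gamma=0.0):
    """One generation: random mating, fertility = product of partner fitnesses,
    Mendelian offspring, then an influx of AA at rate gamma (assumed scheme)."""
    fitness = {"AA": 1.0, "AS": w_as, "SS": w_ss}
    offspring = {"AA": 0.0, "AS": 0.0, "SS": 0.0}
    for m, x_m in freqs.items():
        for f, x_f in freqs.items():
            weight = x_m * x_f * fitness[m] * fitness[f]
            s_m, s_f = S_GAMETE[m], S_GAMETE[f]
            offspring["SS"] += weight * s_m * s_f
            offspring["AS"] += weight * (s_m * (1 - s_f) + (1 - s_m) * s_f)
            offspring["AA"] += weight * (1 - s_m) * (1 - s_f)
    total = sum(offspring.values())
    mixed = {g: x / total for g, x in offspring.items()}
    mixed["AA"] += gamma                       # admixture adds AA only
    return {g: x / (1.0 + gamma) for g, x in mixed.items()}


def final_odds(generations, w_as, initial_odds=0.216, total_admixture=0.25):
    """Odds AS/AA after the given number of generations."""
    q = (initial_odds / 2.0) / (1.0 + initial_odds / 2.0)  # from 2q/p = odds
    p = 1.0 - q
    freqs = {"AA": p * p, "AS": 2 * p * q, "SS": q * q}
    gamma = math.exp(math.log(1.0 + total_admixture) / generations) - 1.0
    for _ in range(generations):
        freqs = next_generation(freqs, w_as, gamma=gamma)
    return freqs["AS"] / freqs["AA"]


for w in (0.95, 0.97, 0.99):
    print(f"10 generations, heterozygote fitness {w}: odds = {final_odds(10, w):.3f}")
```

To search for the fitness that brings the odds to a target value after a given number of generations, one could bisect on `w_as`, since the final odds increase monotonically with heterozygote fitness in this sketch.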
A third series of model experiments was performed with admixture rate assumed to be proportional to genotypic fitness such that the total was 25%. The rate of decay of predicted odds was only slightly greater than when admixture was assumed to be independent of fitness.

And finally a series similar to the first was done using the assumption that the fitness of the normal changed linearly from 0.9 to 1.0 over the specified number of generations. The results are: (7, 0.88), (8, 0.90), (9, 0.915), (10, 0.925), (11, 0.935), (12, 0.94). The basic assumption implies that the fertility of the normal is depressed in the malarial environment and recovers linearly. It will be noted that heterozygote fitnesses seem too low and consequently too much time is required.

The qualitative results of the simulation experiments are consistent with those obtained by analyzing the age distribution. Specifically, both indicate that the normal has greater fitness than the heterozygote. The difference is due to fertility and is apparently small, probably no larger than a 4% advantage for the normal.

CONCLUSIONS

The results obtained here were surprising to me initially. Clearly the reason is a bias with regard to the concept of fitness. Specifically when thinking of fitness as survival of individuals, attention is naturally directed to mortality. One then assumes equal fertility combined with selective mortality as the evolutionary dynamic. There is, however, little, if any, evidence for greater mortality among sickle cell heterozygotes than among normals. And, since the frequency apparently increases with age it follows that normals have higher mortality. But if that is so then the overall frequency should be increasing. And it is not.

The apparent paradox is an artifact of the initial bias. The sickle cell trait observations in the U.S. are characterized by decaying overall frequency combined with increasing frequency with age.
There is no mortality force that will produce these conditions in the presence of equal fertility. One is forced to the conclusion that the evolutionary dynamic is one involving a fertility differential favoring the normal combined with (effectively) equal mortality.

ACKNOWLEDGMENTS

I am deeply indebted to Dr. Malcolm Greig, U.B.C. Computing Centre for assistance with the statistical parts; and to U.B.C. for sufficient computing time to survive many false starts.

LITERATURE CITED

Alfred, B., M. Greig, and N. Petrakis (1978) Demographic effects on the distribution of some hemoglobin types, G6PD (A,B) and Duffy (A,B) among blacks in Stockton, California. Can. Rev. Phys. Anth. In press.
Bishop, Y., S. Fienberg, and P. Holland (1975) Discrete Multivariate Analysis. MIT Press, Cambridge, Mass.
Cavalli-Sforza, L., and W. Bodmer (1971) The Genetics of Human Populations. W.H. Freeman and Co., San Francisco.
Crow, J.F., and M. Kimura (1970) An Introduction to Population Genetics Theory. Harper and Row, New York.
Eaton, J., and J. Mucha (1971) Increased fertility in males with the sickle cell trait? Nature 231:456.
Goodman, L.A. (1970) The multivariate analysis of qualitative data: Interactions among multiple classifications. J. Am. Stat. Assoc. 65:226-256.
Goodman, L.A. (1971) The analysis of multidimensional contingency tables: Stepwise procedures and direct estimation methods for building models for multiple classifications. Technometrics 13:33-62.
Goodman, L.A. (1972) A modified multiple regression approach to the analysis of dichotomous variables. Am. Sociol. Rev. 37:28-46.
Janerich, D., J. Kelly, F. Ziegler, S. Selvin, I. Porter, J. Robinson, and R. Herdman (1973) Age trends in the prevalence of the sickle cell trait. Health Services Reports, U.S. Department of Health, Education and Welfare 88(9):804-807.
Keyfitz, N. (1968) Introduction to the Mathematics of Population. Addison-Wesley, Reading, Mass.
Keyfitz, N., and W. Flieger (1968) World Population. University of Chicago Press, Chicago.
Pollitzer, W., E. Boyle, J. Coroni, and K. Namboordiri (1970) Physical anthropology of the Negroes of Charleston, S.C. Hum. Biol. 42:265-279.
Reed, T. (1969) Caucasian genes in American Negroes. Science 165:762-768.
Russell, P. (1952) Malaria. Blackwell Scientific Pub., Oxford.
Serjeant, G. (1974) The Clinical Features of Sickle-Cell Disease. American Elsevier Publ., New York.
Thompson, W., and D. Lewis (1965) Population Problems, 5 ed. McGraw Hill Book Co., San Francisco.
Wienker, C. (1974) Geographical and age trends in sickle cell in Florida, the Deep South, and the United States. Paper presented to American Association of Physical Anthropologists, Amherst, Mass.