Empirical Comparison of Distance Equations Using Discrete Traits

MICHAEL FINNEGAN AND KEVIN COOPRIDER
Osteology Laboratory, Kansas State University, Manhattan, Kansas 66506

KEY WORDS Non-metric, Distance equations, Statistical analysis, Inverse sine transformation

ABSTRACT The use of the Grewal-Smith statistic in measuring biological distance among skeletal population samples has been questioned since it was first applied to human populations. Recently, in an attempt to stabilize the variance of the Grewal-Smith statistic for use with non-metric analysis, Sjøvold ('73) and Green and Suchey ('76) have introduced corrections and alternative transformations which may enhance the meaning of biological distance among population samples. Their recommendations improve the statistics for specific variable ranges, i.e., small sample size and low trait frequencies. Thirteen equations representing Grewal-Smith, Freeman-Tukey, Anscombe, and Bartlett transformations and/or corrections were compared using rank order correlation statistics on actual biological distances generated by real population data as presented in existing literature. Results from testing these actual distance models show little variation between equations for the population data sets used. Based on these findings, the distance model resulting from the Grewal-Smith statistic is not inferior to the more sophisticated models, although the latter may be superior by allowing specific improvements for small sample size and/or low trait frequencies.

Since the 1967 article by Berry and Berry, insight has been gained in studies of biological distance and the statistics utilizing this approach. The most widely used statistic has been that of Grewal-Smith, first suggested along with the work of Grewal ('62). The basis of the Grewal-Smith statistic is the transformation of observed frequencies of non-metric traits by using equation [1] (APPENDIX).
Once Berry and Berry ('67) used the Grewal-Smith statistic on human non-metric traits, a number of researchers developed variations of this statistic or used other statistics entirely. Most notable is the work of Lane and Sublett ('72), Birkby ('73), Buikstra ('72), Zegura ('73) and Finnegan ('72, '74a). Most of the above research utilized some variation of the Grewal-Smith statistic in an attempt to stabilize the variance and to maximize the information from the data, which were primarily incomplete in terms of numbers in the sample. Although some of these attempts were successful in reducing error in terms of sample size, they did not totally satisfy the theoretical problems other researchers saw in the use of the Grewal-Smith statistic. Notably, Zegura ('73) utilized Balakrishnan and Sanghvi's B² equation rather than the Grewal-Smith equation in order to analyze his data.

AM. J. PHYS. ANTHROP. (1978) 49: 39-46.

More recently, Sjøvold ('73) considered in depth the statistical approaches used in earlier materials. He suggested modifying the angular transformation of the frequencies to compensate for the magnitude of the frequency. He also suggested new ways of determining the variance of the mean measure of divergence for the Grewal-Smith statistic, and finally suggested a method of determining whether two samples from a population were significantly different. He also recommended a number of alternative transformations of the frequencies that would better stabilize the variance, among others the transformation of Anscombe ('48) and that of Freeman and Tukey ('50). Green and Suchey compared the various transformations by looking at the difference between the actual and assumed variance for the transformed frequencies.
They utilized, in addition to the Freeman and Tukey [13] and Anscombe [12] methods, a Bartlett transformation [2], which is simply a correction to the already existing Grewal-Smith transformation. They determined that the Freeman and Tukey [13] transformation should be used to transform trait frequencies in population comparisons.

The statistical work by Green and Suchey and by Sjøvold suggests that theoretical problems now exist in the comparison of tabulated frequencies where various transformations and statistics have been employed in the final Grewal-Smith statistic. This seemingly suggests that much of the previous work would have to be redone utilizing these new statistics. The purpose of this paper is to test empirically the transformations and equations which have most often been used, to ascertain whether they are similar enough to give us the same relative positional outcome for the mean measure of divergence among populations.

MATERIAL AND METHODS

By adapting a computer program designed by Finnegan ('72), we obtained "equation matrices" from the samples in four data sets. These sets were chosen on the criteria of availability, the number of samples employed, the number of traits scored in each sample, and the actual size of the samples concerned. The selected data sets include: (1) Finnegan's Northwest Coast data ('72) consisting of 15 samples and 42 cranial traits, with the sample size varying from 12 to 107; (2) four Southwest population samples studied by Birkby ('73) utilizing 48 cranial traits, with sample sizes ranging from 50 to 158 individuals; (3) Suchey's ('75) California samples utilizing 29 cranial traits, with 27 samples varying in size from 20 to 135 individuals; and (4) five Northwest Coast samples utilizing 30 infracranial traits, with sample sizes at 50 individuals for each population, as reported by Finnegan ('74b). In each of the above data sets, the N for any particular trait may be less than the N for the sample.
An example of an "equation matrix" appears in table 1. The groups of 13 numbers represent the distance computed between two corresponding samples using the 13 equations listed in the APPENDIX. The equations include the original Grewal-Smith statistic, four "improvements" by various authors, and other new transformations and equations (APPENDIX). The number in brackets listed with each equation is used when reference is required. Some equations were scaled up or down by a constant to provide distance measures of comparable magnitude. In some equations 0.001 was substituted for 0.0 frequencies so that division by 0.0 could be avoided.

TABLE 1
Distance matrix generated by the 13 equations listed in the appendix

Equation   P1-V1   P1-W50   P1-W78   V1-W50   V1-W78   W50-W78
(1)        0.060   0.057    0.044    0.084    0.080    0.059
(2)        0.016   0.023    0.008    0.012    0.000    0.010
(3)        0.051   0.051    0.038    0.072    0.069    0.051
(4)        0.034   0.038    0.028    0.051    0.051    0.037
(5)        0.017   0.028    0.011    0.028    0.022    0.015
(6)        0.123   0.120    0.105    0.145    0.142    0.122
(7)        0.724   0.687    0.525    1.003    0.965    0.710
(8)        0.342   0.342    0.345    0.397    0.340    0.350
(9)        0.008   0.009    0.006    0.010    0.009    0.008
(10)       0.055   0.050    0.041    0.067    0.063    0.052
(11)       0.015   0.014    0.011    0.021    0.020    0.015
(12)       0.059   0.047    0.187    0.033    0.198    0.202
(13)       0.043   0.027    0.168    0.014    0.156    0.174

These raw data are based on Birkby ('73), who utilized 48 cranial traits in four samples. Here we pooled the sexes using the trait frequency from the left side only.

Each equation produces a geometric model in N-1 dimensions, where N is the number of samples in the data set. For instance, the four Southwest samples produced a model of the form seen in figure 1, where the lines correspond to the distances between population samples.
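As a rough illustration of such a model, the three distances among P1, V1 and W50 given by equation [1] in table 1 can be laid out as a triangle in a plane. This is only a sketch, assuming the tabulated values are treated as ordinary Euclidean lengths; the function name `place_triangle` and the choice of P1 as the origin are illustrative and are not taken from the original program.

```python
import math

def place_triangle(d_ab, d_ac, d_bc):
    """Return 2-D coordinates for points A, B, C given their pairwise distances.

    A is put at the origin and B on the positive x-axis; C follows from the
    law of cosines. Raises ValueError if the distances cannot form a triangle.
    """
    x_c = (d_ab ** 2 + d_ac ** 2 - d_bc ** 2) / (2 * d_ab)
    y_sq = d_ac ** 2 - x_c ** 2
    if y_sq < 0:
        raise ValueError("distances cannot form a triangle")
    return (0.0, 0.0), (d_ab, 0.0), (x_c, math.sqrt(y_sq))

# Equation [1] distances from table 1: P1-V1, P1-W50, V1-W50.
p1, v1, w50 = place_triangle(d_ab=0.060, d_ac=0.057, d_bc=0.084)
for name, point in (("P1", p1), ("V1", v1), ("W50", w50)):
    print(f"{name}: ({point[0]:.4f}, {point[1]:.4f})")
```

Adding the fourth sample (W78) would need a third coordinate, which is why four samples give a model in three dimensions.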
Because the computed distances are relative rather than absolute, models with similar "shapes," arising from two different equations, can be said to be generating similar genetic distances. One method of measuring the similarity of models is to rank the distances generated by each equation, and then employ a nonparametric test on each pair of equations to determine the correlation between the ranks. High correlation indicates model similarity. We have chosen Spearman's rho as the statistic of comparison. By including data sets differing with respect to sample size, number of traits considered, and the number of samples involved, we should detect the effect of any or all of these variables on the resultant distances.

Fig. 1 Four models representing the distance displays produced by the equations [1], [3], [8], and [12], for Birkby's populations P1, V1, W50, and W78. These correspond with distances produced in table 1. Each model was constructed in three dimensions and vertically reduced to the two-dimensional plane defined by the points P1, V1 and W50. The scale for model [8] is reduced by a power of 10.

RESULTS

The results of the correlation analysis indicate a high degree of similarity among virtually all equations in all major data sets (tables 2, 3). The correlations are significant at the 0.001 level in 100% of the comparisons utilizing Finnegan's cranial data and Suchey's data (table 2). The Finnegan infracranial subgrouping, where sides and sexes were pooled, produced correlations significant at the 0.001 level in 100% of the pairings (table 3).

TABLE 2
Correlation matrices of distances generated by the 13 equations listed in the appendix

Equation 1 2 3 4 5 6 7 8 9 10 11 12 13
1 3 2 0.7908 0.9997 0.9747 0.8499 1.0000 1.0000 0.8262 0.9569 0.9855 0.9993 0.5880 0.5063 0.7028 0.9998 0.7131 0.7948 0.8372 0.9771 0.9801 0.8531 0.7908 0.9998 0.7908 0.9998 0.5391 0.8236 0.8467 0.9568 0.8014 0.9849 0.7852 0.9988 0.4976 0.5833 0.5252 0.5020
6 5 4 8 9 10 11 12 13 0.8524 0.5937 0.8515 0.7296 0.7129 0.8524 0.8524 0.9319 0.7560 0.9324 0.8479 0.8286 0.9319 0.9319 0.8078 0.9873 0.7423 0.9877 0.8796 0.8603 0.9873 0.9873 0.8624 0.9714 0.9999 0.7024 0.9997 0.8693 0.8505 0.9999 0.9999 0.8546 0.9319 0.9876 0.8644 0.9454 0.8723 0.9986 0.9995 0.8644 0.8644 0.7234 0.8392 0.8725 0.8641 0.6888 0.9957 0.6994 0.9408 0.9532 0.6888 0.6888 0.5851 0.7292 0.7229 0.6884 0.9454
7 0.8695 0.8509 1.0000 1.0000 0.9425 0.9525 0.7028 0.7028 0.8772 0.8591 0.9998 0.9998 0.9977 0.8696 0.8696 0.8509 0.8509 0.8862 1.0000 0.9747 0.8499 0.9747 0.8499 1.0000 0.7807 0.6310 0.8263 0.8263 0.9337 0.8845 0.9570 0.9569 0.9541 0.8550 0.9855 0.9855 0.9721 0.8460 0.9993 0.9993 0.5257 0.5207 0.5880 0.5880 0.4519 0.5156 0.5063 0.5063 0.7708 0.8522 0.8385 0.6116 0.5049 0.9739 0.9555 0.9875 0.5797 0.5999 0.5950 0.5306 0.5284 0.5112 0.9710

Correlations from Suchey's ('75) data are presented above the diagonal with correlations from Finnegan's ('72) cranial data below. In each case correlations are based on sexes combined using the left side only, and all correlations are significant at the 0.001 level.

TABLE 3
Correlation matrices of distances generated by the 13 equations listed in the appendix

Equation 1 2 3 4 5 6 7 8 9 10 11 12 13
1 0.9879 1.0000 1.0000 1.0000 1.0000 1.0000 0.9394 0.9758 0.9879 1.0000 1.0000 0.9879
2 3 1.0000 1.0000 1.0000 0.9879 0.9879 1.0000 0.9879 1.0000 0.9879 1.0000 0.9879 1.0000 0.9273 0.9394 0.9879 0.9758 1.0000 0.9879 0.9879 1.0000 0.9879 1.0000 1.0000 0.9879
4 5 6 7 8 1.0000 0.9429 1.0000 1.0000 1.0000 0.9429 1.0000 1.0000 1.0000 0.9429 1.0000 1.0000 0.9429 1.0000 1.0000 1.0000 0.9429 0.9429 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9394 0.9394 0.9394 0.9394 0.9758 0.9758 0.9758 0.9758 0.9879 0.9879 0.9879 0.9879 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9879 0.9879 0.9879 0.9879
9 0.8857 0.8857 0.8857 0.8857 0.9429 0.8857 0.8857 0.9394 0.9273 0.9394 0.9394 0.9273
1 0.9429 0.9429 0.9429 0.9429 0.8286 0.9429 0.9429 0.7714
0 1 1.0000 1.0000 1.0000 1.0000 0.9429 1.0000 1.0000 0.8857 0.9429
1 12 13 1.0000 1.0000 1.0000 1.0000 0.9429 1.0000 1.0000 0.8857 0.9429 1.0000 -0.0286 -0.0286 -0.0286 -0.0286 -0.0286 -0.0857 -0.0286 -0.0286 -0.3714 0.1429 -0.0286 -0.0286 1.0000 0.9879 0.9758 0.9879 0.9758 0.9879 1.0000 0.9879 1.0000 0.9879 -0.0286 -0.0286 -0.0286 -0.0857 -0.0286 -0.0286 -0.3714 0.1429 -0.0286 -0.0286 0.9879

Correlations below the diagonal are based on Finnegan's ('74b) infracranial data, while those above are based on Birkby's ('73) cranial data, sides and sexes combined in each case. See text for a discussion of the significance levels.

Birkby's data, comprising sexes and sides combined, produced correlations significant at the 0.001 level in 38.46% of the cases; an additional 32.05% were significant at the 0.01 level and an additional 2.56% were significant at the 0.05 level (table 3). The most aberrant subgrouping of the Birkby data (sexes combined, using the left side only) produced the correlations presented in table 4.
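The rank-order comparison behind these correlations can be sketched with the six distances of table 1, which come from the same Birkby subgrouping (sexes pooled, left side only). The snippet below is a minimal illustration and not the authors' program: the helper names `average_ranks` and `spearman_rho` are hypothetical, tied distances are given their average rank, and rho is computed as the Pearson correlation of the ranks, which may differ slightly from whatever tie handling the original analysis used.

```python
import math

def average_ranks(values):
    """Rank values from smallest to largest, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation between the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Distances from table 1, in the order
# P1-V1, P1-W50, P1-W78, V1-W50, V1-W78, W50-W78.
table1 = {
    1: [0.060, 0.057, 0.044, 0.084, 0.080, 0.059],
    3: [0.051, 0.051, 0.038, 0.072, 0.069, 0.051],
    12: [0.059, 0.047, 0.187, 0.033, 0.198, 0.202],
    13: [0.043, 0.027, 0.168, 0.014, 0.156, 0.174],
}

for a, b in ((1, 3), (1, 12), (12, 13)):
    print(f"rho[{a}] vs [{b}] = {spearman_rho(table1[a], table1[b]):.4f}")
```

With only six distances per equation, a single swap in rank order moves rho noticeably, which is one reason a data set with few samples yields fewer significant correlations.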
In table 4, 19.23% of the correlations were significant at the 0.001 level, an additional 5.13% were significant at the 0.01 level, and an additional 14.10% were significant at the 0.05 level. Cumulatively, over all data sets presented in the tables, 71.28% of all correlations were significant at the 0.001 level, an additional 7.44% met the 0.01 level and an additional 3.85% were significant at the 0.05 level. This suggests that any of the equations may be used with the assurance of obtaining very reasonable information, with the following possible exception. Analysis of Birkby's subset data ('73) (table 4) revealed that equations [2], [8], [12] and [13] give somewhat different information. The Birkby populations are all relatively large, as is the number of traits scored, although the number of populations is small. It is not known at this time whether one of the above factors, a combination of these factors, or some other variable(s) are affecting these correlations. The chance that this is a particularly unusual set of data is quite small, since each correlation pair computed from the Birkby sample (combined sex, combined sides, etc.) showed these equations to be somewhat divergent from the data sets of Finnegan and Suchey. The lack of significance may be due to the small number of samples in his study and the fact that Birkby utilized about 30% more traits than Finnegan ('72) used in his cranial study.

[TABLE 4: correlations for the Birkby ('73) subgrouping, sexes combined, left side only; entries not recoverable]

DISCUSSION

The primary purpose of this paper has been to test whether one or another equation appears to be more productive in terms of the empirical results which can be expected from one of the equations utilized.
Little real difference is found in the results from actual data using the above equations, even though the absolute numerical value of the distance between population samples varies with the equation used. It must be remembered that these distances are only relative to the other populations with which they were generated. The analysis of biological difference must then come from these distances in terms of ordering but not in terms of absolute value. It is interesting that although theoretical statistical arguments favor certain equations based on ratios of the assumed variance to the actual variance, in practice little difference is found. From this we conclude that (1) it does not matter which statistical equation is utilized when sample sizes are as large as the population samples utilized in this discussion; (2) the frequency of each trait can be large or small, corrected or uncorrected, without making any real difference in between-population comparisons, at least within the range of frequency magnitudes and sample sizes in this analysis; (3) any of these statistical equations can be used with confidence when a relatively large number of traits are scored for each skeleton, i.e., 25-45 traits; and (4) caution should be exercised when choosing a transformation for analysis of a few population samples, as seen in the Birkby data, although we have not defined limits based on the above data sets.

We can now adequately show that the population sample separation seen in this and previous studies indeed reflects something about biological separation and not numerical separation based on the type of statistic used.
This has been alluded to by a number of researchers, and biological separation fits well with linguistic separation using glottochronology and lexicostatistics (Finnegan, '72), and with other population studies, primarily metric studies of the skeleton (Cybulski, '73; Corruccini, '73, '74; Ortner and Corruccini, '76, etc.). Dental traits, both metric and non-metric, have been utilized to produce results similar to both skeletal metrics and skeletal non-metrics (Ortner and Corruccini, '76; Greene and Armelagos, '72). Blood studies have also shown separation similar to non-metric trait variation of the skeleton (Finnegan, '72; Hulse, '55). The distribution of material culture remains has also suggested divisions of population samples which correlate highly with both non-metric cranial traits and linguistic differences (Finnegan, '72, '74a; Hirsch, '54).

From the above we conclude the following: (1) Various forms of the Grewal-Smith statistic are quite similar and seem not to vary greatly by empirical testing, where angular transformations of the frequency data are utilized. No dependency is suggested on either the number of non-metric traits or the range in frequency of those non-metric traits when utilizing one of the above equations. (2) The argument for standardizing the variance in each of the equations seems not to be particularly important when applied to real populations. (3) The equation which seems most preferable to us is equation [5], for the following reason: not only does this statistic give a biological measure in numerical form between populations which is highly comparable by correlation with all other examples, but it presently seems to be the most widely used equation.

ACKNOWLEDGMENTS

The authors wish to acknowledge the program assistance of S. A. McGuire, and thank Ms. Lorraine Douglas for typing various drafts of this article.
The infracranial data were collected with the support of a grant from the Smithsonian Research Foundation Fellowship SFC-3-0875 and COAA Grant 4F0875.

LITERATURE CITED

Anscombe, F. J. 1948 The transformation of Poisson, binomial and negative-binomial data. Biometrika, 35: 246-254.
Berry, A. C., and R. J. Berry 1967 Epigenetic variation in the human cranium. Journal of Anatomy, 101: 361-379.
Birkby, W. H. 1973 Discontinuous Morphological Traits of the Skull as Population Markers in the Prehistoric Southwest. Ph.D. dissertation, University of Arizona.
Buikstra, J. 1972 Hopewell in the Lower Illinois River Valley: A Regional Approach to the Study of Biological Variability and Mortuary Activity. Ph.D. dissertation, University of Chicago.
Cavalli-Sforza, L. L., L. A. Zonta, F. Nuzzo, L. Bernini, W. W. W. de Jong, P. Meera Khan, A. K. Ray, L. N. Went, M. Siniscalco, L. E. Nijenhuis, E. van Loghem and G. Modiano 1969 Studies on African Pygmies. I. A pilot investigation of Babinga Pygmies in the Central African Republic (with analysis of genetic distances). American Journal of Human Genetics, 21: 252-274.
Constandse-Westermann, T. S. 1972 Coefficients of Biological Distance. Oosterhout: Anthropological Publications, New York: Humanities Press. viii + 142 pp.
Corruccini, R. S. 1973 Size and shape in similarity coefficients based on metric characters. Am. J. Phys. Anthrop., 38: 743-754.
- 1974 An examination of the meaning of cranial discrete traits for human skeletal biological studies. Am. J. Phys. Anthrop., 40: 425-446.
Cybulski, J. S. 1973 Skeletal Variability in British Columbia Coastal Populations: A Descriptive and Comparative Assessment of Cranial Morphology. Ph.D. dissertation, University of Toronto.
Finnegan, M. 1972 Population Definition on the Northwest Coast by Analysis of Discrete Character Variation. Ph.D. dissertation, University of Colorado, Boulder.
- 1974a A Migration Model for Northwest North America.
In: International Conference on the Prehistory and Paleoecology of Western North American Arctic and Subarctic. S. Raymond and P. Schledermann, eds. University of Calgary Press, pp. 57-73.
- 1974b Discrete non-metric variation of the postcranial skeleton in man. Am. J. Phys. Anthrop., 40: 135-136 (Abstract).
Fisher, R. A. 1925 Statistical Methods for Research Workers. Oliver and Boyd, London.
Freeman, M. F., and J. W. Tukey 1950 Transformations related to the angular and square root. Ann. Math. Stat., 21: 607-611.
Gaherty, G. 1974 Infracranial discrete traits in seven African populations. Am. J. Phys. Anthrop., 41: 480-481.
Green, R. F., and J. M. Suchey 1976 The use of inverse sine transformations in the analysis of non-metric cranial data. Am. J. Phys. Anthrop., 45: 61-68.
Greene, D. L., and G. J. Armelagos 1972 The Wadi Halfa Mesolithic Population. Research Report No. 11, Department of Anthropology, University of Massachusetts, Amherst, July 1972.
Grewal, M. S. 1962 The rate of genetic divergence of sublines in the C57BL strain of mice. Genetics Research, 3: 226-237.
Hiernaux, J. 1965 Une nouvelle mesure de distance anthropologique entre populations, utilisant simultanement des frequences geniques, des pourcentages de traits descriptifs et des moyennes metriques. C. R. Acad. Sci., Paris, 260: 1748-1750.
Hirsch, D. I. 1954 Glottochronology and Eskimo and Eskimo-Aleut prehistory. Amer. Anthropologist, 56: 825-838.
Hulse, F. S. 1955 Blood-types and mating patterns among Northwest Coast Indians. Southwestern Journal of Anthropology, 11: 93-104.
Lane, R. A., and A. J. Sublett 1972 Osteology of social organization: residence pattern. American Antiquity, 37: 186-201.
Malyutov, M. B., V. P. Passekov and Yu. G. Rychkov 1972 On reconstruction of evolutionary trees of human populations resulting from random genetic drift. In: The Assessment of Population Affinities in Man. J. S. Weiner and J. Huizinga, eds. Clarendon Press, Oxford.
Oliver, L. D., and W. W.
Howells 1960 Bougainville populations studied by generalized distance. Actes, VIe Congres Intern. des Sci. Anthrop. Ethnol., Paris, 1: 497-502.
Ortner, D. J., and R. S. Corruccini 1976 The skeletal biology of the Virginia Indians. Am. J. Phys. Anthrop., 45: 717-722.
Sanghvi, L. D. 1953 Comparison of genetical and morphological methods for a study of biological differences. Am. J. Phys. Anthrop., 11: 385-404.
Sjøvold, T. 1973 The occurrence of minor non-metrical variants in the skeleton and their quantitative treatment for population comparisons. Homo, 24: 204-233.
Spuhler, J. N. 1972 Linguistic and geographical distances in native North America. In: The Assessment of Population Affinities in Man. J. S. Weiner and J. Huizinga, eds. Clarendon Press, Oxford.
Suchey, J. M. 1975 Biological Distance of Prehistoric Central California Populations Derived from Non-metric Traits of the Cranium. Ph.D. dissertation, University of California-Riverside.
Zegura, S. 1973 A comparison of distance matrices derived from craniometric measurements and cranial observations. Paper presented at the 42nd annual meeting of the American Association of Physical Anthropologists, Dallas, Texas. Abstract in Am. J. Phys. Anthrop., 40: 157.

APPENDIX

In each of the following equations:

i = trait no. under summation
j = phenotype (allele) under summation
$N_1$ = total skulls, sample 1
$N_{1i}$ = skulls of sample 1 with observable trait i
$K_{1i}$ = count of positive observations for trait i
$p_{1i}$ = % of trait i in sample 1, $p_{1i} = K_{1i}/N_{1i}$
$p_{1ij}$ = % of trait i, observation j in sample 1
R = no. of traits for particular data set
$\theta_{1i}$ and $\theta_{2i}$ = transformation angles of the ith trait in the first and second samples

$\sum_{i=1}^{R} (\theta_{1i} - \theta_{2i})^2$, with $\theta_{1i} = \sin^{-1}(1 - 2p_{1i})$
Original Grewal-Smith formula, which they claimed allocated too much of the measure to random sampling error (Grewal, '62: pp. 229-230).

$\theta_{1i} = \sin^{-1}(1 - 2p_{1i})$
Grewal's adaptation with variance factor based on total population rather than individual trait population (Grewal, '62: pp. 229-230).

Adaptation by Finnegan of the CAB Smith equation in which the average number of individuals over all traits accounts for the variance factor (Finnegan, '72: p. 30).

Constandse-Westermann formula (from CAB Smith) basing the variance factor on individual trait population. This appears to be the most widely accepted formula at this time, though experiments with other transformations may show it deficient (Constandse-Westermann, '72: p. 119).

Oliver and Howells' use of the Fisher transformation, which Constandse-Westermann had some trouble deciphering from their original publication. The Fisher transformation is apparently well known and useful in other situations (Oliver and Howells, '60: pp. 498-500; Fisher, '25).

[Equation (7): a double sum over traits i and phenotypes j involving $\cos^{-1}$; remainder illegible]

$\sum_{i=1}^{R} (\theta_{1i} - \theta_{2i})^2$, with $\theta_{1i} = \sin^{-1}(1 - 2p_{1i})$ and Bartlett's correction of $p_{1i} = 1/(4n_{1i})$ when $p_{1i} = 0.000$, and $p_{1i} = 1 - 1/(4n_{1i})$ when $p_{1i} = 1.000$
Bartlett's correction for stabilizing the variance of observations (traits) whose percentage positive recordings were either 0.000 or 1.000 (Sjøvold, '73: pp. 224-226).

Malyutov et al. formula for evolutionary tree building, designed for situations of multiple alleles at a single locus, but perhaps suitable for non-metrics (Malyutov et al., '72: p. 50; Constandse-Westermann, '72: p. 105).

$\frac{10{,}000}{R} \sum_{i=1}^{R} \left( \frac{p_{1i} - p_{2i}}{p_{Hi} - p_{Li}} \right)^2$
Hiernaux's Δg distance utilized by Gaherty. We have followed the example of Gaherty in utilizing the range of the particular data set, rather than a world-wide range (Hiernaux, '65: pp. 1748-1750).

Gaherty's use of the very simple unmodified Euclidean distance (Gaherty, '74: p. 4).

Sanghvi's formula derived from his previous $\chi^2$ work. This equation is again devised for multiple allele use (Sanghvi, '53).

$\theta_{1i} = \sin^{-1}\left(1 - 2(K_{1i} + 3/8)/(N_{1i} + 3/4)\right)$
This is the same basic equation as Grewal-Smith [2] with substitution of the Anscombe transform and its variance (Sjøvold, '73: pp. 212-213; Anscombe, '48).

[Equation (11): illegible]
Cavalli-Sforza et al. distance equation designed for multiple alleles and binomial cases (Cavalli-Sforza et al., '69; Spuhler, '72).

$\sum_{i=1}^{R} \left[ (\theta_{1i} - \theta_{2i})^2 - \left( \frac{1}{N_{1i} + 1/2} + \frac{1}{N_{2i} + 1/2} \right) \right]$, with
$\theta_{1i} = \frac{1}{2}\sin^{-1}\left(1 - 2K_{1i}/(N_{1i} + 1)\right) + \frac{1}{2}\sin^{-1}\left(1 - 2(K_{1i} + 1)/(N_{1i} + 1)\right)$ (13)
This is similar to Grewal-Smith with insertion of the Freeman and Tukey transformation and its corresponding variance (Sjøvold, '73: p. 312; Freeman and Tukey, '50).
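The angular transformations named in this appendix can be compared numerically with a short sketch. The function names below are hypothetical, and the trait counts in the usage example are made up for illustration. The `mean_divergence` helper assumes the commonly used form in which the squared angle difference is reduced by 1/(N1i + 1/2) + 1/(N2i + 1/2) and then averaged over the R traits; the averaging, and any rescaling applied by the original program, are assumptions.

```python
import math

def theta_grewal_smith(k, n):
    """Plain angular transformation: arcsin(1 - 2p), with p = k / n."""
    return math.asin(1 - 2 * k / n)

def theta_bartlett(k, n):
    """Angular transformation with Bartlett's correction at p = 0 and p = 1."""
    p = k / n
    if p == 0:
        p = 1 / (4 * n)
    elif p == 1:
        p = 1 - 1 / (4 * n)
    return math.asin(1 - 2 * p)

def theta_anscombe(k, n):
    """Anscombe transformation: arcsin(1 - 2(k + 3/8) / (n + 3/4))."""
    return math.asin(1 - 2 * (k + 3 / 8) / (n + 3 / 4))

def theta_freeman_tukey(k, n):
    """Freeman-Tukey transformation: mean of two arcsine terms."""
    return (0.5 * math.asin(1 - 2 * k / (n + 1))
            + 0.5 * math.asin(1 - 2 * (k + 1) / (n + 1)))

def mean_divergence(sample1, sample2, transform=theta_freeman_tukey):
    """Average over traits of (theta1 - theta2)^2 minus the variance term.

    Each sample is a list of (k, n) pairs, one per trait. The variance term
    1/(n + 1/2) is the one paired with the Anscombe and Freeman-Tukey
    transformations, so only those transforms should be passed in.
    """
    total = 0.0
    for (k1, n1), (k2, n2) in zip(sample1, sample2):
        diff = transform(k1, n1) - transform(k2, n2)
        total += diff ** 2 - (1 / (n1 + 0.5) + 1 / (n2 + 0.5))
    return total / len(sample1)

# Made-up counts (positive observations, observable skulls) for three traits.
sample_a = [(5, 50), (20, 48), (0, 50)]
sample_b = [(15, 60), (22, 55), (3, 58)]

for name, fn in (("Grewal-Smith", theta_grewal_smith),
                 ("Bartlett", theta_bartlett),
                 ("Anscombe", theta_anscombe),
                 ("Freeman-Tukey", theta_freeman_tukey)):
    print(f"{name:14s} theta(0 of 50) = {fn(0, 50):.4f}")
print(f"divergence (Freeman-Tukey) = {mean_divergence(sample_a, sample_b):.4f}")
```

The four transformations differ mainly at the extremes of the frequency range, such as a trait with no positive observations, which is where the corrections discussed in the text are meant to act.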