Brief communication: Coexistence of two distinct patterns in the surname structure of Sicily
AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 120:195–199 (2003)

Brief Communication: Coexistence of Two Distinct Patterns in the Surname Structure of Sicily

Angelo Pavesi,1* Paola Pizzetti,1 Enzo Siri,2 Enzo Lucchetti,1 and Franco Conterio1

1 Department of Evolutionary and Functional Biology, University of Parma, I-43100 Parma, Italy
2 Department of Environmental Sciences, University of Parma, I-43100 Parma, Italy

KEY WORDS human population; surnames; migration rate; population sampling; multidimensional scaling

ABSTRACT The extent of variation in the migratory movements that occurred in Sicily was evaluated using surname data taken from the telephone directories of the 390 communes of the island. The surname distribution of each commune was linearized by a log-log transformation, and a significant fit to a linear regression model was found in almost all cases. Interestingly, the slope of the regression line appeared to be a sensitive indicator of the different level of isolation associated with each Sicilian commune. By this approach, two distinct groups of communes, showing a higher or lower degree of isolation, were obtained, and two independent analyses of the surname structure of Sicily were carried out. A first multidimensional scaling analysis, based on the more isolated communes, yielded evidence for a more ancient pattern, characterized by a geographical gradient along the east-west axis. The same analysis, addressed to the less isolated communes, instead highlighted a wide network of interactions between geographically distant zones of the island. The fitting of the surname distribution to the log-log model allowed for the detection of a narrow subset of 35 Sicilian communes, whose significantly higher degree of isolation was statistically proved by the parallelism test.
We believe that a genetic analysis focused on such specific zones of the island could reveal ancient patterns of differentiation, thus helping to answer the controversial question of the genetic history of Sicily. Am J Phys Anthropol 120:195–199, 2003. © 2003 Wiley-Liss, Inc.

Sicily is the largest island in the Mediterranean Sea, and for a long time it was the meeting point of ancient civilizations. The complex historical events that occurred in the past are likely stored in the genetic pool of the present-day Sicilian population. However, independent studies of the genetic structure of the island have led to conflicting results. Based on autosomal markers such as blood group or human leucocyte antigen (HLA) loci, Piazza et al. (1988) highlighted a genetic distinctiveness of east Sicily, which was explained as the result of a consistent introduction of genetic variants during Greek colonization (eighth century BC). In contrast, further studies on the geographical distribution of other genetic systems did not support the view of a west-east difference within the island's population (Rickards et al., 1992; Walter et al., 1997). The lack of a clear genetic gradient was explained as an effect of a long period of short-range migrations, beginning from the Middle Ages onward and causing gene flow between local populations (Rickards et al., 1998).

Surnames, whose transmission from father to children is roughly similar to that of genetic traits, have also been used to describe the population structure of Sicily. Guglielmino et al. (1991) analyzed the distribution of surnames taken from consanguineous marriages, and pointed out a gradient in the west-east axis of the island. A serious limitation of this approach lies in the fact that surnames are an inadequate marker to reconstruct ancient demographic events, because their use in Italy was institutionalized only at the end of the sixteenth century.
However, analyses of surname data, which are easier and less expensive to collect than data from genes, can offer insights into microevolutionary processes acting on human populations, such as genetic drift and migration (Allen, 1988; Piazza et al., 1987). According to this view, and based on the vast number of surnames from telephone directories, Rodriguez-Larralde et al. (1994) estimated the degree of microdifferentiation within each Sicilian province, and pointed out that geographic distance has an important effect on surname variation in Sicily. In the present work, we reanalyzed the surname data from Sicily, using an updated list of telephone users, with the aim of more accurately appreciating the extent of variation in the migration phenomena that occurred within the island's population. On the basis of a nonstandard statistical approach, the shape of the distribution of surnames was used as a sensitive indicator of the different level of isolation associated with each Sicilian commune. This information made possible a finer elucidation of the complex pattern of surname relationships between the various subregions of the island, which could explain the divergent conclusions previously drawn from the distributions of genetic polymorphisms.

Grant sponsor: National Research Council of Italy; Grant number: CNR 99.3801.PF36.
*Correspondence to: Angelo Pavesi, Department of Evolutionary and Functional Biology, University of Parma, Parco Area delle Scienze 11/A, I-43100 Parma, Italy. E-mail: firstname.lastname@example.org
Received 28 May 2001; accepted 18 February 2002.
DOI 10.1002/ajpa.10120
Published online in Wiley InterScience (www.interscience.wiley.com).
The surnames of all individual Sicilian subscribers, updated to 1994 and filtered by removing commercial customers, were extracted from the computerized records of the Italian Telephone Company (813,364 records accounting for 1,592,798 users with 79,717 different surnames). A list including the frequency of different surnames was obtained for each of the 390 communes. After the exclusion of the eight communes located on the minor islets, a total of 382 localities was analyzed with the log-log transformation model (Fox and Lasker, 1983; Barrai et al., 1989), which linearized the frequency of surnames over their occurrence. With this method, the surname structure of each Sicilian commune was described by a, the intercept of the regression line on the y axis, and b, the slope of the regression line (goodness of fit was tested by variance analysis). Sixteen communes, having an extremely low population density and showing a surname distribution poorly fitting the regression model, were not considered. The exclusion of a larger set of 54 communes was based on the fact that it contains large or industrial towns, whose surname structure is deeply affected by massive immigration from a wide surrounding area. Our interest was thus devoted to the 312 remaining communes, which exhibited a significant fit to the regression model in almost all cases (311 communes).

A first analysis, based on the Spearman rank coefficient, revealed a strong correlation (rs = 0.98) between parameter a and the number of surnames (S) found in a given commune, a quantity which, in turn, appeared to be associated with the total number of telephone users (N) in that commune (rs = 0.94). No correlation was found, in contrast, between parameter b and the number of surnames (rs = −0.03), indicating that the range of variation of the regression slope is largely independent of the size of the population.
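As an illustration of the log-log model described above, the following minimal Python sketch fits the regression line for a single commune. It assumes that the x axis is the logarithm of the occurrence k (the number of subscribers sharing a surname) and the y axis is the logarithm of the number of distinct surnames with that occurrence. The function name `loglog_fit`, the base-10 logarithm, and the ordinary least-squares fit are assumptions of this sketch, not details taken from the original analysis.

```python
import math
from collections import Counter


def loglog_fit(subscriber_surnames):
    """Return (a, b) for log10(n_k) = a + b * log10(k).

    subscriber_surnames: one surname per telephone subscriber.
    k   = number of subscribers sharing a surname (occurrence)
    n_k = number of distinct surnames with occurrence k
    """
    # Occurrence of each distinct surname.
    occurrence = Counter(subscriber_surnames)
    # Frequency spectrum: k -> number of surnames occurring exactly k times.
    spectrum = Counter(occurrence.values())

    xs = [math.log10(k) for k in spectrum]
    ys = [math.log10(n_k) for n_k in spectrum.values()]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct occurrence classes")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    b = sxy / sxx            # slope of the regression line
    a = mean_y - b * mean_x  # intercept on the y axis
    return a, b
```

Under these assumptions the intercept a is the fitted logarithm of the number of surnames occurring once, which is in line with its reported association with S, while the slope b does not depend on the base of the logarithm.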
Interestingly, parameter b appeared to be a valuable tool for estimating the extent of variation between the surname structures of the Sicilian communes under examination. This notion was suggested by a visual inspection of the surname distribution from the two communes with the highest (−0.46) and the lowest value (−1.83) of the regression slope.

Geraci Siculo (b = −0.46; see Fig. 1A) is a mountain village located in the High Madonie region of the province of Palermo. Its surname structure was characterized by a relatively low number of surnames represented only once (64 rare surnames) and a proportionally high number of the most frequent ones (28 surnames with a range of redundancy from 9–32). These features, combined with a rather homogeneous surname structure (743 telephone subscribers with only 157 different surnames), supported the hypothesis that Geraci Siculo experienced a long history of marked isolation.

Sant'Alessio Siculo (b = −1.83; see Fig. 1B) is a small town situated on the Mediterranean coast of the province of Messina. Compared to Geraci, it showed a strong increase in rare surnames (288 surnames occurring once) and a lower amount of the most abundant ones (7 surnames with a range of redundancy from 9–32). Since a high frequency of unique surnames is presumably due to recent immigration, the heterogeneous structure of Sant'Alessio (a total of 419 different surnames for 745 telephone subscribers) can be viewed as the result of short-range migrations of individuals, leading to a consistent gene exchange between local populations.

These observations suggest that increasing values of parameter b should reflect a gradual increase in the level of isolation in the 311 Sicilian communes under examination. By choosing the median value of b (−1.02) as threshold, two large groups of communes, each containing 155 entities, were obtained, and two independent analyses of the surname structure of Sicily were then carried out.
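The median split can be sketched in a few lines of Python. The function name `split_by_median_slope` and the dictionary layout are hypothetical. The sketch assumes that communes with b strictly above the median form the more isolated group and those strictly below it the less isolated one, so that with an odd number of communes (311) the commune sitting at the median falls in neither group, which would be consistent with two groups of 155.

```python
from statistics import median


def split_by_median_slope(slopes):
    """Split communes into two groups around the median slope b.

    slopes: dict mapping commune name -> regression slope b.
    Returns (more_isolated, less_isolated, threshold).
    """
    threshold = median(slopes.values())
    # Shallower (less negative) slope: fewer rare surnames, more isolation.
    more_isolated = sorted(c for c, b in slopes.items() if b > threshold)
    # Steeper (more negative) slope: many rare surnames, less isolation.
    less_isolated = sorted(c for c, b in slopes.items() if b < threshold)
    return more_isolated, less_isolated, threshold
```

For example, given the two extreme communes reported above plus one made-up commune with b = −1.02, Geraci Siculo lands in the more isolated group and Sant'Alessio Siculo in the less isolated one.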
Though surnames are relatively recent population markers, it is probable that cultural or genetic differences existing before the surname diffusion could persist in those communes with a b value above the median, since they include the more isolated localities of Sicily. The effects of isolation by distance should be extremely weak when considering the alternative set of communes (b below the median value), whose surname distribution suggests recent events of population admixture. If this premise is correct, we would obtain two different pictures of the population structure of Sicily, depending on the set of communes under examination.

As shown in Figure 2, the entire island was divided into 12 geographical regions. Provinces showing a large extension along the latitudinal or longitudinal axes were subdivided into a western and an eastern part (Palermo, Messina, and Agrigento) or a northern and a southern part (Catania). A joined area was obtained from the two small provinces situated in the extreme south corner of Sicily, i.e., Siracusa and Ragusa. The three remaining regions correspond to the provinces of Trapani, Caltanissetta, and Enna. Using the two groups of communes selected above, two distinct collections of surnames were obtained for each of the 12 regions (Table 1). The relationships between surname frequencies of these regions were evaluated by the correlation coefficient R (Chen and Cavalli-Sforza, 1983). Two distinct matrices of similarity were obtained, and some differences between them were apparent at a preliminary inspection.

Fig. 1. Log-log surname distribution and linear regression analysis of the two Sicilian communes showing the highest (Geraci Siculo, A) and lowest (Sant'Alessio Siculo, B) value of parameter b, the slope of the regression line.

Fig. 2. Twelve areas used to investigate surname structure of Sicily.
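The similarity step can be illustrated with a short Python sketch. The exact definition of R used by Chen and Cavalli-Sforza (1983) is not reproduced in the text, so the sketch assumes a plain Pearson correlation between the relative surname frequencies of two regions, computed over the union of their surnames. The function name `surname_correlation` and the input layout are hypothetical.

```python
import math


def surname_correlation(counts_a, counts_b):
    """Pearson correlation of relative surname frequencies of two regions.

    counts_a, counts_b: dict mapping surname -> number of subscribers.
    Assumption: a surname absent from a region has frequency 0 there.
    """
    surnames = sorted(set(counts_a) | set(counts_b))
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    freq_a = [counts_a.get(s, 0) / total_a for s in surnames]
    freq_b = [counts_b.get(s, 0) / total_b for s in surnames]

    n = len(surnames)
    mean_a = sum(freq_a) / n
    mean_b = sum(freq_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(freq_a, freq_b))
    var_a = sum((x - mean_a) ** 2 for x in freq_a)
    var_b = sum((y - mean_b) ** 2 for y in freq_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation undefined for constant frequencies")
    return cov / math.sqrt(var_a * var_b)
```

Applying such a function to every pair of the 12 regions would give a 12 × 12 similarity matrix, one for each group of communes.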
Referring to the matrix based on the more isolated communes, the extreme southern region of Sicily (Siracusa and Ragusa) showed a significant correlation with the southern part of Catania (R = 0.43), but a low correlation with all the remaining regions (R values from 0.21–0.29). The alternative matrix, conversely, revealed a good correlation of Siracusa and Ragusa not only with south Catania (R = 0.41) but also with more distant zones of Sicily, such as north Catania (R = 0.44), east Messina (R = 0.35), Enna (R = 0.36), and west Palermo (R = 0.35). This latter result could be due to the migration of individuals or small groups from the extreme south of the island to northeast Sicily, probably along the coast, or to northwest Sicily through the interior.

It is interesting to note that different patterns of migration were also inferred by analyzing the province of Enna, located in the heart of Sicily. A strong isolation was apparent from an inspection of the first matrix, since Enna showed a low degree of correlation even with the nearest provinces, such as Caltanissetta (R = 0.31) or south Catania (R = 0.30). Examination of the alternative matrix instead suggested a large network of interactions with western and eastern regions. In fact, Enna exhibited a good correlation with the two areas of Palermo (0.43 and 0.36) as well as all five regions in the eastern half of Sicily (R values ranging from 0.35–0.48).

TABLE 1. Number of different surnames and total number of telephone users in 12 regions of Sicily, as obtained from communes with a higher (1) or lower (2) degree of isolation

                     Group 1 (higher isolation)     Group 2 (lower isolation)
Region             Communes         S         N  Communes         S         N
Trapani                   9     4,608    40,549         7     1,653    10,033
West Palermo             19     5,235    43,688        21     5,139    31,798
East Palermo             18     2,571    20,267        10     4,375    26,779
West Agrigento           15     3,074    27,716         3     1,086     5,320
East Agrigento           14     2,962    29,853         3     1,526     6,512
Caltanissetta            15     2,595    21,366         2       926     3,906
Enna                      6     2,028    13,677        11     4,449    30,663
West Messina             17     2,568    16,348        18     4,365    27,410
East Messina             16     2,458    15,343        39     6,850    47,570
North Catania             7     2,238    14,582        24     9,389    74,935
South Catania             7     4,072    32,359         7     2,057    10,430
Siracusa-Ragusa          12     4,579    43,150        10     5,353    34,101
Total                   155    38,988   318,898       155    47,168   309,457

S, surnames; N, total number of telephone users.

A dual graphic representation of the surname structure of Sicily was then obtained with the nonmetric multidimensional scaling (NMDS) technique (Kruskal, 1964). A first NMDS analysis, which focused on the distance matrix based on the more isolated communes, yielded a map in which four clusters are easily recognizable (Fig. 3A). Each cluster is formed by contiguous regions: a southeastern cluster (Catania, Enna, and Siracusa-Ragusa), a northeastern cluster (Messina), a western cluster (Trapani, Palermo, and west Agrigento), and a southern cluster (east Agrigento and Caltanissetta). The main division provided by the projection of points on the first axis of the map was along an east-west direction, as pointed out by a significant correlation between coordinates on axis 1 and coordinates along the geographical latitudinal axis (rs = −0.75). A partial deviation from this pattern was found for Trapani and west Palermo, which were placed in close proximity to the center of the map.
Such a position, weakly concordant with a geographical location at the extreme west of Sicily, was ascribed to a relatively high degree of correlation with south Catania (R = 0.36 for west Palermo; R = 0.37 for Trapani), which was only slightly lower than those with the surrounding regions.

A different pattern of the surname structure of Sicily was provided by NMDS analysis of the distance matrix based on the less isolated communes. In this case, 8 of the 12 regions, albeit geographically distant, tended to form a cluster at the center of the map (Fig. 3B). The close relatedness of most of the regions on axis 1 of the map suggested long-range migratory movements involving the largest provinces of Sicily (Palermo, Messina, and Catania), as well as the extreme southern region (Siracusa-Ragusa) and the interior region (Enna). On the same axis, the different position of the two areas of the province of Agrigento was explained as the result of short-range migrations in western and eastern directions, respectively. In accordance with these observations, the correlation between coordinates on axis 1 and coordinates along the geographical latitudinal axis was statistically nonsignificant (rs = 0.20).

Fig. 3. Multidimensional scaling analysis of surname relationships between 12 areas of Sicily. A: Graphic map based on 155 more isolated communes. B: Graphic map based on 155 less isolated communes.

Rather than a general outline of the surname structure of Sicily (Rodriguez-Larralde et al., 1994), these results highlight the coexistence of an ancient substrate, characterized by a major east-west gradient, together with a more recent one, indicative of a dense network of contacts between local populations. Obviously, even a rough dating of the demographic processes associated with the more ancient substrate is a problematic task.
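The rank correlations reported for the two maps (rs between coordinates on axis 1 and geographic position) can be reproduced in principle with a Spearman coefficient. The sketch below is a generic pure-Python version, computed as the Pearson correlation of average ranks; the function names `average_ranks` and `spearman` are hypothetical, and the handling of ties by average ranks is an assumption, since the text does not say how ties were treated.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

Here `x` would hold the axis 1 coordinates of the 12 regions and `y` their positions along the east-west axis, listed in the same region order.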
As mentioned above, surnames are relatively young markers, and their distribution often reflects cultural variations, which are transmitted partly vertically, and partly horizontally (Cavalli-Sforza and Feldman, 1981). Though the cultural dichotomy between east and west Sicily, still evident today in many respects, had its origins during prehistoric times (Tusa, 2000), the hypothesis of a corresponding genetic differentiation should be tested by an accurate population sampling based on highly polymorphic genetic markers.

The last aspect of our study yielded valuable information about the criterion to be followed for the collection of population samples in Sicily. By using the parallelism test (test of equality of the regression coefficients), the slopes of the regression line within the group of the more isolated communes were compared with each other, and a narrow subset of 35 localities with a significantly higher degree of isolation was selected (data not shown). The deep isolation of these communes, evenly distributed across the island, was further supported by the fact that surnames represented only once covered a low fraction (about 10%) of the total amount of subscribers, whereas the 20 most frequent surnames accounted for a fraction often above 50%. Based on classic markers (e.g., blood groups) or molecular polymorphisms at the DNA level, a genetic analysis focused on such specific zones of the island should at least reveal geographic patterns of differentiation, thus helping answer the controversial question of the genetic history of Sicily.

ACKNOWLEDGMENTS

We are grateful to the Italian Telephone Company (SEAT) for allowing us to use their computerized records listing the surnames of their customers. The critical comments of the anonymous referees are also gratefully acknowledged.

LITERATURE CITED

Allen G. 1988. Random genetic drift inferred from surnames in Old Colony Mennonites. Hum Biol 60:639–653.
Barrai I, Formica G, Barale R, Beretta M. 1989. Isonymy and migration distance. Ann Hum Genet 53:249–262.
Cavalli-Sforza LL, Feldman MW. 1981. Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press.
Chen KH, Cavalli-Sforza LL. 1983. Surnames in Taiwan: interpretations based on geography and history. Hum Biol 55:367–374.
Fox WR, Lasker GW. 1983. The distribution of surname frequencies. Int Stat Rev 51:81–87.
Guglielmino CR, Zei G, Cavalli-Sforza LL. 1991. Genetic and cultural transmission in Sicily as revealed by names and surnames. Hum Biol 63:607–627.
Kruskal JB. 1964. Non-metric multidimensional scaling: a numerical method. Psychometrika 29:115–129.
Piazza A, Rendine S, Zei G, Moroni A, Cavalli-Sforza LL. 1987. Migration rates of human populations from surname distributions. Nature 329:714–716.
Piazza A, Cappello N, Olivetti E, Rendine S. 1988. A genetic history of Italy. Ann Hum Genet 52:203–213.
Rickards O, Biondi G, De Stefano GF, Vecchi F, Walter H. 1992. Genetic structure of the population of Sicily. Am J Phys Anthropol 87:395–406.
Rickards O, Martinez-Labarga C, Scano G, De Stefano GF, Biondi G, Pacaci M, Walter H. 1998. Genetic history of the population of Sicily. Hum Biol 70:699–714.
Rodriguez-Larralde A, Pavesi A, Scapoli C, Conterio F, Siri G, Barrai I. 1994. Isonymy and the genetic structure of Sicily. J Biosoc Sci 26:9–24.
Tusa S. 2000. Ethnic dynamics during pre- and proto-history of Sicily. J Cult Herit 1:17–28.
Walter H, Matsumoto H, Danker-Hopfe H, De Stefano GF, Rickards O. 1997. GM and KM allotypes in nine population samples of Sicily. Ann Hum Biol 24:419–426.