Dental Wear Scoring Technique

E. C. SCOTT
Department of Anthropology, University of Kentucky, Lexington, Kentucky 40506

KEY WORDS Dental wear · Inter-observer reliability · Dental attrition · Ordinal scales

ABSTRACT An ordinal dental attrition scoring technique for molar teeth involving a quadrant system is applied to three Amerind skeletal samples. Molar teeth are visually divided into four sections and each section scored on a 1-10 scale. The score for the whole tooth is the sum of the quadrant scores and ranges from 4-40. Scores are based on the amount of enamel in each quarter of the tooth. The method’s reliability is demonstrated by a paired comparison type of ANOVA, for both intra- and inter-observer repeated measurements. Because the method is reliable, easy to use, and produces data with lower variances than a 1-8 system (such as Molnar’s, ’71), it is recommended for use in the principal axis technique for the analysis of wear data presented in the accompanying paper.

The study of dental attrition and culture has not progressed as far as it might, largely because (as Wilkinson pointed out in 1972) the area lacks “formalized methodology,” specifically of data collection and data analysis.

The biggest problem in the analysis of dental wear data is avoiding the correlation of age and wear. This problem has been met (hopefully successfully) by shifting the focus to rate of wear rather than degree of wear (Smith, ’72, ’76; Walker, ’78; Scott, ’79). Scott has discussed why the principal axis analysis method is more satisfactory than others (Scott, ’79). For the principal axis analysis to perform most satisfactorily, however, the data should be either interval level or the best approximation which an ordinal scale can give. The present article presents an ordinal scoring procedure for data collection that is reliable and that better approximates interval level data than other available methods.
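The arithmetic described in the abstract (four quadrants, each scored 1-10, summed to a whole-tooth score of 4-40) can be sketched in a few lines of Python. This is only an illustration: the function name `tooth_score`, the ordering of the quadrants, and the example values are assumptions, not part of the published method, and the sketch covers only quadrants that can actually be scored.

```python
QUADRANT_MIN, QUADRANT_MAX = 1, 10  # per-quadrant range stated in the abstract


def tooth_score(quadrants):
    """Sum four quadrant scores (each 1-10) into a whole-tooth score (4-40)."""
    if len(quadrants) != 4:
        raise ValueError("a molar is scored in exactly four quadrants")
    for q in quadrants:
        if not (isinstance(q, int) and QUADRANT_MIN <= q <= QUADRANT_MAX):
            raise ValueError(f"quadrant score {q!r} is outside 1-10")
    return sum(quadrants)


# Hypothetical molar; the order of the quadrants is arbitrary here.
example = (5, 6, 7, 8)
print(tooth_score(example))  # 26
```

Rejecting out-of-range values, rather than silently clamping them, is a design choice of the sketch; it keeps every recorded whole-tooth score inside the 4-40 range.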
THE TECHNIQUE

Molnar’s well known dental wear scoring system involves a 1-8 ordinal scale based upon the number of dentine patches and the amount of secondary dentine present on the occlusal surface of a tooth (Molnar, ’71). My system is also ordinal, but involves a 4-40 scale based upon the amount of enamel present on the occlusal surface of the tooth.

AM. J. PHYS. ANTHROP. (1979) 51: 213-218.

There are philosophical as well as practical differences in the two methods. The variable in which we are properly interested in the study of occlusal attrition is the amount of enamel left on the tooth, as the amount of enamel is the best indicator of the functional life of the tooth. Because secondary dentine provides a “second enamel,” to some extent we may be interested in the amount of secondary dentine; however, not all individuals in a population exhibit secondary dentine. It is important to focus upon the amount of enamel present, rather than the amount of secondary dentine present.

I suggest the following procedure for recording occlusal attrition of molars: Visually divide the occlusal surface of the molar into four equal quadrants. Score each quadrant according to the 1-10 scale found in table 1. The sum of the four quadrants is the score for that tooth.

The primary consideration in the scoring procedure is the amount of enamel in the quadrant. After the major occlusal features have worn off (scores 1-4) the amount of dentine exposure relative to the amount of enamel present in the quadrant is considered. Each quadrant is then scored according to the following categories: Worn patch covers one-fourth of quadrant or less (5); worn patch greater than one-fourth of quadrant area but still completely surrounded by enamel (6); enamel only partially surrounds worn patch, being present on two “flanks” or “sides” of the patch (7); enamel occurs on only one side of the worn patch (usually on the outer rim) and the enamel is thick to medium on this edge (8); only a thin strip of enamel is present in the quadrant, and it may be worn through at one or more places (9); no enamel is left on any part of the quadrant (10).

TABLE 1
Attrition scoring technique

Score   Description
0       No information available (tooth not occluding, unerupted, antemortem or postmortem loss, etc.)
1       Wear facets invisible or very small.
2       Wear facets large, but large cusps still present and surface features (crenulations, noncarious pits) very evident. It is possible to have pinprick size dentine exposures or “dots” which should be ignored. This is a quadrant with much enamel.
3       Any cusp in the quadrant area is rounded rather than being clearly defined as in 2. The cusp is becoming obliterated but is not yet worn flat.
4       Quadrant area is worn flat (horizontal) but there is no dentine exposure other than a possible pinprick sized “dot.”
5       Quadrant is flat, with dentine exposure one-fourth of quadrant or less. (Be careful not to confuse noncarious pits with dentine exposure.)
6       Dentine exposure greater: more than one-fourth of quadrant area is involved, but there is still much enamel present. If the quadrant is visualized as having three “sides” (as in the diagram) the dentine patch is still surrounded on all three “sides” by a ring of enamel.
7       Enamel is found on only two “sides” of the quadrant.
8       Enamel on only one “side” (usually outer rim) but the enamel is thick to medium on this edge.
9       Enamel on only one “side” as in 8, but the enamel is very thin, just a strip. Part of the “edge” may be worn through at one or more places.
10      No enamel on any part of quadrant; dentine exposure complete. Wear is extended below the cervicoenamel junction into the root.

THE DATA

The scoring system was applied to three collections of American Indian skeletal material. One collection, Indian Knoll (15 OH 2, Kentucky), has very heavy attrition.
The other collections showed considerably lower occlusal attrition, and were from a Ft. Ancient culture site in Kentucky (Hardin village site, 15 GP 22) and a Mississippian culture site in Missouri (Campbell site, 23 PM 5). The collections were chosen because of their relatively large sizes and because they represented individuals from cultures following different subsistence strategies. Although there is the possibility that the Indian Knoll site is multicomponent (Robbins, ’77), most of the prehistoric Indian Knoll individuals were likely hunters and collectors with a concentration upon riverine shell fish. The Ft. Ancient and Mississippian sites represent a food producing, horticultural strategy with complex social organization and stratification.

RELIABILITY TESTS

Both intra- and inter-observer reliability tests were performed. I scored a series of 102 Indian Knoll molar teeth, waited a week, and then rescored the same specimens (ECS-ECS). Since Indian Knoll is a collection showing considerable attrition, this comparison tested the reliability of the method in the upper ranges of the scale. To test the reliability of the method in the lower ranges of the scale, I scored 93 Hardin site teeth, waited seven days, and rescored the same specimens.

Inter-observer reliability was tested by myself and a graduate student (ECS-RBT) whom I trained in the scoring procedure. We practiced together for two sessions, looking at a representative sample of teeth from all 10 wear categories. We scored a new series of 88 teeth independently, on different days and in one another’s absence.

Means, standard deviations, and correlation coefficients on the pairs of data (ECS x ECS for the intra-observer test and ECS x RBT for the inter-observer test) were calculated. Because this sort of reliability test is a matched pairs design, either a matched sample t-test or paired comparison type of ANOVA could be performed. As the two methods are equivalent, I present the more informative ANOVA results (Sokal and Rohlf, ’69: p. 330). The results of the intra- and inter-observer tests are presented in tables 2 and 3.

TABLE 2
Reliability tests of method: intra-observer tests (ECS-ECS)

              Indian Knoll            Hardin Village
              1st run    2nd run      1st run    2nd run
Mean          24.32      24.01        13.94      14.03
s.d.          7.64       7.54         7.40       7.95
N             102        102          93         93
r                  0.98                    0.96

ANOVA tables

Indian Knoll
Source          df     SS            MS         F
Measurements    1      4.711         4.711      0.283
Teeth           101    10,389.496    102.866    6.1889**
Remainder       101    1,678.789     16.621
Total           203    12,072.996

Hardin Village
Source          df     SS            MS         F
Measurements    1      0.435         0.435      0.120
Teeth           92     1,089.952     11.847     3.278*
Remainder       92     332.565       3.614
Total           185    1,422.952

TABLE 3
Reliability tests of method: inter-observer tests (ECS-RBT)

              RBT        ECS
Mean          19.19      19.10
s.d.          7.52       8.37
N             88         88
r                  0.98

ANOVA table

Source          df     SS           MS          F
Measurements    1      0.09         0.09        0.0301
Teeth           87     10,889.75    125.1695    41.9792
Remainder       87     259.410      2.9817
Total           175    11,149.25

Following Jamison and Zegura (’74), correlation coefficients are presented, although they are not sufficient in themselves to indicate high reliability (in the sense of measurement precision). Because the correlation coefficient reflects how closely the two sets of measurements are varying, “a high positive correlation between the results of two investigators who measured the same group of subjects could mean either that they obtained essentially the same results or that the values covaried in a systematic fashion” (Jamison and Zegura, ’74: p. 200). I suggest the high correlations shown in the intra-observer reliability tests (ECS-ECS) support a conclusion of high precision of measurement.
Examination of the close means and variances of the two sets of measurements supports this conclusion, as did an examination of the differences between the first week’s and the second week’s scores. All differences (and there were few) appeared random. The F ratios for measurements in the ANOVAs (table 2) were likewise nonsignificant, adding to the confidence that the scoring procedure could be performed with precision by one individual at two different times.

Similarly close means and variances were found in the inter-observer comparisons, but the matched sample t-test and the paired comparisons analyses of variance both rejected the null hypothesis of no difference between the ECS and RBT scores (data not presented). Upon examining the sets of data it became clear that one of the observers (RBT) consistently scored teeth slightly lower than the other observer. The difference in the means of the two observers, combined with the large sample sizes of these comparisons, almost ensured findings of significant differences with even small differences between the sets of scores. A correction factor of 1 unit/tooth (i.e., 1 out of a possible 40) was subsequently added to all of RBT’s scores. This constant did not affect the variance, of course, but did affect the mean. This corrected RBT-ECS ANOVA demonstrated (as did the ECS-ECS comparisons) no significant differences between the paired observations (table 3).¹

DISCUSSION

Because different parts of the tooth (buccal, lingual, mesial, distal) wear differently, scoring by quadrants allows a more accurate reflection of the amount of enamel present on the occlusal surface than do procedures where the whole tooth is viewed. The variability in the data is thus better reflected with this technique. My experience in scoring teeth leads me to believe that the amount of enamel lost in each successive class is approximately equal, though I would be reluctant to declare an amount or percent of enamel lost.
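Two pieces of arithmetic behind the reliability tests above can be sketched in Python. First, in a paired-comparison ANOVA the F ratio for measurements equals the square of the matched-sample t. Second, adding a constant correction to one observer's scores shifts the mean without changing the variance. The function names and the five pairs of scores are hypothetical, chosen only for illustration; they are not the data of tables 2 and 3, and the sketch assumes a plain two-way layout (teeth by measurements) without replication.

```python
import math
from statistics import mean, variance


def paired_anova(run1, run2):
    """Paired-comparison (teeth x measurements) ANOVA for two scoring runs."""
    n = len(run1)
    if n != len(run2) or n < 2:
        raise ValueError("need two equal-length runs covering at least two teeth")
    grand = (sum(run1) + sum(run2)) / (2 * n)
    run_means = (mean(run1), mean(run2))
    tooth_means = [(a + b) / 2 for a, b in zip(run1, run2)]

    # Sums of squares for the two main effects.
    ss_meas = n * sum((m - grand) ** 2 for m in run_means)     # df = 1
    ss_teeth = 2 * sum((t - grand) ** 2 for t in tooth_means)  # df = n - 1

    # Remainder: what is left after removing tooth and run effects (df = n - 1).
    ss_rem = sum(
        (x - t - m + grand) ** 2
        for a, b, t in zip(run1, run2, tooth_means)
        for x, m in ((a, run_means[0]), (b, run_means[1]))
    )
    ms_rem = ss_rem / (n - 1)  # zero if the runs differ only by a constant
    return {
        "ss": (ss_meas, ss_teeth, ss_rem),
        "F_measurements": ss_meas / ms_rem,
        "F_teeth": (ss_teeth / (n - 1)) / ms_rem,
    }


def paired_t(run1, run2):
    """Matched-sample t statistic on the per-tooth differences."""
    diffs = [a - b for a, b in zip(run1, run2)]
    return mean(diffs) / math.sqrt(variance(diffs) / len(diffs))


# Hypothetical whole-tooth scores (4-40) for five teeth and two observers.
observer_a = [10, 14, 22, 30, 18]
observer_b = [11, 14, 21, 31, 20]

anova = paired_anova(observer_a, observer_b)
t = paired_t(observer_a, observer_b)
# Compare the F ratio for measurements with the squared paired t.
print(anova["F_measurements"], t ** 2)

# Adding a constant to one observer's scores moves the mean, not the variance.
shifted = [score + 1 for score in observer_a]
print(mean(observer_a), mean(shifted))
print(variance(observer_a), variance(shifted))
```

Because a constant shift leaves the variance untouched, such a correction changes only the measurements line of the ANOVA; the teeth and remainder sums of squares stay the same.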
The 4-40 scale presented here approximates an interval scale, in my view, better than other published scales. The principal axis method of analysis, mentioned in the beginning of the article and the topic of the accompanying paper, works best when the scale approximates an interval level. That alone makes the 4-40 scale preferable for data collection if one is analyzing the data by the principal axis technique, but there are other advantages as well.

The principal axis technique, like other least squares techniques, is sensitive to variance. Confidence intervals are larger when the data exhibit higher variances. Tied ranks (duplicate scores) can inflate the variance and therefore extend the confidence regions of the principal axis slope if such ties occur at a distance from (rather than close to) the mean.

To test whether the 4-40 scale produces data with fewer tied ranks than Molnar’s 1-8 scale, the same specimens were scored with both methods. The data are presented in table 4. The Scott system clearly produces fewer duplicate scores than the Molnar system, as would be expected. Confidence regions for the data scored with the 4-40 scale are narrower than those for data scored with the 1-8 scale, as discussed in the accompanying paper.

TABLE 4
Comparison of numbers of M1M2 pairs in tied ranks scored on 4-40 and 1-8 systems

                     Maxillae                               Mandibles
                     No. pairs   No. pairs in tied ranks    No. pairs   No. pairs in tied ranks
Indian Knoll
  Scott system       29          2                          27          3
  Molnar system      30          21                         26          16
Campbell Site
  Scott system       34          6                          36          10
  Molnar system      34          22                         36          21
Hardin Site
  Scott system       30          7                          18          6
  Molnar system      27          21                         18          12

CONCLUSIONS

The dental wear scoring method presented here is designed to supplement Molnar’s procedure.
Whether a researcher would use my method or Molnar’s would depend upon the objectives of the research; as the accompanying article illustrates, my procedure for expressing degree of wear is preferable to Molnar’s if the objective is a principal axis analysis of dental wear, or by extension, any analysis dealing with rate of wear rather than central tendency expressions of degree. Molnar’s scoring system includes direction of wear and type (shape) of wear as well as degree of wear. Someone interested in these other characteristics of wear could simply substitute the 4-40 system for Molnar’s 1-8 degree of wear variable category.

It must be pointed out that this procedure is useful for molars (though possibly modifiable for premolars) and anyone interested in degree of wear studies in which different parts of the dentition are compared would be advised to utilize the same scale throughout all tooth groups.

The wear score categories are quickly learned, and even a relatively inexperienced observer can score a complete dentition in three to five minutes. Experienced observers can score a dozen or so complete dentitions in about a half hour.

¹ In addition to demonstrating the high reliability of the technique, the reliability comparisons illustrate some interesting facts about the use of the correlation coefficient as an indicator of reliability. In the inter-observer comparisons, the high correlation coefficient reflected precision of measurement (both sets of observations were internally reliable) but a consistent (if correctable) source of error. The correlation coefficient as the sole indicator in inter-observer reliability is therefore ambiguous, as Jamison and Zegura (’74) pointed out. Nonetheless, it is probably a good intra-observer reliability measure. The reason for this statement is something of a probability statement itself. If an observer were making observations consistently, using the same criteria on each specimen (“reliably”), a high positive correlation coefficient would occur between observations in the same set of data measured at two times. It is not likely if the observer is truly consistent in his/her choice of landmarks or characteristics that a systematic bias would inflate the correlation coefficient; random errors would not. Thus a high positive correlation coefficient is a good indicator of reliability in intra-observer comparisons. In either intra- or inter-observer comparisons, a perusal of the data for systematic biases would add confidence to the decision.

ACKNOWLEDGMENTS

I thank the Museum of Anthropology of the University of Kentucky and the Anthropology Museum of the University of Missouri for kindly allowing me access to their skeletal collections. A great deal of thanks goes to Mr. Robert B. Tincher, graduate student in the Anthropology Department at the University of Kentucky, for assistance in the inter-observer reliability analysis. Useful comments by B. R. DeWalt are gratefully acknowledged.

LITERATURE CITED

Jamison, P. L., and S. L. Zegura 1974 A univariate and multivariate examination of measurement error in anthropometry. Am. J. Phys. Anthrop., 40: 197-204.
Molnar, S. 1971 Human tooth wear, tooth function and cultural variability. Am. J. Phys. Anthrop., 34: 175-189.
Robbins, L. M. 1977 The Story of Life Revealed by the Dead. In: Biocultural Adaptation in Prehistoric America. Southern Anthropological Society Proceedings, No. 11. R. L. Blakely, ed. University of Georgia Press, Athens.
Scott, E. C. 1979 Principal axis analysis of dental attrition data. Am. J. Phys. Anthrop., 51: 203-212.
Smith, P. 1972 Diet and attrition in the Natufians. Am. J. Phys. Anthrop., 37: 233-238.
Smith, P. 1976 Dental pathology in fossil hominids: What did Neanderthals do with their teeth? Curr. Anthro., 17: 149-151.
Sokal, R. R., and F. J. Rohlf 1969 Biometry. Freeman, San Francisco.
Walker, P. L. 1978 A quantitative analysis of dental attrition rates in the Santa Barbara Channel Area. Am. J. Phys. Anthrop., 48: 101-106.
Wilkinson, R. G. 1972 Comment on Molnar’s “Tooth Wear and Culture.” Curr. Anthro., 13: 521.