Latent variable approach to the measurement of physical disability in rheumatoid arthritis.
Arthritis & Rheumatism (Arthritis Care & Research)
Vol. 51, No. 3, June 15, 2004, pp 399–407
DOI 10.1002/art.20404
© 2004, American College of Rheumatology

ORIGINAL ARTICLE

Latent Variable Approach to the Measurement of Physical Disability in Rheumatoid Arthritis

AGUSTÍN ESCALANTE,1 INMACULADA DEL RINCÓN,1 AND JOHN E. CORNELL2

Objective. To measure physical disability in rheumatoid arthritis (RA) using a latent variable derived from a generic and a disease-specific self-reported disability instrument and an observer-assessed functional status scale.

Methods. Consecutive patients with RA completed the modified Health Assessment Questionnaire (M-HAQ) and the Short Form 36 (SF-36) physical function scale. An observer assigned a Steinbrocker functional classification. We used principal component factor analysis to extract a latent variable from the 3 scales. We used the Bayesian Information Criterion to compare how well the new latent variable and the 3 primary scales fit the criterion standards of current work status; vital status at 6 years; grip strength; walking velocity; the timed-button test; pain; and joint tenderness, swelling, and deformity.

Results. Complete data were available for 776 RA patients. The extracted latent variable explained 75% of the variance in the 3 primary scales. On a scale of 0–100, higher scores representing less disability, its mean ± SD was 56.4 ± 22.5. Correlation between the latent variable and the M-HAQ was −0.87; between the latent variable and SF-36 physical function scale was 0.89, and between the latent variable and Steinbrocker class was −0.85. Multivariate models that included the latent variable fit better than did models containing the primary scales for the criteria of current working; death by 6 years; pain; joint tenderness, swelling, or deformity; grip strength; walking velocity; and timed button test.

Conclusion.
A latent variable derived from the M-HAQ, the SF-36 physical function scale, and the Steinbrocker functional class provides a parsimonious scale to measure physical disability in RA. The fit of the latent variable to comparison standards is equivalent or superior to that of the primary scales.

KEY WORDS. Physical disability; Rheumatoid arthritis; Disease-specific health measures; Generic health measures; Outcome assessment; Factor analysis.

Supported by an Arthritis Investigator Award and a Clinical Science Grant from the Arthritis Foundation; NIH grants RO1-HD37151, K23-HL004481, K24-AR47530; and grant M01-RR01346 for the Frederic C. Bartter General Clinical Research Center.
1 Agustín Escalante, MD, Inmaculada del Rincón, MD: The University of Texas Health Science Center at San Antonio; 2 John E. Cornell, PhD: The University of Texas Health Science Center at San Antonio and the South Texas Veterans Administration Health System, San Antonio, Texas.
Address correspondence to Agustín Escalante, MD, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, Texas 78229-3900.
Submitted for publication October 5, 2002; accepted in revised form August 1, 2003.

INTRODUCTION

The rheumatic disease process frequently leads to physical disability (1). Researchers aiming for a better understanding of rheumatoid arthritis (RA) outcome must first quantify it in a meaningful and reliable way. However, physical disability is a hypothetical construct, i.e., it was put together by scientists to explain the decline in the ability to perform physical activities that can occur in RA and other diseases (2). This implies that physical disability cannot be directly observed or measured, and as such, it can be considered a latent variable (3). Available measurement tools to assess physical disability in RA indirectly tap into the underlying construct (3).
To measure disability in RA, researchers have a variety of instruments and scales from which to choose (4). Some of these are considered arthritis-specific because they center on outcomes more immediately relevant to arthritis (5–7). Generic scales, on the other hand, measure more global outcomes and are suitable for studying a diversity of diseases (8). Each has its own set of advantages (9–12). Some empirical studies, however, have not found major differences in performance between the 2 types of scales (13,14). Because the choice of one type over the other is not clear cut, some authorities reasonably advocate including both an arthritis-specific and a generic outcome measure in RA trials (9,15). This recommendation has the added benefit that 2 or more measurement tools will provide a more reliable representation of the underlying construct (16).

However, not much attention has been given to how to report the results of studies that include 2 or more outcome measures of the same construct. The option of describing results on both scales separately in the same, or different, reports has certain disadvantages, including the need to conduct separate, parallel analyses; a greater potential for type I errors due to multiple comparisons; the added space needed to show results fully; enticement to duplicate publication; and problems of interpretation if results on the different scales diverge. When the last of these occurs, investigators may be lured into simply omitting results on the scale that do not fit their hypotheses. These potential problems, theoretical or real, can be averted by a data-reduction process aimed at estimating the underlying latent variable, conserving or enhancing information provided by the various scales.

We have confronted some of the above dilemmas during an ongoing study of the disablement process in RA.
We selected 2 self-report scales, one generic and the other disease-specific, and an observer-derived classification system to assess the extent of physical disability in RA. In this article, we describe the data-reduction process we utilized to derive a parsimonious, single variable representing the construct of physical disability. We also show evidence of its equivalence, or superiority, to the 3 primary scales.

PATIENTS AND METHODS

Patients. From 1996 to 2000 we enrolled patients meeting the 1987 American College of Rheumatology (formerly American Rheumatism Association) RA criteria (17) into a study of the disablement process in RA (18). We have described our sample in previous publications (18–21). The study's acronym, ÓRALE (Outcome of Rheumatoid Arthritis Longitudinal Evaluation), matches a Mexican American idiom for "Let's go!" Here we will show cross-sectional results obtained during the recruitment evaluation of each participant.

Data collection procedures. Our study was approved by the Institutional Review Board of each of the recruitment centers and all patients gave written, informed consent. A physician or a research nurse, assisted by a trained research associate, conducted evaluations at the clinic where the patient was recruited. The evaluation lasted approximately 90 minutes and consisted of a comprehensive interview, physical examination, review of available medical records, laboratory tests, and radiographs. Interviews were conducted in either English or Spanish, as preferred by patients.

Demographics. We ascertained age, sex, and race/ethnicity by self report, as described previously (20).

Musculoskeletal examination. A physician or research nurse, trained in joint examination techniques, assessed 48 joints in each patient for the presence or absence of tenderness or pain on motion, swelling, or deformity, as described elsewhere (22).

Pain.
We asked patients to rate the amount of pain they experienced due to their arthritis during the past week on a graded, horizontal 10-point scale that has been validated in our patient population (23).

Performance-based functional measures. We measured grip strength with a hand-held JAMAR dynamometer (Sammons Preston, Bolingbrook, IL). In a sitting position, with the elbow held at 90° and the forearm supported on a flat horizontal surface, patients were asked to squeeze the handle with as much strength as possible. Three repetitions from each hand were recorded in kilograms. The mean value of all repetitions for both hands is shown. Walking velocity was measured with patients starting in a standing position. They were asked to walk at their usual pace for a distance of 50 feet, or 25 feet if they had difficulty covering the full distance. No effort was made to conceal the stopwatch used to time the patients. Results are expressed in feet per second. Patients unable to walk were assigned a velocity of 0 feet per second. Patients were timed as they donned and fastened the front buttons in a standard 8-button shirt (Wal-Mart, San Antonio, TX). Results are expressed as buttons per second. Patients unable to don the shirt were assigned a value of 0 buttons per second.

Physical disability measures. We used 3 instruments to measure physical disability. The disability index of the modified Health Assessment Questionnaire (M-HAQ) is a self-administered, arthritis-specific instrument that asks respondents to rate the amount of difficulty they have performing 8 activities (dressing, getting out of bed, lifting a cup, walking, bathing, bending, turning faucets, and getting in and out of a car) on a scale ranging from 1 to 4 (without difficulty, with some, with much, and unable) (24). We used a cross-culturally equivalent Spanish version for our Spanish-speaking patients (23).
The Short Form 36 (SF-36) physical functioning scale (SF-36PF) is an interviewer-administered generic instrument (8). The SF-36PF asks respondents to rate the amount of limitation caused by health on 10 physical activities (vigorous activities; moderate activities; carrying groceries; climbing several or 1 flight of stairs; bending; kneeling or stooping; walking more than a mile; walking several blocks or 1 block; bathing; and dressing). Respondents rate each activity on a 3-level scale (a lot of difficulty, a little, no difficulty). Individual responses were summed, and the sum was rescaled to range. The Steinbrocker functional classification was used by the physician or research nurse, who was trained in physical function assessment, to rate the extent of physical disability on a 4-level scale, ranging from class I, "complete functional capacity to carry out all usual duties without handicaps," to class IV, "largely or wholly incapacitated with (the person) bedridden or confined to wheelchair" (25). We used each of these 3 scales as intended when they were originally developed, scoring them as recommended by their original authors.

Work status. We asked patients to describe their current work status from among the following answers: working full or part time, retired, student, housewife, unemployed/laid off, or disabled/unable to work. We used these responses for 2 sets of analyses: For the first, we classified patients as working (full or part time) versus not working (all others); for the second, we classified patients as disabled/unable to work versus all others.

Vital status. We have recontacted the patients at yearly intervals since their initial evaluation. For patients with whom we were not able to establish contact, even through family members, we searched publicly available death registries. We obtained a death certificate for all patients who died.

Statistical analysis.
We performed a principal component factor analysis using the composite summary scores of the M-HAQ, SF-36PF, and the Steinbrocker functional class, and then extracted the first principal component from the unrotated factor loadings using the least squares regression method (26). We rescaled the extracted factor to range from 0 to 100 with a positive valence, higher values representing less disability.

To evaluate the degree of bivariate association between the new latent variable and other study variables with interval or ratio distributions, we used Pearson product moment correlation coefficients (27). For the Steinbrocker functional class, a 4-level ordinal scale, we used the square root of the multiple R2 from a regression model that included dummy variables for each Steinbrocker level instead of the Pearson coefficient. Differences between the coefficients were tested after Fisher z-transformation (28) using the procedure provided by Goldstein (29). Because this required us to perform a total of 21 correlation coefficient comparisons, we only considered coefficients to be significantly different if the comparison P value was ≤ 0.002, adjusted according to the Bonferroni technique (the conventional α = 0.05 ÷ 21 comparisons ≈ 0.002). To evaluate the latent variable's association with categorical criterion variables, we divided the latent variable into ordinal categories and used chi-square to test the strength of association (27).

We then evaluated the fit of multivariate models that included the new latent variable compared with models that included the primary variables. We asked the question: Does a multivariate model that includes the new latent variable fit the criterion standards better than models that include any of the primary variables? We included age and sex as covariates in all these multivariate models because they can have a strong influence on any of the criterion measures we used.
The general form of the models we compared was

y = a + b + pd

where y could be any of the criterion standards (working status, vital status, grip strength, etc.), a was age, b was sex, and pd was 1 of the 4 physical disability scales (M-HAQ, SF-36PF, Steinbrocker class, or the new latent variable). When y was a categorical variable, the model was a logistic regression; and when y was an interval or ratio variable, the model was ordinary least squares regression. We expected that the fit of a multivariate model including the new latent variable on any of the criterion standards would be equivalent or superior to the fit of models that include any of the 3 primary variables. We used the Bayesian Information Criterion (BIC) to confirm this expectation (30). The BIC varies inversely with a model's fit, and given 2 models, the one with the smaller or more negative BIC has better fit (30). We used Raftery's guidelines to interpret BIC differences between 2 models: A BIC difference >10 is considered "very strong" evidence in favor of the model with the smaller BIC; a difference of 6–10 is "strong;" 2–6 is "positive," and 0–2 is "weak" evidence (30). We performed all analyses on a desktop personal computer, using the Stata 7.0 software package (College Station, TX).

RESULTS

As expected for a group of people with established RA visiting a rheumatologist, most were women, median disease duration was 8 years, and rheumatoid factor was present in the majority (Table 1). The median number of 8 deformed joints indicates a substantial amount of joint damage (22). In accord with this finding, only 21% of the patients were working full or part time, and 27% stated they were unable to work. Of the 756 patients on whom we had followup information up to 6 years later, 71 were known to have died (9%).

Figure 1 is a diagram of the factor analysis we used to derive the physical disability latent variable.
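The extraction step can be illustrated with a minimal sketch, which is not the Stata procedure used in the study. It assumes simulated data, a NumPy eigendecomposition of the correlation matrix, and a min-max rescaling to 0–100 (the text does not state how the rescaling was done). The function names `extract_latent_score` and `simulate_scales`, and the `anchor_col` parameter, are hypothetical.

```python
import numpy as np


def extract_latent_score(scales, anchor_col):
    """Return (score_0_100, explained_fraction, loadings) for the first principal component.

    scales: array of shape (n_patients, n_scales), one column per disability scale.
    anchor_col: index of a column where HIGHER values mean LESS disability;
        it is used only to fix the arbitrary sign of the component.
    """
    x = np.asarray(scales, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)  # standardize each scale
    corr = np.corrcoef(z, rowvar=False)               # correlation matrix of the scales
    eigvals, eigvecs = np.linalg.eigh(corr)           # eigenvalues in ascending order
    top_val, top_vec = eigvals[-1], eigvecs[:, -1]    # first principal component
    if top_vec[anchor_col] < 0:                       # orient: higher score = less disability
        top_vec = -top_vec
    loadings = top_vec * np.sqrt(top_val)             # correlation of each scale with the component
    explained = top_val / eigvals.sum()               # share of total variance explained
    raw = z @ top_vec / np.sqrt(top_val)              # regression-method score (unit variance)
    # Assumption: min-max rescaling to the 0-100 range.
    score = 100.0 * (raw - raw.min()) / (raw.max() - raw.min())
    return score, explained, loadings


def simulate_scales(n=500, seed=0):
    """Hypothetical data: three noisy readings of one hidden trait (not study data)."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)
    noise = rng.normal(scale=0.5, size=(n, 3))
    # Columns mimic the direction of M-HAQ (-), SF-36PF (+), and Steinbrocker class (-).
    signs = np.array([-1.0, 1.0, -1.0])
    return ability[:, None] * signs + noise


score, explained, loadings = extract_latent_score(simulate_scales(), anchor_col=1)
print(f"variance explained: {explained:.2f}")
print("loadings:", np.round(loadings, 2))
```

Because the sign of a principal component is arbitrary, the sketch orients it by the column expected to rise as disability falls, which mirrors the positive valence described in the methods.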
The 3 primary variables, M-HAQ, SF-36PF, and Steinbrocker class, loaded strongly on a single factor, with loadings ≥0.8. This factor explained ≥75% of the primary variables' combined variance. Uniqueness values were <0.3 for each of the primary variables, indicating that these share more than two-thirds of their combined variance. We extracted the single factor without rotation, using linear regression scoring. Figure 2 shows probability distributions for the 3 primary scales and the latent variable. The Pearson correlation coefficients between the extracted latent variable and the primary variables, as expected, were also strong, with r values ≥0.8. Figure 3 shows scatterplots of the bivariate distribution of these variables.

The correlations between the latent variable and the criterion variables (pain, joint tenderness, swelling or deformity, grip strength, walking velocity, and the timed button test) are shown in Table 2, contrasted with the correlation coefficients between the primary scales and the same criterion standards. The latent variable had a significantly stronger correlation with most of the criterion standards than did the primary variables M-HAQ, SF-36PF, and Steinbrocker class. Notable exceptions were the correlation with the pain and articular examination variables, for which there was no significant difference between the M-HAQ and the latent variable. Also interestingly, the number of deformed joints correlated more strongly with Steinbrocker class than with any of the other physical disability scales.

Table 1. Clinical characteristics of the 776 RA patients studied*

Characteristic | No. with data available | Distribution
Age, median (range), years | 776 | 57 (19–90)
Male, no. (%) | 776 | 229 (30)
Ethnic group, no. (%) | 776 |
  White | | 272 (35)
  Black | | 53 (7)
  Asian | | 14 (2)
  Hispanic | | 431 (56)
  Other | | 6 (1)
Education, median (range), years | 772 | 12 (0–17)
Currently working, no. (%) | 776 | 166 (21)
Disabled for work, no. (%) | 776 | 213 (27)
Time from disease onset, median (range), years | 776 | 8 (0–52)
Tender joint count, no. (%) | 776 | 15 (13)
Swollen joint count, no. (%) | 776 | 7 (7)
Deformed joint count, no. (%) | 776 | 10 (11)
Nodules, no. (%) | 776 | 233 (30)
Rheumatoid factor positive, no. (%) | 770 | 682 (89)
Walking velocity, mean ± SD, meters/minute | 775 | 59 ± 25
Grip strength, mean ± SD, lbs | 776 | 14 ± 10
Button test, mean ± SD, buttons/minute | 769 | 7.1 ± 3.8
MHAQ, mean ± SD | 776 | 1.89 ± 0.70
SF-36, mean ± SD | 776 | 35.6 ± 27.87
Steinbrocker functional class, mean ± SD | 776 |
  I | | 163 ± 21
  II | | 383 ± 49
  III | | 190 ± 24
  IV | | 40 ± 5
Latent Disability Scale, mean ± SD, lbs | 776 | 56 ± 23
Deaths as of March 2002, no. (%) | 756 | 71 (9)

* RA = rheumatoid arthritis; MHAQ = Modified Health Assessment Questionnaire; SF-36 = Short Form 36.

Figure 4 shows the relationship between the latent variable and selected comparison criteria. These graphs show the association between higher values in the latent variable and graded decreases in the number of deformed joints, the proportion of disabled patients, and the proportion of those who died within 6 years. Conversely, performance-based functional measures (grip strength, timed button test, and walking velocity) displayed a proportional rise with increasing values on the latent scale, as did the probability of working full or part time.

Table 3 shows the BICs of models that contained age, sex, and each of the 4 disability scales (the M-HAQ, SF-36PF, Steinbrocker class, and the latent variable) as independent variables for each of the criterion standards. For most of the criterion standards, the BIC was smaller, indicating better fit, in the models that included the latent variable (Table 3). Notable exceptions, again, included the Steinbrocker class, whose model had a better fit versus the deformed joint count than did any of the other physical disability scales.
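How a pair of BICs translates into a grade of evidence can be sketched as follows. This is an illustrative helper, not part of the original analysis: the function name `grade_bic_difference` is hypothetical, and the handling of differences that fall exactly on a cut point (2, 6, or 10) is an assumption, since the quoted guideline ranges share their end points. The BIC values in the example are taken from Table 3.

```python
def grade_bic_difference(bic_latent, bic_primary):
    """Grade the evidence from two BICs with the cut points quoted from Raftery (30).

    Returns (favored, grade). The model with the smaller (more negative) BIC is favored.
    Assumption: a difference exactly equal to 2, 6, or 10 is given the lower grade.
    """
    diff = abs(bic_latent - bic_primary)
    if diff > 10:
        grade = "very strong"
    elif diff > 6:
        grade = "strong"
    elif diff > 2:
        grade = "positive"
    else:
        grade = "weak"
    if bic_latent == bic_primary:
        favored = "neither"
    elif bic_latent < bic_primary:
        favored = "latent"
    else:
        favored = "primary"
    return favored, grade


# BICs from Table 3: (criterion, primary scale, latent-variable BIC, primary-scale BIC)
comparisons = [
    ("Grip strength", "MHAQ", 170, 248),
    ("Death within 6 years", "MHAQ", -4791, -4782),
    ("Deformed joint count", "Steinbrocker class", 607, 519),
    ("Currently disabled", "SF-36PF", -4303, -4308),
]
for criterion, scale, bic_latent, bic_primary in comparisons:
    favored, grade = grade_bic_difference(bic_latent, bic_primary)
    print(f"{criterion} vs {scale}: {grade} evidence favoring {favored}")
```

With these values, the grip strength comparison grades as very strong evidence for the latent variable, whereas the deformed joint count comparison grades as very strong evidence for the Steinbrocker class, in line with the exception noted in the text.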
Likewise, there was positive evidence that the SF-36PF fit better in a model for disabled work status than did any of the other physical disability scales.

Figure 1. Diagram of the factor analysis we conducted to extract the latent variable measuring physical disability. The 3 primary variables are represented by squares and the circles represent information outside the latent variable. M.H.A.Q. = modified Health Assessment Questionnaire; SF-36 = Short Form 36.

Figure 2. Frequency distributions of the disability scales employed. Disability level decreases from left to right. A large proportion of patients had a score of 1 on the modified Health Assessment Questionnaire (M.H.A.Q.), indicating low disability levels on this scale (top left). However, the opposite is true for the Short Form 36 physical function (SF36PF) scale, in which the largest category is made up of patients with low scores, indicating high disability levels (top right). The Steinbrocker functional class provides only 4 levels to classify physical disability (lower left). The distribution of scores on the latent disability scale approached normality (lower right).

DISCUSSION

One desirable characteristic of research data is parsimony, or simplicity of explanation (31). Under this principle, one variable is preferable to 2 or more, providing that the single variable is as informative as the 2 or more. We have shown evidence that a single latent variable derived from principal component factor analysis of 3 scales, the M-HAQ, the SF-36PF, and the Steinbrocker functional class, has equal or superior performance to the primary scales, as manifested by an equal or stronger degree of association with the criterion standards we selected.

We used the disablement process as a theoretical framework to inform our selection of criterion standards (18,32,33), aiming to test the underlying physical disability construct from as many perspectives as possible.
Thus, our comparison criteria included key RA impairments, such as the amount of pain and the number of tender, swollen, and deformed joints (33). We also used measures of functional limitation, occupational status, and death within 6 years as criteria.

The correlation between the joint impairments and the latent variable was nearly always stronger than that between the same impairments and the primary disability scales (Table 2). This likely is due to the superior reliability of the latent variable, which is a composite of the 3 primary disability scales. This approach has been referred to as incomplete principal component regression because the variable of interest is provided by the first principal component in a factor analysis (34). The composite measure's stronger correlation with most criterion standards conforms to a fundamental theorem of measurement theory

$$\operatorname{Corr}(x, y) \le \sqrt{\operatorname{rel}(x) \times \operatorname{rel}(y)}$$

according to which the correlation between 2 variables, x and y, is limited by the square root of the product of each variable's reliability (16). However, there were 2 comparisons that did not follow this rule: The M-HAQ correlated equally strongly as the latent variable with the impairments; and the Steinbrocker class correlated more strongly with the number of deformities than did any of the other disability scales, including the latent variable. The reason for this may be that examiners incorporated findings from the joint exam into their judgment of the Steinbrocker class. In contrast, the M-HAQ and the SF-36 are self-reported scales that patients answer according to their own perceived condition.

We also used 3 performance-based measures of functional limitation: grip strength, walking velocity, and timed button test. Within the disablement process framework, these measurements are closer to the physical disability construct than are the joint impairments (18,32,33) and, consequently, their degree of correlation with the disability scales was stronger.
Here, even more so than with the impairments, the latent variable's association with the performance-based measures was stronger than that of any of the 3 primary physical disability scales considered individually.

Work loss is one of the main adverse consequences of RA (35). We found that work status was strongly associated with the 4 physical disability scales with a tendency for the association to be stronger for the latent variable. Likewise, death displayed a similar pattern of association. One of the main uses of these comparison standards, occupational and vital status, is as anchors that researchers or clinicians can use to interpret the values along the latent variable scale. As shown in Figure 4, there are strong adverse outcomes associated with lower values for the latent variable.

The physical disability scales we used in the present analyses, including the latent variable we developed, often find use in multivariate models, either as outcomes or predictors. We thus compared the fit of models that included the different physical disability scales as independent variables and each of the different criterion variables as outcomes. Because each of the criteria we used can be heavily influenced by age and sex, we included these 2 variables as covariates in all of the multivariate models. We chose the BIC as a comparative measure because it is a tool used often for model selection (30,36). We expected that the models that included the new latent variable would have smaller BICs, indicating better fit. Indeed this was the case with nearly all of the criterion variables.

Figure 3. Matrix plot showing the bivariate distribution of the 3 primary variables and the latent variable. The Pearson correlation coefficient between the latent variable and the modified Health Assessment Questionnaire was −0.87; between the latent variable and Short Form 36 physical function scale (SF-36PF) was 0.89; and between the latent variable and the Steinbrocker class was −0.85. All coefficients were significant at P ≤ 0.0001.

Table 2. Correlation between physical disability scales and variables measured as criterion standards*

Criterion | MHAQ | SF-36PF | Steinbrocker class | Latent variable
Pain | 0.59 | −0.53† | 0.41† | −0.59
Tender | 0.49 | −0.43 | 0.33† | −0.47
Swollen | 0.24 | −0.22 | 0.22 | −0.24
Deformity | 0.20† | −0.25† | 0.52† | −0.35
Grip | −0.54† | 0.52† | 0.48† | 0.59
Velocity | −0.61† | 0.65† | 0.67† | 0.72
Button | −0.54† | 0.55† | 0.60 | 0.64

* Pearson correlation coefficients were compared after Fisher z-transformation, after equalizing coefficient signs (28,29). MHAQ = Modified Health Assessment Questionnaire; SF-36PF = Short Form 36 physical functioning scale.
† Significance of comparisons versus latent variable was set at P ≤ 0.002.

The latent disability variable has distributional advantages over the 3 primary scales. Both the M-HAQ and the SF-36PF display skewed distributions (Figure 2), the former displaying a ceiling effect, the latter, a floor effect (37). The latent scale lacks skewness in either direction, more closely approximating normality than any of the primary scales. Moreover, the latent variable displays an interval or near-interval distribution, as suggested by the monotonic rise in criterion variables as the scale increases (Figure 4).

The latent variable has theoretical advantages as well: Physical disability is a hypothetical construct and claims that any one disability measurement scale is superior to others are debatable. Using more than one measurement tool may be a more accurate way to get at the underlying construct because it enables the unmeasured construct to be assessed from a variety of angles.
For this same reason, the idea of using both a scale intended specifically for arthritis and one intended for unselected populations (9,15) is quite attractive, because the arthritis-specific scale, the M-HAQ in our study, will capture the arthritis-relevant outcomes whereas the generic scale, here provided by the SF-36PF, will capture an overall nonspecific disease impact.

We acknowledge some limitations of our analysis. Factor analysis assumes that data are distributed on interval, multivariate normal scales, an assumption that may not be stringently met by the 3 disability scales we entered into the factor analysis. However, this assumption is a strict requirement only if statistical inference is used to determine the number of factors and can be relaxed when factor analysis is used descriptively (26,38). The least squares factor extraction method we used is also robust to deviations from normality (39). The M-HAQ and SF-36PF scales we used were developed using sound psychometric theory to produce results on interval or near-interval scales, and they have each been used as such in numerous studies over many years. We used the composite scores of both these scales, scored as originally intended. It is possible, however, to select items from each of these scales and calibrate their weights so that they more closely approximate a true interval or ratio scale by using item response theory or Rasch analysis (40,41). This may represent an alternative method to accomplish the aims we pursued here.

Figure 4. Relationship between the latent variable measuring physical disability and the criterion measures deformed joint count (top left; trend P ≤ 0.001); walking velocity, grip strength, and timed button test (top right; trend P ≤ 0.001 for each variable); work disability and death within 6 years (bottom left; trend P ≤ 0.001 for each); and currently working (bottom right; trend P ≤ 0.001). Error bars represent standard error.

Table 3. Bayesian information criterion of multivariate models, according to physical disability scale used as independent variable*

Physical disability scale included as independent variable in multivariate model†

Dependent variable | MHAQ | SF-36PF | Steinbrocker class | Latent variable‡
Currently working | −4,429§ | −4,447¶ | −4,432§ | −4,451
Currently disabled | −4,288§ | −4,308 | −4,266§ | −4,303
Death within 6 years | −4,782# | −4,780§ | −4,775§ | −4,791
Pain | −1,588§ | −1,531§ | −1,414§ | −1,602
Tender joint count | 809# | 855§ | 933§ | 818
Swollen joint count | −8¶ | −5¶ | 7§ | −14
Deformed joint count | 662§ | 661§ | 519 | 607
Grip strength | 248§ | 261§ | 306§ | 170
Walking velocity | 1,654§ | 1,609§ | 1,660§ | 1,466
Timed button test | −7,501§ | −7,465§ | −7,478§ | −7,597

* Values shown are Bayesian information criteria. MHAQ = Modified Health Assessment Questionnaire; SF-36PF = Short Form 36 physical functioning scale.
† Model's form was y = age + sex + physical disability scale, where y = dependent variable. For current working, currently disabled, and death by 6 years, the model was logistic; for other variables, model was ordinary least squares.
‡ Extracted from a principal component factor analysis of MHAQ, SF-36PF and Steinbrocker class (Figure 1).
§ Very strong support for model that includes the latent variable.
¶ Positive support for model that includes the latent variable.
# Strong support for model that includes the latent variable.

Data parsimony is a desirable feature in a research study; among other reasons, because it avoids the problems we mentioned in the beginning of this article.
In the present analysis, we have reduced the original 3 scales into 1 single variable that in many respects outperforms the individual primary scales. A similar data reduction strategy could be used for other RA processes, such as inflammatory disease activity, disease damage, joint impairment, and functional limitation (32,33). For example, a latent variable extracted from the disease activity measures recommended for RA clinical trials (42) could potentially lead to more efficient trials if the latent variable outperforms the primary scales, as was the case for disability measures in the present analysis.

It is important to point out that ours is a data-driven approach, and that the latent variable cannot be fully specified as an outcome measure in advance of a study. We do not advise investigators to attempt to directly apply the factor loadings we estimated here to develop a latent disability variable for use in their own studies, because data from another patient sample could be quite different. Moreover, investigators may have reasons to choose a different set of primary disability scales from those used here. We do believe, however, that researchers can apply a principal component factor analysis, similar to that shown here, to their own data to extract a latent variable that will likely exceed the primary scales in reliability.

In conclusion, we have used factor analysis to derive a latent variable that measures physical disability in RA. The new variable outperforms the primary scales in a number of tests of association with comparison criterion standards. This approach may be used to develop latent variables measuring other RA disease components, such as disease activity, damage, and functional limitation.

ACKNOWLEDGMENT

We acknowledge the invaluable assistance of Florencia Salazar and Samvel Pogosian, MD, in the conduct of the ÓRALE study. We also thank Drs.
Ramon Arroyo, Daniel Battafarano, Rita Cuevas, Alex de Jesus, Michael Fischbach, John Huff, Rodolfo Molina, Mathew Mosbacker, Frederick Murphy, Carlos Orces, Christopher Parker, Thomas Rennie, Jon Russell, Joel Rutstein, and James Wild for giving us permission to study their patients and for contributing to this study.

REFERENCES

1. Guillemin F. Functional disability and quality of life assessment in clinical practice. Rheumatology 2000;39 Suppl 1:17–23.
2. Wolfe F. The determination and measurement of functional disability in rheumatoid arthritis. Arthritis Research 2002;4 Suppl 2:S11–5.
3. Bollen KA. Latent variables in psychology and the social sciences. Annu Rev Psychol 2002;53:605–34.
4. Liang MH. Self-reported measures of pain, function and health status. In: Gall EP, Gibofsky A, editors. Rheumatoid arthritis: clinical tools for outcome assessment. Atlanta (GA): Arthritis Foundation; 1994. p. 35–41.
5. Meenan RF, Gertman PM, Mason JH. Measuring health status in arthritis: the Arthritis Impact Measurement Scales. Arthritis Rheum 1980;23:146–52.
6. Fries JF, Spitz P, Kraines RG, Holman HR. Measurement of patient outcome in arthritis. Arthritis Rheum 1980;23:137–45.
7. Tugwell P, Bombardier C, Buchanan WW, Goldsmith CH, Grace E. The MACTAR questionnaire: an individualized functional priority approach for assessing improvement in physical disability in clinical trials in rheumatoid arthritis. J Rheumatol 1987;14:446–51.
8. Ware JE. SF-36 health survey: manual and interpretation guide. Boston: Nimrod Press; 1993. p. 321–2.
9. Hawker G, Melfi C, Paul J, Green R, Bombardier C. Comparison of a generic (SF-36) and a disease specific (WOMAC) instrument in the measurement of outcomes after knee replacement surgery. J Rheumatol 1995;22:1193–6.
10. Patrick DL, Deyo RA. Generic and disease-specific measures in assessing health status and quality of life. Med Care 1989;27(3 Suppl):S217–32.
11. Ortiz Z, Shea B, Garcia Dieguez M, Boers M, Tugwell P, Bonen A, et al. The responsiveness of generic quality of life instruments in rheumatic diseases: a systematic review of randomized controlled trials. J Rheumatol 1999;26:210–6.
12. Wells G, Boers M, Shea B, Tugwell P, Westhovens R, Suarez-Almazor M, et al. Sensitivity to change of generic quality of life instruments in patients with rheumatoid arthritis: preliminary findings in the generic health OMERACT study. J Rheumatol 1999;26:217–21.
13. Fries JF, Ramey DR. "Arthritis specific" global health analog scales assess "generic" health related quality of life in patients with rheumatoid arthritis. J Rheumatol 1997;24:1697–702.
14. Hagen BH, Smestad LM, Uhlig T, Kvien TK. The responsiveness of health status measures in patients with rheumatoid arthritis: comparison of disease-specific and generic instruments. J Rheumatol 1999;26:1474–80.
15. Scott DL, Garrood T. Quality of life measures: use and abuse. Baillieres Best Pract Res Clin Rheumatol 2000;14:663–87.
16. Crocker L, Algina J. Introduction to classical and modern test theory. New York: Holt, Rinehart & Winston; 1986.
17. Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF, Cooper NS, et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum 1988;31:315–24.
18. Escalante A, del Rincón I. How much disability in rheumatoid arthritis is explained by rheumatoid arthritis? Arthritis Rheum 1999;42:1712–21.
19. Escalante A. What do self-administered joint counts tell us about patients with rheumatoid arthritis? Arthritis Care Res 1998;11:280–90.
20. Escalante A, del Rincón I, Mulrow CD. Symptoms of depression and psychological distress in Hispanics with rheumatoid arthritis. Arthritis Care Res 2000;13:156–67.
21. Del Rincón I, Battafarano DF, Arroyo RA, Murphy FT, Escalante A. Heterogeneity between men and women in the influence of the HLA-DRB1 shared epitope on the clinical expression of rheumatoid arthritis. Arthritis Rheum 2002;46:1480–8.
22. Orces CH, del Rincón I, Abel MP, Escalante A. The number of deformed joints as a surrogate measure of damage in rheumatoid arthritis. Arthritis Rheum 2002;47:67–72.
23. Escalante A, Galarza-Delgado D, Beardmore TD, Baethge BA, Esquivel-Valerio J, Marines AL, et al. Cross-cultural adaptation of a brief outcome questionnaire for Spanish-speaking arthritis patients. Arthritis Rheum 1996;39:93–100.
24. Pincus T, Summey JA, Soraci SA Jr, Wallston KA, Hummon NP. Assessment of patient satisfaction in activities of daily living using a modified Stanford Health Assessment Questionnaire. Arthritis Rheum 1983;26:1346–53.
25. Steinbrocker O, Traeger CH, Batterman RC. Therapeutic criteria for rheumatoid arthritis. JAMA 1949;140:659–66.
26. Norman GR, Streiner DL. Principal component and factor analysis. In: Norman GR, Streiner DL. Biostatistics: the bare essentials. Hamilton (ON): BC Decker; 2000. p. 163–77.
27. Daly LE, Bourke GJ, McGilvray J. Interpretation and uses of medical statistics. Oxford (UK): Blackwell Scientific; 1991.
28. Meng X-L, Rosenthal R, Rubin DB. Comparing correlated correlation coefficients. Psychol Bull 1992;111:172–5.
29. Goldstein R. Testing dependent correlation coefficients. Stata Tech Bull Reprints (STB32) 1997;6:128–9.
30. Raftery AE. Bayesian model selection in social research. In: Marsden PV, editor. Sociological methodology. Cambridge (MA): Blackwell; 1995. p. 111–95.
31. Dixon JK. Grouping techniques. In: Munro BH, editor. Statistical methods in health care research. Philadelphia: Lippincott; 1997. p. 310–41.
32. Verbrugge LM, Jette AM. The disablement process. Soc Sci Med 1994;38:1–14.
33. Escalante A, del Rincón I. The disablement process in rheumatoid arthritis. Arthritis Rheum 2002;47:333–42.
34. Harrell FE Jr. Regression modeling strategies with applications to linear models, logistic regression, and survival analysis. New York: Springer-Verlag; 2001.
35. Yelin E, Henke C, Epstein W. The work dynamics of the person with rheumatoid arthritis. Arthritis Rheum 1987;30:507–12.
36. Zucchini W. An introduction to model selection. J Math Psychol 2000;44:41–61.
37. Stucki G, Stucki S, Brühlmann P, Michel BA. Ceiling effects of the health assessment questionnaire and its modified version in some ambulatory rheumatoid arthritis patients. Ann Rheum Dis 1995;54:461–5.
38. Tabachnick BG, Fidell LS. Principal component and factor analysis. In: Using multivariate statistics. Needham Heights (MA): Allyn & Bacon; 2001. p. 582–652.
39. Floyd FJ, Widaman KF. Factor analysis in the development and refinement of clinical assessment instruments. Psychol Assess 1995;7:286–99.
40. Tennant A, Hillman M, Fear J, Pickering A, Chamberlain MA. Are we making the most of the Stanford Health Assessment Questionnaire? Br J Rheumatol 1996;35:574–8.
41. Wolfe F. Which HAQ is best: a comparison of the HAQ, MHAQ, and RA-HAQ, a difficult 8-item HAQ (DHAQ), and a rescored 20 item HAQ (HAQ20): analyses in 2491 RA patients following leflunomide initiation. J Rheumatol 2001;28:982–9.
42. Felson DT, Anderson JJ, Boers M, Bombardier C, Furst D, Goldsmith C, et al. American College of Rheumatology preliminary definition of improvement in rheumatoid arthritis. Arthritis Rheum 1995;38:727–35.