REVIEW

MEASUREMENT OF FUNCTIONAL STATUS, QUALITY OF LIFE, AND UTILITY IN RHEUMATOID ARTHRITIS

MARY J. BELL, CLAIRE BOMBARDIER, and PETER TUGWELL

Rheumatoid arthritis (RA), a chronic inflammatory disease of unknown etiology, has no known cure and may run a protracted course. Traditional epidemiologic measures of disease outcome, such as death or resolution of illness, reflect only the extremes of outcomes of RA. Recognition of the impact of this chronic disease on the individual has led to the development of quality of life instruments, which measure more relevant health outcomes such as physical, mental, and social well-being and reflect the management goals of preservation of function and limitation of symptoms (1). In addition, these instruments track changes in health status over time, prognostically stratify patients, and assist in the targeting of interventions. More recently, utility measures have been used to combine the risks and benefits of an intervention into a single measure.

We review here some representative instruments that have been used in assessments of patients with RA. The review follows the clinical questions outlined in Table 1.

From the Department of Clinical Epidemiology and Biostatistics and the Department of Medicine, McMaster University, Hamilton, Ontario; and the Department of Medicine and Health Administration, University of Toronto, Toronto, Ontario, Canada.

Mary J. Bell, MD, FRCPC: Assistant Professor, Department of Clinical Epidemiology and Biostatistics and Department of Medicine, McMaster University; Claire Bombardier, MD, FRCPC: Associate Professor, Department of Medicine and Health Administration, University of Toronto; Peter Tugwell, MD, MSc, FRCPC: Professor, Department of Clinical Epidemiology and Biostatistics and Department of Medicine, McMaster University.

Address reprint requests to Mary J. Bell, MD, Suite 804, 25 Charlton Avenue East, Hamilton, Ontario L8N 1Y2, Canada.
Submitted for publication December 28, 1988; accepted in revised form October 23, 1989.

Arthritis and Rheumatism, Vol. 33, No. 4 (April 1990)

Is the measurement of quality of life something new?

Interest in the development of a functional index began in the 1930s. The knowledge that physiologic improvements were not always reflected by increased functional ability resulted in the creation of a few simple functional scales, such as the American Rheumatism Association (ARA) functional capacity classification, the New York Heart Association scale, and the Karnofsky Index. These were designed primarily for broad application and rapid classification of a patient population. Although they have been useful in detecting large changes in an individual patient, they have not been as useful in detecting the small but clinically important improvements that result from most therapeutic interventions used in rheumatology.

Concern about quality of medical care in the 1960s led to a more comprehensive evaluation of the impact of clinical services. Scales were devised that incorporated longevity, symptoms, presence of physiologic abnormalities, functional status, psychosocial factors, compliance, and satisfaction. Driven by the need for program evaluation and policy making, wider applicability of these measures was sought. "General health status" instruments that could be used in community, general patient, and specific disease populations were developed.

Both clinical rehabilitation and health service research in the 1980s further advanced the science of quality of life and utility measurement. These tools allow one to measure the end product of the health care system as it concerns the patient. From this base, more comprehensive, disease-specific health status and utility measures have been created (2).

Table 1. Clinician's questions
1. Is the measurement of quality of life something new?
2. Why do I need to know about these measures?
3. What is available?
4. How could I use them?
5. How well do these instruments perform when their measurement characteristics are tested?
6. How should I evaluate any new quality of life measure?
7. Is the state of the art sufficiently advanced that drug approval agencies should require quality of life data before approval?
8. What will the future bring in the way of further development of these measures?

Why do I need to know about these measures?

Functional status, quality of life, and utility measures have developed out of a need to define to a greater extent the effects of health care interventions and the impact of disease on the individual as well as groups of individuals with chronic disease. These instruments have been shown to be conceptually relevant, reliable, valid, and as responsive to change as traditional outcome measures.

Prospective evaluation of the patient population by use of these measures has demonstrated that they may also have predictive value for determining eventual outcome. Sherrer et al studied a population of 75 RA patients over a 12-year period (3). They noted that the course of major disability in RA was established in the first few years of disease. The best predictors of disability included age, radiologic grade, sex, rheumatoid factor, nodules, and initial functional status. When a cross-sectional population study was conducted, it appeared that a tetrad composed of high early disability scores, female sex, older age, and abnormal radiographic findings at clinical presentation predicted the likelihood of developing significant disability (3).

More recently, Pincus et al reported the results of a prospective study of morbidity and mortality in 75 RA patients in which quantitative measures of functional capacity were documented at baseline and 9 years later (4).
There was >50% mortality at 5 years among patients who had severe dysfunction at baseline, as measured by questionnaire, grip strength, and modified walking time (time required to walk at a normal pace for 25 feet). Such a prognosis is comparable with that of patients with cardiovascular diseases and patients with malignant diseases. The higher relative risks associated with measures of poor functional capacity could not be explained by demographic, disease therapy, or comorbidity variables (4). These findings are consistent with those of Mitchell et al, who demonstrated that age and ARA functional class were the 2 most important predictors of mortality in the RA population (5).

These studies show that knowledge of the patient's baseline functional status allows the clinician to identify those patients with a poor prognosis. Thus, the data gathered using these instruments may provide clinically useful information regarding current RA status, utilization of health care services, and prediction of future outcome, such as mortality, that is not readily available through conventional testing (3-7).

What is available?

There is no single quality of life instrument that can be used to assess outcome in every situation. The instrument of choice depends on the purpose for which it will be used, the practicality of the instrument, and the population being studied. Recently, a classification scheme for health-related quality of life measures was proposed by Guyatt et al (8). It was designed to delineate the range of application and content of the various measures. The 2 major categories of this system are generic instruments and arthritis-specific instruments.

Generic instruments. Generic instruments have been developed to reflect the impact of ill health on the lives of people in a wide variety of populations. They cover function, disability, and distress.
Subcategories of the generic measures include health profiles, utility measures, and single-item self-rating health scales (8). As demonstrated in Tables 2 and 3, there are many different tools within each class of instrument. Representative examples from each class are summarized below.

Table 2. Content of generic and arthritis-specific questionnaires*
Columns: Physical health (mobility/physical activity; self-care; role activity); Social health (communication; social interaction; leisure group activity); Psychological health
Rows: Generic (health profiles): MHIQ, Rand HIS, SIP. Arthritis-specific: AIMS, ARA FC, FSI, HAQ, Katz, Lee, MACTAR/PET, TQ
Entries, in the order extracted: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 0 + + + + 0 + + + + + 0 + + + + + + + 0 0 + 0 0 0 0 0 0 + + 0 0 0 0 0 + 0/+ + 0 + + + + 0 0/+ + + 0
* MHIQ = McMaster Health Index Questionnaire; Rand HIS = Rand Health Insurance Study; SIP = Sickness Impact Profile; AIMS = Arthritis Impact Measurement Scales; ARA FC = American Rheumatism Association Functional Classification; FSI = Functional Status Index; HAQ = Health Assessment Questionnaire; Katz = Katz's Activities of Daily Living Instrument; Lee = Lee's Functional Status Instrument; MACTAR/PET = McMaster-Toronto Arthritis Patient Preference Disability Questionnaire/Problem Elicitation Technique; TQ = Toronto Questionnaire; + = property tested; 0 = property not tested.

Table 3. Measurement characteristics of generic and arthritis-specific questionnaires*
Columns: Instrument; Reliability; Validity; Mode of administration; Administration time (minutes)
Rows: Generic, health profiles: MHIQ, Rand HIS, SIP. Generic, utility measures: IWB, PUMS, Rating scale, SG and rating chart, TTO, WTP. Arthritis-specific: ACTRE, AIMS, ARA FC, FSI, HAQ, Katz, Lee, MACTAR/PET, TQ
Reliability and validity entries, in the order extracted: + -+ f + ? ? + + + ? + + + + + + + + ? + + + + + 7 + + + + + + + + + + +
Mode of administration and administration time entries, in the order extracted: Interviewer Self-report Interviewer/self-report 20-40 Interviewer Interviewer Interviewer Interviewer ? 30 10 30-60 Interviewer Interviewer 30-60 ? Self-report Self-report Clinical judgement Interviewer Interviewer/self-report Clinical self-report Self-report Interviewer Interviewer 2a Minimal 60-90 10 Short Short 10-20 15-20 60 2&30 ?
* IWB = Index of Well Being; PUMS = Patient Utility Measurement Set; SG = standard gamble; TTO = time trade-off; WTP = willingness to pay; ACTRE = National Institutes of Health Activity Record; ? = property unknown. See Table 2 for other definitions.

Health profiles. Health profiles are single instruments that measure different aspects of quality of life in a wide variety of conditions. A scoring system permits aggregation of the collected information into a score or index. Such instruments allow for assessment of the effects of an intervention on many aspects of quality of life through the use of only 1 instrument. The primary disadvantage of this method of determining quality of life is its potential insensitivity to disease-specific, clinically important change (8). Commonly used health profiles include the McMaster Health Index Questionnaire (MHIQ), the Rand Health Insurance Study (HIS), and the Sickness Impact Profile (SIP) (9-11).

The MHIQ is a 59-item, self-administered questionnaire that can be completed in 20 minutes. It examines physical, social, and emotional function. Function scores are standardized to index values ranging from 0.0 (extremely poor function) to 1.0 (extremely good function). Retest reliability for the physical function score is 0.80 (12). MHIQ scores correlate well with global assessments by health professionals (13-15) or by relatives or friends (15) and with biologic parameters of disease severity (13,16).

The Rand HIS batteries provide a detailed assessment of the following 4 areas: social function, psychological function, physical function, and general health perceptions. Completion of this interviewer-administered questionnaire requires 60 minutes.
The Arthritis Impact Measurement Scales (AIMS) anxiety and depression scales evolved from the Rand HIS psychological measures (10).

The SIP, a behavior-based measure of health status, has been shown to be of use in the study of people with arthritis (17). It was conceived as a measure of functional performance of the individual patient, but it came to be used more as a health assessment tool for measuring health levels in a population. It has been used primarily as a health survey instrument, as well as an outcome measure for evaluating longitudinal treatment regimens and methods of delivering service. The focus of the SIP is dysfunction. Levels of positive function are not assessed. The SIP assesses 12 categories, which include ambulation, mobility, body care, movement, social interaction, communication, alertness behavior, sleep and rest, eating, work, home management, and recreation and pastimes. The questionnaire consists of 136 items, is self-administered, and takes 20 to 30 minutes to complete (11).

Utility instruments. Utility measures of health-related quality of life were derived from economic and decision theories. They provide us with a quantitative measure of the value or preference the patients attach to their overall health status relative to perfect health (score of 1) and death (score of 0). Accordingly, changes in utility as a result of a specific intervention reflect changes in the value of an individual's health status, a key component in the evaluation of an intervention's usefulness in RA. They therefore reflect the trade-off between the risks and benefits of an intervention. The use of utility measures allows for a cost-utility analysis, which is advantageous to planners and policy makers. The acceptability, reliability, and validity of this approach have been documented in a range of disorders and patient groups,
but to date, these measures have been used only once among patients participating in a placebo-controlled, randomized trial of auranofin (18-20).

There are 2 approaches to utility measurement. The first approach is to classify patients into categories based on their responses to a number of questions about their function. The Quality of Well-Being (QWB), also known as the Index of Well-Being (IWB), and the Health Status Index (HSI) use this approach. The QWB determines what the patient did or did not do because of health reasons within 3 areas: mobility, physical activity, and social activity. It places health, disability, and death on a single scale. Each area has 4 or 5 levels of performance. Therefore, patients are classified as belonging to 1 of 43 possible combinations of levels. Each state of health has been valued using preference measurement techniques obtained from the activity profiles of normal individuals in the general population. These weights have also been evaluated in RA patients (21). The preference value is then assigned to each health state and modified by the presence or absence of a standard list of similarly valued symptoms. The overall QWB places patients on a utility scale that ranges from 0 (dead) to 1 (healthy).

This approach to the measurement of health status has been used in a variety of conditions for health care planning. The QWB has been demonstrated to be a useful tool for measuring both the change in RA patient status and the cost-utility of an intervention such as total joint replacement (22). The QWB has also influenced the design of other global scales such as the Rand HIS and the disease-specific AIMS.

The second approach to utility measurement is to ask patients to make a single rating of all aspects of their quality of life. Some representative tools using this approach include rating scales, the standard gamble (SG) method, time trade-off (TTO) techniques, and willingness to pay (WTP) (18-20,23,24).
A rating scale consists of a line on a page with clearly identified end points. The most preferred health state is identified at one end and the least preferred at the other. Remaining health states are placed on the line between these 2 end points in order of preference. The intervals between placements correspond to differences in preference as perceived by the subject (18,19).

The SG method is a paired comparison in which the subject must choose between 2 alternatives with varying probabilities of occurrence (23).

The TTO technique is valid only for health states that are preferred to death. It is simpler than the standard gamble. It is also a paired comparison in which the subject must choose between 2 alternatives, such as a shorter healthy life versus a longer chronic disease state. This instrument is based on the theory that the less desirable the health state, the greater the amount of life the subject will trade off in order to be free of the undesirable health state. When the point of indifference is found, the utility for the health state can be calculated (18,19).

The WTP questionnaire has been used to evaluate health gains. Patients with RA have been asked how much of their income they would pay and how large a mortal risk they would accept to achieve a hypothetical cure. In one study, the majority of patients were willing to pay 22% of their household income for a cure (24).

Utility measures are potentially useful in rheumatology to help determine the acceptable risk of drug treatment. Factors that determine if a risk is acceptable include the seriousness of the risk, the benefit that the drug is expected to produce, and the available alternative treatments.
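The time trade-off and standard gamble calculations described above can be illustrated with a minimal sketch. It assumes the conventional formulation for chronic states that are preferred to death: if a subject is indifferent between x years in full health and t years in the chronic state, the utility of that state is x/t; and if the subject is indifferent between the chronic state for certain and a gamble offering full health with probability p (death otherwise), the utility is p. The function and parameter names are hypothetical and are not taken from any of the instruments reviewed here.

```python
def time_tradeoff_utility(healthy_years: float, chronic_years: float) -> float:
    """Utility of a chronic state (preferred to death) at the point of indifference.

    Assumes the subject is indifferent between `healthy_years` in full health
    and `chronic_years` in the chronic state, with healthy_years <= chronic_years.
    """
    if chronic_years <= 0:
        raise ValueError("chronic_years must be positive")
    if not 0 <= healthy_years <= chronic_years:
        raise ValueError("healthy_years must lie between 0 and chronic_years")
    return healthy_years / chronic_years


def standard_gamble_utility(p_full_health: float) -> float:
    """Utility of a chronic state from a standard gamble.

    Assumes the subject is indifferent between the chronic state for certain
    and a gamble with probability `p_full_health` of full health (utility 1)
    and 1 - p_full_health of death (utility 0).
    """
    if not 0.0 <= p_full_health <= 1.0:
        raise ValueError("p_full_health must be a probability")
    return p_full_health


if __name__ == "__main__":
    # A subject who would give up 2.5 of 10 years to be free of the state.
    print(time_tradeoff_utility(healthy_years=7.5, chronic_years=10.0))
    # A subject who accepts at most a 20% risk of death for a cure.
    print(standard_gamble_utility(p_full_health=0.8))
```

Both functions return a value on the same 0 (death) to 1 (full health) scale used by the utility measures in this section, so the more life or the greater the risk a subject will give up, the lower the utility assigned to the health state.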
Torrance has compared 3 techniques with regard to ease of use in arthritis patients (reliability, criterion and construct validity, and cost), and has suggested that the best instrument for use in these patients is the rating scale, followed by the time trade-off and the standard gamble (19).

To date, the only published drug trial using utility measures is the auranofin multicenter trial of 1986 (20). The instruments used included the Patient Utility Measurement Set (PUMS), standard gamble, and willingness to pay. The PUMS is a detailed questionnaire administered by a trained interviewer. The patient is asked to compare "full health" with his current state of health and his recollected perception of his health state pretrial. Components of the instrument present sacrifice or risk in a variety of terms, including percent chance, months per year, and years of life. In the auranofin trial, the PUMS performed better in detecting statistically significant differences between treatment groups than did the 2 simpler utility questionnaires. The PUMS was the only one of the 3 utility instruments used that took into account the pretrial state. It was also the only preference instrument that had detailed and standardized instructions and method of administration for the interviewer. This may explain the difference in ability to demonstrate statistically significant differences between treatment groups. Further development and testing of these instruments for use in clinical trials in the rheumatic diseases is necessary (20).

The applicability of utility instruments to the clinical setting has yet to be determined. Certainly, a measure that combines risks and benefits of treatment into a single value, as the utility measure does, would theoretically be clinically useful. Utilities provide us with a common unit of health impact. They therefore permit cost-utility analysis among alternative programs with different effects.
This type of comparison may be used by people involved in health services resource allocation and policy making to determine the overall value of therapeutic interventions in the patient population as a whole.

Arthritis-specific instruments. Specific instruments provide more specialized information in a concise way. Arthritis-specific measures focus on aspects of health targeted toward arthritis and arthritis symptom complexes. Frequently used arthritis-specific measures of health status include the AIMS, the American Rheumatism Association functional class, the Functional Status Index (FSI), the Health Assessment Questionnaire (HAQ), the Lee functional status instrument (Lee), and the Toronto Questionnaire (TQ) (25-31).

The AIMS is a self-administered questionnaire that assesses physical, emotional, and social well-being. Within the questionnaire, information about the severity of disease, health perception, other significant illnesses, and sociodemographics is also gathered. In intervention studies, while correlations have been shown across dimensions, it appears that 3 dimensions (physical function, psychological function, and pain) warrant individual assessment (25). When used in tandem with conventional clinical outcome measures, the AIMS is able to detect clinically meaningful differences between drug-treatment groups.

The ARA functional classification system was developed by Steinbrocker in 1937 to classify the RA population for the purpose of research studies (26). Change in functional ability has been detected using this instrument in controlled trials of RA therapy. However, lack of sufficient detail of individual activities limits its use for serial quantification of function.

The FSI was developed as both a clinical and an evaluative tool to assess quality of life in geriatric arthritis patients in the community. Forty-five activities of daily living are assessed 3 times for ratings of dependence, difficulty, and pain.
The FSI has been shown to be valid and reliable, but details about sensitivity to clinically important change have yet to be established (27). Interviewers require 60-90 minutes to complete the administration of this questionnaire.

The HAQ is a self-administered questionnaire that measures 4 dimensions: disability, discomfort, drug side effects, and dollar costs (28,29). Although it was initially devised to assess functional capacity in RA, it has subsequently been demonstrated to be useful in a variety of other diseases (refs. 28 and 29, and Fries JF: unpublished observations). In addition to evaluating activities of daily living (ADL) components, the HAQ examines the level of difficulty the patient is experiencing with these ADL and the degree of assistance (from people or devices) required by the patient. The modified HAQ, an 8-question tool reduced from 20 questions in the parent instrument, incorporates the assessment of patient satisfaction. It has indicated that patient dissatisfaction may determine the number of patient visits to the physician and physician therapeutic intervention (32).

The Lee instrument measures both self-reported activity performance and degree of difficulty encountered with activity. This instrument produces statistically significant correlations with familiar outcome measures, such as joint count, grip strength, walking time, and functional class (33).

The TQ, first developed in 1979, assesses function in personal care, upper extremity activities, mobility, work, and leisure activities. Rehabilitation paramedical personnel and rheumatologists created the weighting scale for responses (31).

There are 2 newly developed tools designed to assess the quality of life of the individual arthritis patient: the McMaster-Toronto Arthritis Patient Preference Disability Questionnaire (MACTAR) and the National Institutes of Health Activity Record (ACTRE) (34-37).
The MACTAR consists of both a patient preference questionnaire with which patients rank their functional activities in order of importance, and a global question about improvement in arthritis status. By assessing mobility, self-care, work, and leisure, the MACTAR places its emphasis primarily on physical and social function. Clinically important change in function has been demonstrated by serial assessment of RA patients in a controlled trial (34). This questionnaire can be used only by a trained interviewer. We have developed a modification of the MACTAR, the Problem Elicitation Technique (PET), to incorporate the dimension of the level of difficulty or frequency for each identified patient problem into the MACTAR patient-generated problem list. The administration time is 10-20 minutes. Recognized limitations of the MACTAR include its unconventional method of calculating change and a current lack of knowledge of the reliability and stability of patient preferences during a stable functional period. It is possible that the addition of priority designations to existing, well-studied and standardized tools may provide the same information as the MACTAR without the additional cost in time and personnel required to perform the structured interview (38).

The ACTRE is a recently developed tool in which the patient documents his or her level of physical activity in half-hour blocks of time, for 2 consecutive weekdays (35-37). Activities are categorized as rest, physically active periods, sleep, and planning and preparation. Each activity is rated for the level of pain or fatigue, level of difficulty, and value to self and others. An index of physical activity may be calculated by totaling the number of rest periods and dividing this figure by the total hours of physical activity. The ratio reflects the balance of rest with activity.
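The ACTRE index described above can be sketched in a few lines. This is an illustration under stated assumptions, not the instrument's published scoring procedure: each recorded half-hour block carries one category label, every block labeled as rest is counted as one rest period (the text does not say whether consecutive rest blocks should be merged), and the label strings and function name are hypothetical.

```python
from collections import Counter
from typing import Iterable

BLOCK_HOURS = 0.5  # the ACTRE diary is kept in half-hour blocks


def activity_index(blocks: Iterable[str]) -> float:
    """Number of rest periods divided by total hours of physical activity.

    `blocks` is a sequence of category labels, one per half-hour block.
    Assumption: every block labeled "rest" counts as one rest period.
    """
    counts = Counter(blocks)
    active_hours = counts["active"] * BLOCK_HOURS
    if active_hours == 0:
        raise ValueError("no physically active blocks recorded")
    return counts["rest"] / active_hours


if __name__ == "__main__":
    # One invented 24-hour day: 48 half-hour blocks.
    day = ["sleep"] * 16 + ["active"] * 20 + ["rest"] * 5 + ["planning"] * 7
    print(activity_index(day))  # 5 rest periods over 10 active hours
```

Because the index is a ratio, it is undefined for a day with no recorded physical activity; the sketch raises an error in that case rather than returning a value.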
When used in a prospective, randomized trial of an occupational therapy educational intervention, the experimental group increased the activity index and achieved a better balance between rest and activity than the control group. This finding approached statistical significance (P = 0.07) in a small sample (35,37). Additional details of its measurement characteristics and usefulness both in clinical trials and practice settings are pending (36). Further development within the field of functional status evaluation is required.

How could I use them?

Although clinicians routinely evaluate the quality of life of their patients, this evaluation tends to be informal. Validated quality of life instruments have not, as yet, been used extensively in clinical practice. Potential reasons for limited use include concerns about the time and cost of administering these questionnaires, clinicians' inability to interpret the scores and use them clinically, and a belief that the information provided by such tools is available through conventional testing.

Information provided by quality of life instruments has been shown to be useful in clinical practice for determining prognosis. The data may also be used for the targeting of interventions. Intervention targeting is made possible by using the results of these measures to identify current patient problems, define needs, set therapeutic priorities, and direct interventions. For example, a patient with polyarthritis and an unstable, painful knee may identify stair climbing as an area of difficulty. He or she may be unable to independently climb stairs to the second level of his or her home, where the bedroom and bathroom are located. This may be reflected in low scores in the areas of independence, self-esteem, and functional level. Such traditional measures as the joint count would not necessarily identify the importance of the knee as the target for therapy.
Therapeutic objectives may include pain relief, stabilization of the affected knee with physiotherapeutic exercise and resultant muscle strengthening, external bracing or performance of a surgical procedure, and/or environmental adaptation. Continuous monitoring of functional status during the course of disease and therapy will assist in the evaluation of the effect of each specific intervention.

The clinical usefulness of 2 functional status instruments, the AIMS and the HAQ, has recently been described. Wolfe et al administered the patient function portion of the HAQ to a cohort of 400 RA clinic patients over a period of 3.1 years (39). The questionnaire was self-administered and sent to the patient population by mail every 6 months. Data on health care use and cost were obtained, with patient permission, from institutions and physicians. The patient function subsection of the HAQ correlated well with traditional clinical variables, psychological variables, and the erythrocyte sedimentation rate. This section reflected change over time, similar to the change reflected by the clinical variables. It was easy to administer, and took the patient approximately 13 minutes to complete. The scoring system was simple and brief. The provision of a classification description with which the scores could be compared made the test scores comprehensible and clinically useful.

The clinical use of information provided by the AIMS instrument has also undergone initial testing. Kazis et al have developed and tested an AIMS summary patient profile format (6). Two hundred rheumatologists and 200 allied health professionals evaluated 10 patient profiles. The profile was a computer-generated, single-page, health status summary report. Each patient profile consisted of the 9 health status scales, the mean score on each scale for the RA population, and comments highlighting key tasks with which the patient was having difficulty. The profile format was shown to be reliable and valid.
The profiles were understandable and permitted interpretation of health status across a spectrum of dysfunction. The impact of the knowledge of patient health status on clinical practice has yet to be determined. We await the results of a randomized controlled trial using this tool.

The usefulness of the results provided by quality of life instruments for the targeting of interventions is theoretically sound and is currently being tested. The assistance these measures provide in the determination of prognosis has been demonstrated and bodes well for their use in the future.

How well do these instruments perform when their measurement characteristics are tested?

The content strengths and weaknesses of the various arthritis-specific questionnaires and general health questionnaires are outlined in Tables 2 and 3. The AIMS and the SIP have the broadest coverage with regard to content. The format of the self-administered questionnaires (AIMS, HAQ, and SIP) is useful for the office practice. The patient could complete the questionnaire in the reception room while awaiting review. The requirement of an interviewer for the MACTAR and PET is a drawback for these individual measures. However, it may be worth the expense, with regard to human and monetary resources, to document the disability most important to the specific patient. Within the population of patients studied by use of these questionnaires, there has been good acceptability.

Comparative analysis of the AIMS, HAQ, QWB (IWB), and SIP with the standard FSI (7), and the Lee, MHIQ, and MACTAR with the joint count (40) using relative efficiencies (change relative to the standard measure) has demonstrated that they are all capable of detecting change. There are minor differences in their ability to detect this change in the different dimensions. Both the AIMS and the SIP measure functional change equally well, and the HAQ and SIP are better measures for determining change in social function than is the AIMS (7).
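The text describes relative efficiency only as change relative to the standard measure. The sketch below assumes one common formulation, the squared ratio of paired t statistics computed from each instrument's change scores in the same patients; whether the cited comparisons used exactly this definition is an assumption, and the function names and data are invented for illustration.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def paired_t(changes: Sequence[float]) -> float:
    """Paired t statistic for a set of within-patient change scores."""
    n = len(changes)
    if n < 2:
        raise ValueError("need at least 2 change scores")
    sd = stdev(changes)
    if sd == 0:
        raise ValueError("change scores have zero variance")
    return mean(changes) / (sd / math.sqrt(n))


def relative_efficiency(
    changes_instrument: Sequence[float],
    changes_standard: Sequence[float],
) -> float:
    """Squared ratio of t statistics: instrument versus standard measure.

    Values above 1 suggest the instrument detected change more efficiently
    than the standard in this sample; values below 1 suggest the reverse.
    """
    ratio = paired_t(changes_instrument) / paired_t(changes_standard)
    return ratio ** 2


if __name__ == "__main__":
    # Invented change scores for 5 patients on two instruments.
    standard = [1, 2, 3, 2, 2]
    candidate = [2, 2, 3, 2, 2]
    print(relative_efficiency(candidate, standard))
```

A design note: squaring the ratio makes the result insensitive to the direction in which each instrument is scored, which matters when improvement is a rising score on one questionnaire and a falling score on another.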
Until results are reported showing that one instrument is superior to the others for the individual patient, it would seem reasonable for the practicing rheumatologist who is following a group of office or clinic patients to select the instrument with the items that best reflect his or her own priorities.

How should I evaluate any new quality of life measure?

A list of methodologic considerations has been developed that can be used to compare and evaluate the usefulness of quality of life instruments (41). These guidelines have proven to be useful in surveying previous work and have helped identify research questions that remain to be addressed (Table 4).

Table 4. Methodologic considerations for quality of life instruments
1. Purpose
2. Comprehensiveness
3. Credibility
4. Accuracy
5. Sensitivity to change
6. Biologic sense
7. Feasibility

Purpose and comprehensiveness. When evaluating a new instrument, the first question the clinician should address is "Is the purpose of the instrument clearly stated?" The components being evaluated should be obvious, the target population defined, and the disease for which the measure was developed should be outlined. The second question is "Are all relevant dimensions of function included?" The content of the instrument should reflect those areas of interest to the clinician and the patients.

Credibility. In evaluating the credibility of the instrument, it is important to ask "Does the instrument possess face validity?" The method of measuring the outcome of interest should seem reasonable and possible. The authors should have attempted to show that the instrument is measuring what it is intended to measure. Questions should be specific and should refer to a designated time period. The recall time should also be appropriate to that of the age group being addressed.
“Do the questions assess level of ability at a particular point in time or change in ability over time?” When looking for the effects of an intervention on patient outcome, a measure of change over time is most appropriate. “Are the questions capacity-oriented or performance-oriented?” Capacity-oriented questions ask the patient about his or her ability to perform a task. Performance-oriented questions ask the individual whether he or she has actually completed a specific task. Performance questions are a more accurate reflection of an individual’s actual function. However, capacity questions may be more meaningful in practice. “Are the responses to individual questions aggregated into a summary score?” A summary score may allow the clinician to compare the individual patient with the overall patient population, as well as to numerically compare individual patient status over time. Interpretation of summary scores requires access to group means for the normal and diseased population. The last question relating to credibility is “Was this assessment tool developed and pretested for the purpose for which you want to use it?” Tools developed for the same purpose as your intended use will perform better for you.

Accuracy. To determine the accuracy of a measure, you must first ask “Are the results reproducible?” When evaluating the new instrument, one should look for results on test-retest and intraobserver-interobserver reliability. Next, you should ask “When compared to a more accurate measure, does it perform satisfactorily?” The clinician would be wise to determine if the data obtained with the new instrument have been compared with those obtained from an established, more accurate or similar instrument.

Sensitivity to change.
To determine this feature of a measure, one should ask “Is it sensitive enough to detect clinically important changes across patients and within patients?” To answer this question, it is necessary to know the results of empirical studies using the new instrument plus another instrument in patients with the disease in question and with established change over time.

Biologic sense. “Does it perform satisfactorily when compared with another similar assessment?” The instrument should be tested in conjunction with a more accurate measure of change, such as a change in clinical parameters.

Feasibility. The feasibility of using any new instrument in clinical practice may be determined by positive answers to the following questions. “Is the format for administration appropriate for your purpose?” “Is the time to administer the questionnaire appropriate for your purpose?” “Are the questions easy to understand and acceptable to the patients and the interviewers?”

These questions are offered as guidelines for the clinician who is reviewing a new or existing instrument and will assist him or her in choosing an instrument appropriate to his or her situation.

Is the state of the art sufficiently advanced that drug approval agencies should require quality of life data before approval?

Canadian and American drug approval agencies currently do not require assessment of the impact upon quality of life in the “pivotal” studies of arthritis patients that are used for registration of the drug. A global assessment of health status completed by both the patient and the clinician is required, but the only functional end point usually included is the Steinbrocker scale. This scale was designed in the 1930s, and it does not have any of the key psychometric properties, such as sensitivity (responsiveness) to change.
This major drawback has led to the revision of the Steinbrocker instrument by the American College of Rheumatology subcommittee on functional status and disease severity in RA. A strong argument can be made that the addition of a validated quality of life scale to all clinical trials would complement the anthropomorphic, clinical, and laboratory data. This was recommended by a number of investigators in the field at a recent conference (September 1988 in Washington, DC) sponsored by the Arthritis Foundation, which focused on the development of disease-modifying antirheumatic drugs. To clinicians and patients, health status is as important as, or more important than, the traditional, currently required end points. Evidence is accumulating that health status is as important as other end points in predicting long-term outcome, such as progression of disability and death. The end points currently in use tell only part of the story. For example, one of the end points used by most approval agencies is improvement in the joint count. What they fail to consider, however, is that improvement in 1 important joint that affects functional activity (such as the hip) may have a greater impact on the patient’s functional capacity and quality of life than would improvement in 8 of the less important joints. As described above, there are a number of both disease-specific and generic health questionnaire instruments that are as quantitatively accurate and useful as the traditional end points currently used. A further argument can be made that both disease-specific and generic transcondition questionnaires should be included, for the following reasons. General health status instruments are essential for comparisons across conditions, especially for the use of third-party payors.
Disease-specific instruments are important because generic instruments include items that are infrequent in or irrelevant to arthritis; these items add statistical noise and reduce sensitivity to change. The investigator setting up new studies should be free to select any of the validated instruments on the basis of the risk-response profile of his or her patients and the environment or the intervention. This addition should complement the anthropomorphic, clinical, and laboratory data.

What will the future bring in the way of further development of these measures?

Reviews of currently available instruments have led prominent methodologists such as Alvan Feinstein to identify some major shortcomings of the quality of life questionnaires. These include 1) lack of attention to the role of the patient in the effort or support of the performance being measured, 2) using the same scale for assessments that really require different types of expression, 3) assuming the suitability of a profile or an aggregated index for all situations, 4) incompletely evaluating the measurement characteristics of old and new measurement tools, and 5) applying the tool in a setting or for a purpose for which it was not designed (42). Feinstein and colleagues’ concern regarding the lack of attention to patient priorities and preferences has been addressed by the evolution of the MACTAR, PET, and ACTRE patient-specific questionnaires, and the PUMS, SG, TTO, and WTP utility questionnaires. In addition to these drawbacks, other shortcomings can be outlined. Items included in these questionnaires may show no impairment attributable to the disease or condition of interest. For example, if arthritis affects only the upper limbs, items that relate to the lower limbs, such as mobility, are not impaired, and there is therefore no room for improvement.
Also, the items that are present may not have the potential to improve with the intervention of interest. Disability resulting from severe joint damage without active inflammation will not respond to antiinflammatory or cytotoxic medication. Clinicians have also noted that the results from these instruments are not presented in a format that is understandable to both themselves and to the policy makers. Finally, the applicability of the results to the individual patient and in clinical practice is unclear. In the years to come, the predictors of good and bad functional responses will be determined. The use of alternative study designs for assessing the impact of therapy upon the quality of life may be necessary. Different instruments may have different strengths and weaknesses when used in single-subject studies, surveys, case-control studies, and controlled trials. In addition, the presentation of results requires reevaluation. Currently, drug development studies, even those with quality of life data, are oriented to aggregating efficiency data and to disaggregating toxicity data. The complement to each of these is the need for further quality of life studies. Approaches that allow the clinician to compare the beneficial with the detrimental effects of an intervention upon function need investigating.

Summary

In 40 years of development in the area of quality of life, the goal of applicability to the individual patient has not been accomplished. During the 1980s, we strived to improve the applicability of these instruments by refining disease-specific measures and developing patient-specific measures so that the sensitivity of these tools to clinically important change could be increased and comparative indices across conditions could be established. Finding the balance between brevity, reliability, and comprehensiveness will improve practicality.
The reliability of serial measurements using the various instruments in individual patients and in small groups of patients needs to be established. In the absence of a gold standard, validity will continue to be derived from testing new measures against accepted clinical measures. The ideal tool for use in clinical practice has not yet been developed. At this time, the clinician may choose among the many reliable and valid questionnaires assessing functional status, health status, and utility, according to his or her purpose. The information gathered from these instruments may help identify patients’ problems, set treatment priorities, direct interventions, monitor the longitudinal course of disease, and assist in program evaluation and policy planning.

REFERENCES

1. World Health Organization: The Ten Years of the World Health Organization. Geneva, 1958
2. Liang MH, Cullen KE, Larson MG: Measuring function and health status in rheumatic disease clinical trials. Clin Rheum Dis 9:531-539, 1983
3. Sherrer YS, Bloch DA, Mitchell DM, Roth SH, Wolfe F, Fries JF: Disability in rheumatoid arthritis: comparison of prognostic factors across three populations. J Rheumatol 14:705-709, 1987
4. Pincus T, Callahan LF, Vaughn WK: Questionnaire, walking time, and button test measures of functional capacity as predictive markers for mortality in rheumatoid arthritis. J Rheumatol 14:240-251, 1987
5. Mitchell DM, Spitz PW, Young DY, Bloch DA, McShane DJ, Fries JF: Survival, prognosis, and causes of death in rheumatoid arthritis. Arthritis Rheum 29:706-714, 1986
6. Kazis LE, Anderson JJ, Meenan RF: Health status information in clinical practice: the development and testing of patient profile reports. J Rheumatol 15:338-344, 1988
7. Liang MH, Larson MG, Cullen KE, Schwartz JA: Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis Rheum 28:542-547, 1985
8. Guyatt GH, van Zanten SJOV, Feeney DH, Patrick DL: Measuring quality of life in clinical trials: a taxonomy and review. Can Med Assoc J 140:1441-1448, 1989
9. Chambers LW: The McMaster Health Index Questionnaire: Assessment of Quality of Life in Clinical Trials of Cardiovascular Therapies. Edited by NK Wenger, ME Mattson, CD Furberg, J Elinson. New York, Lejacq Publishing, 1984
10. Brook RH, Ware JE Jr, Davies-Avery A, Stewart AL, Donald CA, Rogers WH, Williams KN, Johnston SA: Overview of adult health status measures fielded in Rand’s Health Insurance Study. Med Care 17 (suppl 7):1-131, 1979
11. Bergner M, Bobbitt RA, Pollard WE: The Sickness Impact Profile: validation of a health status measure. Med Care 14:57-67, 1976
12. Fortin F, Kerouac S: Validation of questionnaires on physical function. Nurs Res 26:128-135, 1977
13. Chambers LW, MacDonald LA, Tugwell P, Buchanan WW, Kraag G: The McMaster Health Index Questionnaire as a measure of quality of life for patients with rheumatoid disease. J Rheumatol 9:780-784, 1982
14. Chambers LW, Sackett DL, Goldsmith CH, MacPherson AS, McAuley RG: Development and application of an index of social function. Health Serv Res 11:430-441, 1976
15. Chambers LW: The McMaster Health Index Questionnaire (MHIQ): Methodologic Documentation and Report of Second Generation of Investigations. Hamilton, Ontario, McMaster University Department of Clinical Epidemiology and Biostatistics, 1982
16. Harper AC, Taylor DA, Chambers LW, Cino PM, Singer J: Physical and psychosocial disability in multiple sclerosis: an epidemiological survey of patients in a regional clinic. J Chronic Dis 39:305-310, 1986
17. Deyo RA, Inui TS, Leininger JO, Overman SS: Measuring functional outcomes in chronic disease: a comparison of traditional scales and a self-administered health status questionnaire in patients with rheumatoid arthritis. Med Care 21:180-192, 1983
18. Torrance G: Measurement of health status utilities for economic appraisal. J Health Econ 5:1-30, 1986
19. Torrance G: Utility approach to measuring health-related quality of life. J Chronic Dis 40:593-600, 1987
20. Bombardier C, Ware J, Russell IJ, Larson M, Chalmers A, Read JL: Auranofin therapy and quality of life in patients with rheumatoid arthritis. Am J Med 81:565-578, 1986
21. Balaban DJ, Sagi PV, Goldfarb NI, Nettler S: Weights for scoring the QWB instrument among RA patients: a comparison to general population weights. Med Care 24:973-980, 1986
22. Liang MH, Cullen KE, Larson MG, Thompson MS, Schwartz JA, Fossel AH, Roberts WN, Sledge CB: Cost-effectiveness of total joint arthroplasty in osteoarthritis. Arthritis Rheum 29:937-943, 1986
23. Raiffa H: Decision Analysis: Introductory Lectures on Choices under Uncertainty. Reading, MA, Addison-Wesley, 1968
24. Thompson MS: Willingness to pay and accept risks to cure chronic disease. Am J Public Health 76:392-396, 1986
25. Meenan RF, Gertman PM, Mason JM: Measuring health status in arthritis: the Arthritis Impact Measurement Scales. Arthritis Rheum 23:146-152, 1980
26. Steinbrocker O, Traeger CH, Batterman RC: Therapeutic criteria in rheumatoid arthritis. JAMA 140:659-662, 1949
27. Jette AM: Functional capacity evaluation: an empirical approach. Arch Phys Med Rehabil 61:85-89, 1980
28. Fries JF, Spitz PW, Young DY: The dimensions of health outcomes: the Health Assessment Questionnaire, Disability and Pain Scales. J Rheumatol 9:789-793, 1982
29. Fries JF, Spitz P, Kraines RG, Holman HR: Measurement of patient outcome in arthritis. Arthritis Rheum 23:137-145, 1980
30. Lee P, Jasani MK, Dick WC, Buchanan WW: Evaluation of a functional index in rheumatoid arthritis. Scand J Rheumatol 2:71-77, 1973
31. Helewa A, Goldsmith CH, Smythe HA: Independent measurement of functional capacity in rheumatoid arthritis. J Rheumatol 9:794-797, 1982
32. Pincus T, Summey JA, Soraci SA Jr, Wallston KA, Hummon NP: Assessment of patient satisfaction in activities of daily living using a modified Stanford Health Assessment Questionnaire. Arthritis Rheum 26:1346-1353, 1983
33. Liang MH, Jette AM: Measuring functional ability in chronic arthritis: a critical review. Arthritis Rheum 24:80-86, 1981
34. Tugwell P, Bombardier C, Buchanan WW, Goldsmith CH, Grace E, Hanna B: The MACTAR patient preference disability questionnaire: an individualized functional priority approach for assessment of improvement in physical disability in clinical trials in rheumatoid arthritis. J Rheumatol 14:446-451, 1987
35. Gerber L, Furst G, Shulman B, Smith C, Thornton B, Lian M, Cullen K, Stevens MB, Gilbert N: Patient education program to teach energy conservation behaviors to patients with rheumatoid arthritis: a pilot study. Arch Phys Med Rehabil 68:442-445, 1987
36. Gerber L, Furst G: Quantitative measurement of pain and fatigue associated with routines of life activities (abstract). Arthritis Rheum 33 (suppl 4):S149, 1989
37. Furst GP, Gerber HL, Smith CC, Fisher S, Shulman B: A program for improving energy conservation behavior in adults with rheumatoid arthritis. Am J Occup Ther 41:102-111, 1987
38. Meenan RF, Pincus T: The status of patient status measures. J Rheumatol 14:411-414, 1987
39. Wolfe F, Kleinheksel SM, Cathey MA, Hawley DJ, Spitz PW, Fries JF: The clinical value of the Stanford Health Assessment Questionnaire Functional Disability Index in patients with rheumatoid arthritis. J Rheumatol 15:1480-1488, 1988
40. Tugwell P, Bombardier C, Buchanan WW, Goldsmith CH, Grace E, Bennett KJ, Williams HJ, Egger M, Alarcon GS, Guttadauria M, Yarboro C, Polisson RP, Szydlo L, Luggen ME, Billingsley LM, Ward JR, Marks C: Methotrexate in rheumatoid arthritis: impact on quality of life assessed by traditional standard-item and individualized patient preference health status questionnaires. Arch Intern Med 150:59-62, 1990
41. Bombardier C, Tugwell P: Methodologic considerations in functional assessment. J Rheumatol 14 (suppl):6-10, 1987
42. Feinstein AR, Josephy BR, Wells CK: Scientific and clinical problems in indexes of functional disability. Ann Intern Med 105:413-420, 1986