ARTHRITIS & RHEUMATISM
Vol. 52, No. 12, December 2005, pp 3860–3867
DOI 10.1002/art.21493
© 2005, American College of Rheumatology

Reliability and Sensitivity to Change of the OMERACT Rheumatoid Arthritis Magnetic Resonance Imaging Score in a Multireader, Longitudinal Setting

Espen A. Haavardsholm,1 Mikkel Østergaard,2 Bo J. Ejbjerg,2 Nils P. Kvan,1 Till A. Uhlig,1 Finn G. Lilleås,1 and Tore K. Kvien1

Objective. To assess the intra- and interreader reliability and the sensitivity to change of the Outcome Measures in Rheumatology Clinical Trials (OMERACT) Rheumatoid Arthritis Magnetic Resonance Imaging Score (RAMRIS) system on digital images of the wrist joints of patients with early or established rheumatoid arthritis (RA).

Methods. Ten sets of baseline and 1-year followup MR images of the wrists of patients with progressive changes on conventional hand radiographs were scored independently by 4 readers on 2 consecutive days, preceded by reader training and calibration. The MR images were acquired and scored according to the recommendations from the OMERACT MRI group. The intra- and interreader agreement (evaluated by intraclass correlation coefficients [ICCs]) and the sensitivity to change (evaluated by the smallest detectable difference [SDD]) were determined for scores of synovitis, erosion, and bone marrow edema status and for change scores.

Results. Intrareader ICCs were generally very high, both for status scores (median baseline and followup 0.89 and 0.90 for synovitis, 0.91 and 0.90 for erosion, and 0.90 and 0.98 for edema) and for change scores (median 0.80 for synovitis, 0.96 for erosion, and 0.97 for edema). The SDDs were generally low, suggesting a high potential to detect changes. Interreader single-measure ICCs were high for status scores (mean baseline and followup 0.69 and 0.78 for synovitis, 0.83 and 0.73 for erosion, and 0.79 and 0.95 for edema) and for change scores (mean 0.74 for synovitis, 0.67 for erosion, and 0.95 for edema). The average-measure ICCs were ≥0.94 for all components of both the status scores and change scores.

Conclusion. The RAMRIS showed very good intrareader reliability, good interreader reliability, and a high level of sensitivity to change. The results suggest that the RAMRIS may be a suitable system for use in monitoring joint inflammation and destruction in RA.

Supported in part by The Research Council of Norway, The Norwegian Rheumatism Association, The Norwegian Women Public Health Association, Grethe Harbitz Legacy, and Marie and Else Mustad’s Legacy.

1 Espen A. Haavardsholm, MD, Nils P. Kvan, MD, Till A. Uhlig, MD, PhD, Finn G. Lilleås, MD, Tore K. Kvien, MD, PhD: Diakonhjemmet Hospital, Oslo, Norway; 2 Mikkel Østergaard, MD, PhD, DMSc, Bo J. Ejbjerg, MD, PhD: Copenhagen University Hospitals at Hvidovre and Herlev, Copenhagen, Denmark.

Address correspondence and reprint requests to Espen A. Haavardsholm, MD, Department of Rheumatology, Diakonhjemmet Hospital, Box 23 Vinderen, N-0319 Oslo, Norway. E-mail: firstname.lastname@example.org.

Submitted for publication March 24, 2005; accepted in revised form September 13, 2005.

Rheumatoid arthritis (RA) is a chronic, multisystem inflammatory disease with a variable disease course. Joint damage, as visualized on conventional radiographs, is a key end point in RA (1,2), but may be a late manifestation of the disease (3–5). Thus, more sensitive imaging techniques are desirable (6,7).

Magnetic resonance imaging (MRI) is a noninvasive tomographic imaging technique that can produce cross-sectional images in any plane, without morphologic distortion or magnification. The projectional superimposition that is a problem with conventional radiography can be avoided with MRI because of its multiplanar capabilities. MRI is the only noninvasive technique that allows simultaneous examination of all components of the diarthrodial joint, including soft tissues, articular cartilage, and bone, without ionizing radiation and adverse effects. Thus, the broad applications of MRI give it large potential as an outcome measure in RA. Whereas conventional radiography visualizes the structural changes that are a cumulative result of preceding disease activity, MRI allows direct visualization and assessment of synovitis, the primary lesion in RA. MRI also allows assessment of bone marrow edema, a frequent feature in both early and established RA that is a predictor of future radiographic damage and functional outcome (8–11). MRI is reported to detect RA erosive change with greater sensitivity than conventional radiography, and to document changes in structural damage over a shorter period of time (7,12–15), but formal studies addressing its sensitivity to change are scarce.

The Outcome Measures in Rheumatology Clinical Trials (OMERACT) Rheumatoid Arthritis Magnetic Resonance Imaging Score (RAMRIS) system was developed to evaluate inflammatory and destructive changes in the hands and wrists of patients with RA, and was endorsed at the sixth OMERACT meeting as a useful framework for further development of MRI assessment of RA (16). It was suggested that the system be used as a standard comparator for new/alternative MRI methods of RA assessment, and further testing in longitudinal studies was encouraged (17). Few studies have previously examined the intra- and interreader reliability of the RAMRIS system (18,19). Only one study has examined the reliability of change scores in the RAMRIS system from a longitudinal perspective (20), and no study has examined the reliability when utilizing modern digital systems for evaluation of the images.
The objective of the present study was to assess the intra- and interreader reliability and the sensitivity to change (responsiveness) of the OMERACT RAMRIS on digital images in a multireader, longitudinal setting in RA patients with either early or established disease.

PATIENTS AND METHODS

Patient and image selection. Sets of MR images (baseline and 1-year followup) of wrist joints from 10 patients with RA were evaluated independently by 4 readers at 2 time points using the OMERACT RAMRIS system (16). To identify candidate sets of images, radiographs from 60 RA patients with either early or established disease who were enrolled in a 1-year longitudinal observational study were screened for radiographic progression. These radiographs were assessed semiquantitatively (by NPK and EAH) with regard to progression/nonprogression, taking into account both erosions and joint space narrowing. Pairs of MR images of the dominant wrist of 10 patients (4 with early RA and 6 with established RA) that showed progression on conventional hand radiographs were selected for the study. The median interval between the first and second scan was 12 months (range 12–14 months).

Readers. The 4 readers had different levels of experience. Two of us (MØ and BJE) were experienced readers who were familiar with the OMERACT RAMRIS system and who had taken part in previous OMERACT exercises assessing the RAMRIS. One of us (EAH) had some experience with the RAMRIS, while the fourth reader (NPK) was familiar with reading of MR images but did not have any previous experience with the RAMRIS. The 4 readers met for 1 day, 4 weeks prior to the exercise, to review scoring methods and for initial training of the 2 least-experienced readers.

Image evaluation. The readings were performed by the 4 readers over 2 days. The paired images were read in chronologic order. A technician coded the image sets and removed patient names.
This procedure was repeated for a second reading on the following day, with the image sets rearranged in a different order and given a different coding. The MR images were read on large-screen (21-inch) radiologic workstation monitors using a standard PACS software program (SECTRA IDS5; Uppsala, Sweden). This software package provides advanced image-viewing features, allowing the reader to adjust window/level settings, to zoom in/out, and to use a localizer that allows the accurate placement of specific lesions in 2 planes (axial and coronal), with the opportunity to measure distances and areas accurately. All readers evaluated the images independently at 4 different workstations in 4 separate locations. The score sheets from day 1 were sealed in envelopes until the second reading was completed. The readers recorded the time spent scoring each image set. The complete scoring of images from 1 patient (baseline and followup images) took an average of 33 minutes (range 15–55 minutes).

MRI sequences. MRI of the dominant wrist was performed at baseline and at 1 year, using a GE Signa 1.5T MRI scanner (General Electric Signa, Milwaukee, WI) with a dedicated high-resolution wrist phased array coil. The same scanner and wrist coil were used for both examinations. The hand was placed in the wrist coil at the patient’s side with the coil anchored to the base tray to reduce motion artefacts. The MRI sequences in this study included the OMERACT-recommended MRI core set of sequences (16) plus an additional 3-dimensional spoiled gradient-recalled acquisition in the steady state sequence for more detailed assessment of cartilage and bony changes. The image sequences were tested in a pilot study and developed in collaboration with an experienced MR radiologist (FGL) and a product specialist from GE. Details of the sequences are given in Table 1.
Experienced technicians reviewed the images immediately after acquisition, and a sequence was reacquired if its quality was not acceptable.

Table 1. Details of the magnetic resonance imaging sequences*

                      Coronal T1 FSE              Coronal STIR,   Axial T1 SE                 3-D SPGR,
                      Precontrast   Postcontrast  precontrast     Precontrast   Postcontrast  precontrast
Flip angle, degrees   90            90            90              90            90            10
TR, msec              500           500           3,420           400           4,000         55
TE, msec              14            14            12              13            13            10
ST, mm                2.5           2.5           2.5             3.0           3.0           1.5
Gap, mm               0.5           0.5           0.5             0.5           0.5           0.0
NEx                   2             2             2               2             2             1
FOV, mm               100 × 100     100 × 100     100 × 100       100 × 100     100 × 100     80 × 80
Matrix                320 × 256     320 × 256     288 × 192       512 × 320     512 × 320     256 × 256
Time, minutes         3.19          3.19          5.40            4.19          4.19          7.34

* T1 FSE = T1-weighted fast spin-echo; T1 SE = T1-weighted spin-echo; 3-D SPGR = 3-dimensional spoiled gradient-recalled acquisition in the steady state; TR = repetition time; TE = echo time; ST = slice thickness; NEx = no. of excitations; FOV = field of view.

MRI scoring system. The OMERACT MRI group consensus definitions of important pathologic features in RA joints (16) were used in this study. The semiquantitative assessment system suggested by the OMERACT MRI group (16) was used to assess specific components.

Bone erosions. Each wrist bone (carpal bones, distal radius, distal ulna, and metacarpal bases; total of 15 sites) was scored separately. The scale was 0–10, based on the proportion of eroded bone compared with the assessed bone volume, judged on all available images; scores were in increments of 10%, so that 0 = no erosion, 1 = 1–10% of bone eroded, 2 = 11–20% of bone eroded, and so forth. The assessed bone volume in long bones was from the articular surface (or, if absent, its best estimated position) to a depth of 1 cm, while it was the whole bone in carpal bones.

Bone edema. Bone edema was scored 0–3 according to the volume of edema compared with the assessed bone volume (each wrist bone scored separately), with 0 = no edema, 1 = 1–33% of bone edematous, 2 = 34–66% of bone edematous, and 3 = 67–100% of bone edematous. It should be emphasized that in the case of the concurrent presence of erosion and edema, edema was scored as a proportion of the estimated original bone volume, not of the remaining bone (21).

Synovitis. Synovitis in the wrist was assessed in 3 regions (the distal radioulnar joint, the radiocarpal joint, and the intercarpal and carpometacarpal joints). A score of 0 represented normal (no synovitis), while scores of 1–3 (mild, moderate, and severe, respectively) reflected the tertiles of enhancing tissue in the synovial compartment relative to the presumed maximum volume.

Statistical analysis. All statistical analyses were undertaken using SPSS for Windows, version 11 (SPSS, Chicago, IL). Intrareader and interreader reliabilities were evaluated using a two-way mixed effect model, and single-measure and average-measure intraclass correlation coefficients (ICCs) were calculated for both status scores and change scores. The average-measure ICC was corrected for the number of readers and was calculated for the interreader reliability. ICC values are expressed as the median (range) for the intrareader reliability (due to the low number of values) and as the mean (95% confidence interval) for the interreader reliability. ICC values are interpreted in the same way as kappa values; values higher than 0.60 are considered good, and values higher than 0.80 are considered very good. Sensitivity to change was assessed by calculating the smallest detectable difference (SDD), derived from the limits of agreement method described by Bland and Altman (22).
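As a rough illustration of the limits-of-agreement calculation referred to above, the sketch below computes an SDD from two readings of the same image sets. It assumes one common form, SDD = 1.96 × SD of the paired differences (the half-width of the Bland and Altman 95% limits of agreement); the text does not state which variant was used, and some authors additionally divide by √2 or √k. The function name is hypothetical, and the example values are the day 1/day 2 baseline synovitis scores of the single reader reported in Table 4.

```python
import statistics

def smallest_detectable_difference(first, second, z=1.96):
    """SDD as z times the SD of paired differences (assumed variant, see lead-in)."""
    if len(first) != len(second):
        raise ValueError("both readings must cover the same image sets")
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)             # systematic day 1 vs day 2 offset
    sd = statistics.stdev(diffs)              # sample SD of the differences
    limits = (bias - z * sd, bias + z * sd)   # 95% limits of agreement
    return z * sd, bias, limits

# Baseline synovitis scores for patients 1-10, one reader, as read from Table 4.
day1 = [6, 2, 3, 8, 7, 5, 5, 6, 6, 4]
day2 = [6, 3, 4, 7, 8, 5, 3, 7, 5, 4]

sdd, bias, limits = smallest_detectable_difference(day1, day2)
print(f"bias={bias:.2f}, SDD={sdd:.2f}, limits=({limits[0]:.2f}, {limits[1]:.2f})")
```

A change score for an individual patient would then be regarded as exceeding measurement error only if its absolute value is larger than the SDD returned here.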
The SDD represents the smallest change score that can be discriminated from the measurement error of the scoring method, and is expressed in the same units of measurement as the score. Using the SDD as the threshold level for relevant progression of joint damage ensures that an observed change exceeds, with 95% confidence, the measurement error. The minimal detectable change (MDC) is a way to express the SDD as a percentage of the maximum score of the method, to allow comparisons with other radiographic and clinical measures. An SDD of 0 indicates perfect agreement, and there is no convention regarding any upper limit; however, an MDC of lower than 20% is generally accepted to reflect a high potential to detect changes (18).

RESULTS

The intrareader single-measure ICC, the SDD, and the MDC for all components of the RAMRIS (both status and change scores) are presented in Table 2. Intrareader ICCs were generally very high for status scores (median baseline and followup ICCs 0.89 and 0.90 for synovitis, 0.91 and 0.90 for erosion, and 0.90 and 0.98 for edema) and for change scores (median ICC 0.80 for synovitis, 0.96 for erosion, and 0.97 for edema), and ranged up to 0.99 for individual readers. The ICCs were highest for scoring of bone marrow edema. The SDDs were generally low, with MDCs lower than 20% for all measures except the synovitis change score, which had an MDC of 26.5% (Table 2).

Table 2. Intrareader agreement of the Rheumatoid Arthritis Magnetic Resonance Imaging Scores, determined by a two-way mixed effect model (single measure)*

Score, measure        Synovitis           Bone erosion        Bone marrow edema
Baseline
  Intrareader ICC     0.89 (0.83–0.98)    0.91 (0.82–0.96)    0.90 (0.87–0.95)
  SDD                 1.73 (0.92–2.22)    4.98 (3.92–9.21)    2.73 (1.85–3.36)
  MDC, %              19.2 (10.3–24.7)    3.33 (2.61–6.14)    6.06 (4.11–7.47)
1-year followup
  Intrareader ICC     0.90 (0.75–0.96)    0.90 (0.72–0.96)    0.98 (0.96–0.99)
  SDD                 1.89 (1.24–2.81)    5.53 (3.21–20.0)    3.18 (2.40–5.40)
  MDC, %              19.8 (13.8–31.2)    3.69 (2.14–13.4)    7.07 (5.34–12.0)
Change score
  Intrareader ICC     0.80 (0.74–0.83)    0.96 (0.68–0.97)    0.97 (0.96–0.99)
  SDD                 2.39 (2.08–2.92)    2.24 (1.37–13.9)    3.68 (1.80–5.33)
  MDC, %              26.5 (23.1–32.4)    1.49 (0.91–9.24)    8.17 (4.01–11.8)

* Values are the median (range) intraclass correlation coefficient (ICC), smallest detectable difference (SDD), and minimal detectable change (MDC; defined as the SDD expressed as a percentage of the maximum score).

Table 3. Interreader agreement of the Rheumatoid Arthritis Magnetic Resonance Imaging Scores, determined by a two-way mixed effect model (single and average measure)*

                   Synovitis                             Bone erosion                          Bone marrow edema
Score, measure     4 readers         3 readers†          4 readers         3 readers†          4 readers         3 readers†
Baseline
  SmICC            0.69 (0.47–0.89)  0.77 (0.56–0.92)    0.83 (0.66–0.94)  0.85 (0.69–0.95)    0.79 (0.59–0.94)  0.80 (0.60–0.94)
  AvmICC           0.95 (0.88–0.98)  0.95 (0.88–0.99)    0.97 (0.94–0.99)  0.97 (0.93–0.99)    0.97 (0.92–0.99)  0.96 (0.90–0.99)
1-year followup
  SmICC            0.78 (0.59–0.92)  0.82 (0.64–0.94)    0.73 (0.53–0.91)  0.82 (0.64–0.94)    0.95 (0.89–0.99)  0.95 (0.88–0.99)
  AvmICC           0.97 (0.92–0.99)  0.96 (0.91–0.99)    0.96 (0.90–0.99)  0.96 (0.91–0.99)    0.99 (0.98–1.00)  0.99 (0.98–1.00)
Change score
  SmICC            0.74 (0.53–0.91)  0.78 (0.58–0.93)    0.67 (0.44–0.88)  0.80 (0.61–0.93)    0.95 (0.89–0.99)  0.95 (0.87–0.99)
  AvmICC           0.96 (0.90–0.99)  0.96 (0.89–0.99)    0.94 (0.86–0.98)  0.96 (0.90–0.99)    0.99 (0.98–1.00)  0.99 (0.98–1.00)

* Values are the mean (95% confidence interval). SmICC = single-measure intraclass correlation coefficient; AvmICC = average-measure intraclass correlation coefficient.
† The least-experienced reader was not included in this analysis.
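The MDC values in Table 2 can be checked with a few lines of arithmetic. The sketch below assumes the maximum wrist scores implied by the scoring rules in Patients and Methods (3 synovitis regions × 3, 15 bone sites × 10 for erosion, 15 bone sites × 3 for edema); these totals and all names are inferred for illustration and are not stated explicitly in the text.

```python
# Maximum possible wrist scores (assumed from the scoring rules, see lead-in).
MAX_SCORE = {
    "synovitis": 3 * 3,    # 3 regions, each scored 0-3
    "erosion": 15 * 10,    # 15 bone sites, each scored 0-10
    "edema": 15 * 3,       # 15 bone sites, each scored 0-3
}

def mdc_percent(sdd, component):
    """Express an SDD as a percentage of the component's maximum score."""
    return 100.0 * sdd / MAX_SCORE[component]

# Median intrareader SDDs from Table 2 (baseline and change scores).
median_sdd = {
    ("baseline", "synovitis"): 1.73,
    ("baseline", "erosion"): 4.98,
    ("baseline", "edema"): 2.73,
    ("change", "synovitis"): 2.39,
    ("change", "erosion"): 2.24,
    ("change", "edema"): 3.68,
}

for (score, component), sdd in median_sdd.items():
    mdc = mdc_percent(sdd, component)
    flag = "below" if mdc < 20 else "above"
    print(f"{score:8s} {component:9s} SDD={sdd:.2f} MDC={mdc:.1f}% ({flag} the 20% threshold)")
```

Under these assumptions the six computed values agree with the tabulated MDCs to within about 0.1 percentage point, and of the six only the synovitis change score lies above the 20% threshold.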
Table 3 provides the interreader single-measure and average-measure ICCs for RAMRIS evaluations of status at baseline and 1-year followup, as well as change scores. To investigate the effect of reader experience separately, we also computed the ICCs for the 3 most experienced readers, omitting the reader who had no previous experience with the RAMRIS method. Interreader single-measure ICCs were generally high for status scores (mean baseline and followup ICCs 0.69 and 0.78 for synovitis, 0.83 and 0.73 for erosion, and 0.79 and 0.95 for bone marrow edema) and for change scores (mean ICC 0.74 for synovitis, 0.67 for erosion, and 0.95 for bone marrow edema). The average-measure ICCs were ≥0.94 for all components of the status scores and change scores (Table 3).

In Table 4, the raw data on scoring of the wrist joints of all 10 patients by one of the readers are provided, showing the spectrum of the different aspects of the RAMRIS in the 2 patient groups studied (early RA versus established RA) (see also Figure 1).

Table 4. Rheumatoid Arthritis Magnetic Resonance Imaging Score results from one of the readers for the wrist joints of all 10 patients with rheumatoid arthritis (RA) at baseline and 12 months*

                          Synovitis              Bone erosion           Bone marrow edema
Disease status, patient   Baseline   12 months   Baseline   12 months   Baseline   12 months
Early RA
  1                       6/6        4/3         3/2        4/2         0/0        1/0
  2                       2/3        2/2         3/3        5/7         5/4        4/3
  3                       3/4        3/4         4/3        5/5         1/1        0/0
  4                       8/7        8/7         21/19      22/21       9/7        6/5
Established RA
  5                       7/8        9/8         10/10      18/17       9/7        18/10
  6                       5/5        9/8         16/12      36/30       3/3        35/31
  7                       5/3        6/6         14/11      15/11       7/3        8/5
  8                       6/7        5/6         26/23      31/26       10/8       7/8
  9                       6/5        6/5         13/13      18/16       7/6        7/8
  10                      4/4        4/4         25/19      25/19       2/3        2/2

* Values are the scores on day 1/day 2.

Figure 1. A, Baseline coronal T1-weighted magnetic resonance images, showing erosions in the capitate and the base of the second metacarpal bone. B, Corresponding images at 12 months, showing progression of erosive changes in the capitate and the base of the second metacarpal bone, and development of a large erosion in the hamate. C, Baseline axial T1-weighted images pre– and post–intravenous contrast, showing a grade 2 synovitis in the distal radioulnar joint. D, Corresponding 12-month images, showing grade 3 synovitis. All images are from patient 6 in Table 4.

DISCUSSION

Sufficient reproducibility is a prerequisite feature for any scoring method to be considered of clinical value. This study demonstrates that all aspects (synovitis, bone marrow edema, and bone erosion) of the OMERACT RAMRIS system exhibit very good intrareader reliability and good interreader reliability for assessment of status as well as scoring of change, when carried out by trained, calibrated readers.

Table 5. Interreader correlation coefficients in previous studies compared with the present study (single measure)

Authors (ref.)           Synovitis   Bone erosion   Bone marrow edema
Østergaard et al (19)    0.58        0.65*          Not available
Lassere et al (18)       0.74        0.72           0.78
Conaghan et al (20)
  Baseline               0.74        0.15           0.08
  Followup               0.68        0.45           0.56
  Change                 0.46        0.55           0.45
Present study
  Baseline               0.69        0.83           0.79
  Followup               0.78        0.73           0.95
  Change                 0.74        0.67           0.95

* Global bone score for erosions (range 0–3) as opposed to the present study’s Rheumatoid Arthritis Magnetic Resonance Imaging Score for erosions (range 0–10).

The RAMRIS has previously shown acceptable intra- and interreader reliability for measures of disease activity and damage in cross-sectional studies (18–20,23), whereas one study of the reliability of the change score demonstrated only fair to moderate levels of reliability (20). Table 5 presents an overview of the results from previous studies as well as from the present study with regard to the reliability of the RAMRIS. In general, the degree of agreement was numerically higher in the current study for all components, both for the status scores and for the change scores (Table 5).
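The single-measure and average-measure ICCs compared here can be sketched from a two-way analysis of variance. The code below is a minimal illustration, not the SPSS procedure used in the study: the text specifies a two-way mixed effect model but not whether the consistency or the absolute-agreement definition was chosen, so both are offered through a flag. The function name is hypothetical, and one reader's day 1/day 2 baseline erosion scores from Table 4 stand in for the columns of ratings.

```python
def icc_two_way(scores, absolute=False):
    """Return (single-measure, average-measure) ICC for an n x k table.

    Rows are image sets; columns are readers or repeated readings.
    absolute=False gives the consistency definition, absolute=True the
    absolute-agreement definition (which one the study used is an assumption).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    # Extra denominator term that penalises systematic offsets between columns.
    offset = (ms_cols - ms_err) / n if absolute else 0.0
    single = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * offset)
    average = (ms_rows - ms_err) / (ms_rows + offset)
    return single, average

# Baseline erosion scores (day 1, day 2) for patients 1-10, one reader, from Table 4.
erosion = [(3, 2), (3, 3), (4, 3), (21, 19), (10, 10),
           (16, 12), (14, 11), (26, 23), (13, 13), (25, 19)]

for absolute in (False, True):
    single, average = icc_two_way(erosion, absolute=absolute)
    label = "absolute agreement" if absolute else "consistency"
    print(f"{label}: single-measure ICC={single:.2f}, average-measure ICC={average:.2f}")
```

For interreader agreement the columns would be the readers rather than the two reading days. Under either definition the average-measure value follows from the single-measure value by the Spearman-Brown relation, ICC_avg = k × ICC_single / (1 + (k − 1) × ICC_single), which is why it is the higher of the two whenever agreement is positive.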
The higher degree of agreement of the RAMRIS results in the present study compared with that in earlier studies may be explained by several factors. In earlier studies the readers had undergone limited formal training exercises and were not calibrated. In this study the readers met for 1 day prior to the study, to review scoring methods and for initial calibration. All 4 readers either had previous experience with the RAMRIS system or were familiar with reading MR images of the hands and wrists. We wanted to explore the importance of training, and analyzed the data after excluding the scorer who had no previous experience with using the RAMRIS. Omitting the least-experienced reader from our analyses resulted in higher interreader ICCs (Table 3), except for the bone marrow edema score (virtually unchanged), suggesting that consistency of scoring may improve with experience.

Caution has to be applied when comparing reliability results across studies, because the results depend on the data sets that are used for analyses. In the study by the OMERACT MRI study group (20), the spectrum of disease abnormalities was narrow (i.e., from an early RA cohort), which lowered the ICCs obtained. The patients in the current study were selected from 2 different cohorts (early and established RA) to better reflect a broad spectrum of the disease. All patients showed progression on conventional hand radiographs at 12 months compared with baseline. Thus, progression of the MRI erosion score was expected, although the hand radiographs covered a larger anatomic area (finger and wrist joints bilaterally, in contrast to only the dominant wrist on MRI). In previous studies, hard copies of MR images have been used, whereas in this study digital images were read on large-screen monitors with advanced imaging software, making it easier to detect subtle changes.
For reasons of feasibility, the number of patients was limited to 10, which was the maximum number of image sets that could be scored in one day.

In the present study the images were read paired and in known order. Van der Heijde et al (24) found that this method (i.e., reading films in chronologic order) is the most sensitive to change. Although this is true for conventional radiographs, it may not necessarily be the case for MRI; therefore, the notion of chronologic order should ideally be formally validated in a separate study.

Lassere et al (25) found that common clinical measures of RA, such as tender and swollen joint counts, pain, and patient’s global assessment of health, all had poor reliability and large SDDs compared with radiographic measures. In contrast, the levels of reliability and the sensitivity to change of the RAMRIS obtained in this study are comparable with those published for radiographic erosion scoring methods (1,25–28).

In the clinical trial setting, MRI potentially has many advantages over conventional radiography in measuring responses to therapeutic agents. Whereas radiography only visualizes the late signs of preceding disease activity, MRI is a multiplanar technique that can detect RA erosive changes with greater sensitivity than that of conventional radiography (6), particularly in early disease. In addition, MRI allows direct visualization and assessment of synovitis, the primary lesion in RA, and of bone edema, a probable forerunner of bone erosions.

MDCs lower than 20% for all measures except the synovitis change score in this study suggest a high potential of the RAMRIS to detect longitudinal structural changes. The MDC of 26.5% detected for the synovitis change score implies that this aspect of the RAMRIS may not be as sensitive to change as the erosion score and bone marrow edema score.
However, this study was mainly designed to detect longitudinal structural changes, since we only included patients who displayed progression on conventional radiographs. A lower MDC for the synovitis change score may be expected in an intervention study in which patients receive medication targeted at suppressing inflammation, such as the new biologic regimens. All patients in this cohort received conventional disease-modifying antirheumatic therapy.

The OMERACT RAMRIS is a semiquantitative scoring method and not a true quantitative system. Direct measurement of erosion size and the extent of the synovial membrane have been proposed as alternative methods of quantifying damage. Bird et al (29) found that the interreader ICC was similar for computerized erosion volume measurements and the RAMRIS results, but that there were large systematic differences in volumes between readers. These quantitative measures may, in the future, prove to be more responsive to change than the RAMRIS, but the inter- and intrarater reliability and responsiveness to change need to be validated in longitudinal studies.

Recently, a European League Against Rheumatism atlas of OMERACT reference images from MRI of RA joints has been developed (30), which provides readers with a new tool for standardized assessment of RA joints, making it possible to score sets of MR images for inflammatory and destructive changes according to the best possible match with standard reference images, similar to the Larsen method for scoring of radiographs (31). This approach is expected to further increase the opportunities for standardized reproducible scoring using the OMERACT RAMRIS system.
With access to state-of-the-art technical equipment to read MR images in a digital environment, we found that the RAMRIS showed very good intrareader reliability, good interreader reliability, and a high level of sensitivity to change in evaluating lesions both at a single time point and in a longitudinal setting (represented by change scores). These findings suggest that the OMERACT RAMRIS system may be suitable for use in clinical practice, randomized controlled trials, and longitudinal observational studies.

ACKNOWLEDGMENTS

We thank research nurse Margareth Sveinsson for collecting clinical data, research coordinator Tone Omreng for organizing the data collection, technician Marianne Ytrelid for technical assistance, and Petter Mowinckel, MSc, for statistical advice.

REFERENCES

1. Van der Heijde DM. Plain X-rays in rheumatoid arthritis: overview of scoring methods, their reliability and applicability. Baillieres Clin Rheumatol 1996;10:435–53.
2. Sharp JT. Assessment of radiographic abnormalities in rheumatoid arthritis: what have we accomplished and where should we go from here? J Rheumatol 1995;22:1787–91.
3. Jorgensen C, Cyteval C, Anaya JM, Baron MP, Lamarque JL, Sany J. Sensitivity of magnetic resonance imaging of the wrist in very early rheumatoid arthritis. Clin Exp Rheumatol 1993;11:163–8.
4. Sharp JT, Wolfe F, Mitchell DM, Bloch DA. The progression of erosion and joint space narrowing scores in rheumatoid arthritis during the first twenty-five years of disease. Arthritis Rheum 1991;34:660–8.
5. Wolfe F, Sharp JT. Radiographic outcome of recent-onset rheumatoid arthritis: a 19-year study of radiographic progression. Arthritis Rheum 1998;41:1571–82.
6. Klarlund M, Ostergaard M, Jensen KE, Madsen JL, Skjodt H, Lorenzen I, and the TIRA Group. Magnetic resonance imaging, radiography, and scintigraphy of the finger joints: one year follow up of patients with early arthritis. Ann Rheum Dis 2000;59:521–8.
7. Ostergaard M, Hansen M, Stoltenberg M, Jensen KE, Szkudlarek M, Pedersen-Zbinden B, et al. New radiographic bone erosions in the wrists of patients with rheumatoid arthritis are detectable with magnetic resonance imaging a median of two years earlier. Arthritis Rheum 2003;48:2128–31.
8. McQueen FM, Benton N, Perry D, Crabbe J, Robinson E, Yeoman S, et al. Bone edema scored on magnetic resonance imaging scans of the dominant carpus at presentation predicts radiographic joint damage of the hands and feet six years later in patients with rheumatoid arthritis. Arthritis Rheum 2003;48:1814–27.
9. Savnik A, Malmskov H, Thomsen HS, Graff LB, Nielsen H, Danneskiold-Samsoe B, et al. MRI of the wrist and finger joints in inflammatory joint diseases at 1-year interval: MRI features to predict bone erosions. Eur Radiol 2002;12:1203–10.
10. Conaghan PG, O’Connor P, McGonagle D, Astin P, Wakefield RJ, Gibbon WW, et al. Elucidation of the relationship between synovitis and bone damage: a randomized magnetic resonance imaging study of individual joints in patients with early rheumatoid arthritis. Arthritis Rheum 2003;48:64–71.
11. Benton N, Stewart N, Crabbe J, Robinson E, Yeoman S, McQueen FM. MRI of the wrist in early rheumatoid arthritis can be used to predict functional outcome at 6 years. Ann Rheum Dis 2004;63:555–61.
12. Backhaus M, Kamradt T, Sandrock D, Loreck D, Fritz J, Wolf KJ, et al. Arthritis of the finger joints: a comprehensive approach comparing conventional radiography, scintigraphy, ultrasound, and contrast-enhanced magnetic resonance imaging. Arthritis Rheum 1999;42:1232–45.
13. Lindegaard H, Vallo J, Horslev-Petersen K, Junker P, Ostergaard M. Low field dedicated magnetic resonance imaging in untreated rheumatoid arthritis of recent onset. Ann Rheum Dis 2001;60:770–6.
14. McQueen FM, Stewart N, Crabbe J, Robinson E, Yeoman S, Tan PL, et al. Magnetic resonance imaging of the wrist in early rheumatoid arthritis reveals a high prevalence of erosions at four months after symptom onset. Ann Rheum Dis 1998;57:350–6.
15. McQueen FM, Stewart N, Crabbe J, Robinson E, Yeoman S, Tan PL, et al. Magnetic resonance imaging of the wrist in early rheumatoid arthritis reveals progression of erosions despite clinical improvement. Ann Rheum Dis 1999;58:156–63.
16. Ostergaard M, Peterfy C, Conaghan P, McQueen F, Bird P, Ejbjerg B, et al. OMERACT rheumatoid arthritis magnetic resonance imaging studies: core set of MRI acquisitions, joint pathology definitions, and the OMERACT RA-MRI scoring system. J Rheumatol 2003;30:1385–6.
17. McQueen F, Lassere M, Edmonds J, Conaghan P, Peterfy C, Bird P, et al. OMERACT rheumatoid arthritis magnetic resonance imaging studies: summary of OMERACT 6 MR imaging module. J Rheumatol 2003;30:1387–92.
18. Lassere M, McQueen F, Ostergaard M, Conaghan P, Shnier R, Peterfy C, et al. OMERACT rheumatoid arthritis magnetic resonance imaging studies: exercise 3: an international multicenter reliability study using the RA-MRI score. J Rheumatol 2003;30:1366–75.
19. Ostergaard M, Klarlund M, Lassere M, Conaghan P, Peterfy C, McQueen F, et al. Interreader agreement in the assessment of magnetic resonance images of rheumatoid arthritis wrist and finger joints: an international multicenter study. J Rheumatol 2001;28:1143–50.
20. Conaghan P, Lassere M, Ostergaard M, Peterfy C, McQueen F, O’Connor P, et al. OMERACT rheumatoid arthritis magnetic resonance imaging studies: exercise 4: an international multicenter longitudinal study using the RA-MRI Score. J Rheumatol 2003;30:1376–9.
21. Ostergaard M, Edmonds J, McQueen F, Peterfy C, Lassere M, Ejbjerg B, et al. An introduction to the EULAR-OMERACT rheumatoid arthritis MRI reference image atlas. Ann Rheum Dis 2005;64 Suppl 1:i3–7.
22. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307–10.
23. Bird P, Ejbjerg B, McQueen F, Ostergaard M, Lassere M, Edmonds J. OMERACT rheumatoid arthritis magnetic resonance imaging studies: exercise 5: an international multicenter reliability study using computerized MRI erosion volume measurements. J Rheumatol 2003;30:1380–4.
24. Van der Heijde DM, Boonen A, Boers M, Kostense P, van der Linden S. Reading radiographs in chronological order, in pairs or as single films has important implications for the discriminative power of rheumatoid arthritis clinical trials. Rheumatology (Oxford) 1999;38:1213–20.
25. Lassere MN, van der Heijde DM, Johnson KR, Boers M, Edmonds J. Reliability of measures of disease activity and disease damage in rheumatoid arthritis: implications for smallest detectable difference, minimal clinically important difference, and analysis of treatment effects in randomized controlled trials. J Rheumatol 2001;28:892–903.
26. Boini S, Guillemin F. Radiographic scoring methods as outcome measures in rheumatoid arthritis: properties and advantages. Ann Rheum Dis 2001;60:817–27.
27. Van der Heijde DM, Dankert T, Nieman F, Rau R, Boers M. Reliability and sensitivity to change of a simplification of the Sharp/van der Heijde radiological assessment in rheumatoid arthritis. Rheumatology (Oxford) 1999;38:941–7.
28. Bruynesteyn K, van der Heijde D, Boers M, Saudan A, Peloso P, Paulus H, et al. Determination of the minimal clinically important difference in rheumatoid arthritis joint damage of the Sharp/van der Heijde and Larsen/Scott scoring methods by clinical experts and comparison with the smallest detectable difference. Arthritis Rheum 2002;46:913–20.
29. Bird P, Lassere M, Shnier R, Edmonds J. Computerized measurement of magnetic resonance imaging erosion volumes in patients with rheumatoid arthritis: a comparison with existing magnetic resonance imaging scoring systems and standard clinical outcome measures. Arthritis Rheum 2003;48:614–24.
30. Ejbjerg B, McQueen F, Lassere M, Haavardsholm E, Conaghan P, O’Connor P, et al. The EULAR-OMERACT rheumatoid arthritis MRI reference image atlas: the wrist joint. Ann Rheum Dis 2005;64 Suppl 1:i23–47.
31. Larsen A, Dale K, Eek M. Radiographic evaluation of rheumatoid arthritis and related conditions by standard reference films. Acta Radiol Diagn (Stockh) 1977;18:481–91.