Arthritis & Rheumatism (Arthritis Care & Research)
Vol. 57, No. 3, April 15, 2007, pp 501–507
DOI 10.1002/art.22627
© 2007, American College of Rheumatology
ORIGINAL ARTICLE
Validation of the Spondyloarthritis Research
Consortium of Canada Magnetic Resonance
Imaging Spinal Inflammation Index: Is It
Necessary to Score the Entire Spine?
WALTER P. MAKSYMOWYCH,1 SUHKVINDER S. DHILLON,1 ROY PARK,1 DAVID SALONEN,2
ROBERT D. INMAN,2 AND ROBERT G. W. LAMBERT1
Objective. The Spondyloarthritis Research Consortium of Canada (SPARCC) magnetic resonance imaging (MRI) spinal
inflammation index has been developed to objectively measure inflammation in ankylosing spondylitis (AS) and to assess
change in response to therapeutic intervention. Scoring of the entire spine limits feasibility and a scoring method that
records inflammation in only the more severely affected spinal segments may improve feasibility without sacrificing
performance.
Methods. MRI films of 68 patients with AS were assessed in random order by 2 blinded readers. Interreader reliability
was assessed by intraclass correlation coefficient. Pre- and posttreatment MRI films of 29 patients randomized to placebo
or anti–tumor necrosis factor α (anti-TNFα) therapy were read by readers blinded to chronology, and responsiveness was
assessed by effect size and standardized response mean. The performance of scores based on 6, 8, 10, and all 23 spinal
discovertebral units (DVU) was compared.
Results. The median number of affected spinal levels per patient was 6.0 and 62% of all affected levels were included
when analysis was limited to only the 6 most severely affected levels per patient. Comparison of DVU scores that were
limited to only the more severely affected DVU (6-, 8-, 10-DVU score) with scores for all 23 spinal DVU showed excellent
interreader reliability for status and change scores (Spearman’s correlation >0.90) as well as similar construct validity.
Responsiveness to anti-TNFα therapy was greater when the more limited scoring methods were used and was greatest
with the 6-DVU score.
Conclusion. The SPARCC MRI spinal inflammation index performs better when analysis is limited to a maximum of 6
most severely affected levels compared with assessment of the entire spine. This should improve its feasibility in clinical
trials and research.
KEY WORDS. Magnetic resonance imaging; Ankylosing spondylitis; SPARCC method; Validation.
Dr. Maksymowych is a Senior Scholar of the Alberta Heritage Foundation for Medical Research.
1Walter P. Maksymowych, FRCP(C), Suhkvinder S. Dhillon, FRCR(C), Roy Park, FRCP(C), Robert G. W. Lambert, FRCP: University of Alberta, Edmonton, Alberta, Canada; 2David Salonen, FRCP(C), Robert D. Inman, FRCP(C): University Health Network, University of Toronto, Toronto, Ontario, Canada.
Address correspondence to Walter P. Maksymowych, FRCP(C), 562 Heritage Medical Research Building, University of Alberta, Edmonton, Alberta, Canada, T6G 2S2. E-mail: walter.maksymowych@ualberta.ca.
Submitted for publication February 8, 2006; accepted in revised form June 23, 2006.
INTRODUCTION
Magnetic resonance imaging (MRI) is the most sensitive
imaging modality for detection of inflammatory lesions in
the spine and sacroiliac joints of patients with ankylosing
spondylitis (AS) (1). This has been made possible through
the use of MRI sequences, such as STIR, that suppress the
signal from marrow fat. Elimination of fat signal on T2-weighted sequences promotes the visualization of abnormal increased water content due to the underlying bone
marrow edema that is associated with inflammation. Typical appearances in the spine include increased T2 signal
at the anterior corners of the vertebrae, reflecting inflammation at the attachment of the annulus fibrosus to the
vertebral corner, and increased signal in the subchondral
bone adjacent to the vertebral end plate (2). Furthermore, it
has been shown that these lesions resolve following the
institution of anti–tumor necrosis factor α (anti-TNFα)
therapies and it has therefore been suggested that MRI can
be used to assess the efficacy of treatment, particularly
because clinical outcome measures are largely based on
patient self-reported questionnaires (3). Accordingly, scoring systems have been developed to facilitate the evaluation of inflammatory lesions observed on MRI (3,4). However, the optimal approach to the scoring of MRI lesions
currently lacks consensus and is presently the subject of
further evaluation by investigators using the Outcome
Measures in Rheumatology Clinical Trials (OMERACT)
approach to the validation of outcome instruments in musculoskeletal disorders (5). In particular, OMERACT has
proposed that newly developed instruments meet the criteria of feasibility, truth, and discrimination. The latter is
a function of both reproducibility and responsiveness to
change.
Two methods have been reported for scoring inflammatory lesions in the spine (3,4). Both rely on the assessment
of the signal on fat-suppressed images (STIR, T2-weighted
fat saturation) in the anterior segment of the spine (vertebral body) and do not score lesions in the posterior elements of the spine. Both methods also use the discovertebral unit (DVU) as the primary anatomic region for scoring
inflammation. The DVU is defined as the region between 2
imaginary lines drawn through the middle of adjacent
vertebrae and including adjacent vertebral end plates with
the intervening disc. The Spondyloarthritis Research
Consortium of Canada (SPARCC) MRI spinal inflammation index takes advantage of the ability of MRI to visualize lesions in several dimensions (4). The developers
of this method have proposed that scoring be limited to
a maximum of 6 of the most severely affected levels on
the basis that the mean number of affected DVU per
patient in a prior study was 3.2 (95% confidence interval
1.2–5.2).
Limiting the assessment to only the most severely affected levels improves feasibility in that the time necessary for evaluation is less than for the entire spine. Although this approach may introduce measurement error
due to readers differing in their selection of levels for
scoring, the alternative, assessment of the entire spine, is
subject to significant problems. Being forced to score the
entire spine results in the inclusion of less discernable
lesions, which may reduce sensitivity to change, and
forces the reader to score levels that are affected by signal
artifact. This is not a trivial issue as some degree of phase-encoding artifact occurs in almost every case when scanning the entire spine with large fields of view. Consequently, it is not clear how many levels should be assessed
to maximize sensitivity to change without compromising
interobserver reproducibility. In this study we compared
the performance of the SPARCC scoring method according
to the OMERACT filter for all 23 spinal levels with a
scoring scheme that is limited to only the most severely
affected DVU. Our objective was to determine how many
levels should be analyzed for optimal feasibility and discrimination.
Maksymowych et al
PATIENTS AND METHODS
Patients and study protocol. We studied 2 cohorts of
patients with AS as defined by the modified New York
criteria (6). Cohort A was a cross-sectional cohort of 39
patients with AS (29 men, mean age 42.3 years [range
22–68 years], mean disease duration 13.4 years [range
2–41 years], mean Bath Ankylosing Spondylitis Disease
Activity Index [BASDAI] score of 5.5 [range 3.0–8.6]) who
attended the outpatient clinic in the Rheumatic Disease
Unit at the University of Alberta. All patients had been
recruited to a prospective, longitudinal observational cohort, the Follow up Research Cohort of AS study
(FORCAST), in which clinical and laboratory data are
systematically collected every 6 months and plain radiographic imaging and MRI are obtained annually. Most
patients (83%) receive nonsteroidal antiinflammatory
drugs and/or physical therapy.
Cohort B comprised 29 patients who had severe, active
disease as defined by a BASDAI score ≥4 and who had
been randomized to receive either an anti-TNFα agent or
placebo in a 24-week double-blind placebo-controlled trial
of either adalimumab (n = 11; 1:1 randomization, adalimumab administered in a dose of 40 mg subcutaneously
on alternate weeks) or infliximab (n = 18; 3:8 randomization of placebo:infliximab, infliximab administered in a
dose of 5 mg at 0, 2, and 6 weeks and every 6 weeks
thereafter). Nineteen patients in cohort B were recruited at
the University of Alberta and comprised 14 men and 5
women (mean age 43.4 years [range 33–65 years], mean
disease duration 18.7 years [range 9–42 years]). Nine patients in cohort B were recruited at the University of Toronto and comprised 8 men and 1 woman (mean age 40.2
years, mean disease duration 16.1 years). The mean
BASDAI score for the entire group of 29 patients was 6.1.
Pre- and posttreatment MRI films from the 18 patients that
were recruited to the infliximab trial had been scored 18
months prior to the current exercise by 1 (SSD) of the 2
readers (4).
Cohort A underwent MRI at a single time point whereas
cohort B underwent MRI at baseline and either 12 weeks
(adalimumab trial) or 24 weeks (infliximab trial) after randomization. We also included 6 controls with nonspecific
back pain who underwent MRI at a single time point. The
study was approved by the ethics committees of the University of Alberta and the University Health Network (Toronto).
Magnetic resonance imaging. MRI of the spine was performed with 1.5T Siemens (Munich, Germany) or GE systems (Waukesha, WI) using appropriate surface coils.
Sagittal sequences were obtained with 3–4-mm slice thickness and 11–15 slices acquired. Sequence parameters were
as follows: T1-weighted spin echo (repetition time [TR]
517–618 msec, echo time [TE] 13 msec) and STIR (TR
2,720–3,170 msec, inversion time 140 msec, TE 38–61
msec). The spine was imaged in 2 parts: upper half comprising the entire cervical and most of the thoracic spine,
lower half comprising the lower portion of the thoracic
spine and entire lumbar spine. The specific MRI parameters for acquiring spine images are provided on our Web
site (available at: www.arthritisdoctor.ca).
Scoring of MRI lesions. Scoring of MRI lesions has been
described previously (4). Briefly, our scoring method for
active inflammatory lesions in the spine relies on the use
of the STIR sequence that suppresses the normal marrow
fat signal, the presence of which frequently obscures signal
emanating from bone marrow edema associated with inflammation. T1-weighted spin-echo images were included
for anatomic reference only and were not scored. For each
DVU, 3 consecutive sagittal slices were scored, which
allowed evaluation of the coronal extent of lesions as well
as assessment in the sagittal and anteroposterior planes.
Discal lesions were not scored.
Definition of abnormal STIR signal. Bone marrow signal
in the center of the vertebra or an adjacent normal vertebra
constituted the reference for designation of normal signal.
A set of reference AS cases were included to facilitate
designation of abnormal signal on STIR.
Scoring of depth and intensity. Signal from cerebrospinal fluid constituted the reference for designating an inflammatory lesion as intense. A lesion was graded as deep
if there was a homogeneous and unequivocal increase in
signal over at least 1 cm from the vertebral end plate.
Assessment of depth was made possible by including a
scale on the image.
Scoring method. Each DVU was divided into 4 quadrants: upper anterior, upper posterior, lower anterior, and
lower posterior. The presence of increased T2 signal in
each of these 4 quadrants was scored on a dichotomous
basis (1 = increased signal, 0 = normal signal). This was
repeated for each of 3 consecutive sagittal slices giving a
maximum score of 12 per DVU. On each slice, the presence
of a lesion exhibiting intense signal in any quadrant was
given an additional score of 1. Similarly, the presence of a
lesion exhibiting a depth ≥1 cm in any quadrant was given
an additional score of 1, resulting in a maximum additional score of 6 for that level and bringing the total score
to 18 per DVU.
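As a rough illustration of the arithmetic described above, the per-DVU score could be computed as in the following Python sketch. The function names and the input layout (3 slices, each holding 4 quadrant flags plus an intensity flag and a depth flag) are assumptions made for this example and are not part of the published method.

```python
from typing import Sequence, Tuple

# One slice = (four quadrant flags, intense-lesion flag, deep-lesion flag).
# This layout is a hypothetical convenience for the sketch.
Slice = Tuple[Sequence[bool], bool, bool]


def score_slice(quadrants: Sequence[bool], intense: bool, deep: bool) -> int:
    """Score one sagittal slice of a DVU (0 to 6)."""
    if len(quadrants) != 4:
        raise ValueError("a DVU slice has exactly 4 quadrants")
    base = sum(1 for q in quadrants if q)  # 1 point per quadrant with increased signal
    if base == 0 and (intense or deep):
        raise ValueError("intensity/depth points require a lesion in some quadrant")
    # At most 1 extra point for intensity and 1 for depth per slice.
    return base + int(intense) + int(deep)


def score_dvu(slices: Sequence[Slice]) -> int:
    """Sum the scores of 3 consecutive sagittal slices (0 to 18)."""
    if len(slices) != 3:
        raise ValueError("each DVU is scored on exactly 3 consecutive slices")
    return sum(score_slice(q, i, d) for q, i, d in slices)


# Example: every quadrant abnormal, intense and deep on all 3 slices.
worst_case = [([True, True, True, True], True, True)] * 3
max_score = score_dvu(worst_case)
```

With every quadrant abnormal and both additional points awarded on each slice, the sketch reproduces the stated maximum of 18 per DVU.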
MRI reading exercises. A unique MRI study number
was allocated to each patient and control, thereby ensuring
blinding to all patient demographics. Allocation was done
by a technologist unconnected with the study. Assessment
was performed on a 3-monitor review station by 2 readers
using computer software that has been optimized for this
type of review (Merge efilm, Milwaukee, WI). Each patient
was only identified by the MRI study number and films
were read in random order. Pre- and posttreatment images
were scored concurrently with the reader blinded to time
sequence. No instructions were provided as to how the
reader should select the most severely affected DVU for the
6-, 8-, and 10-DVU scores and scoring was done from C2 to
L5 in all cases. The 3-monitor review station readily permits simultaneous visualization of all segments of the
spine on pre- and posttreatment images. Readers, trained
in use of the SPARCC system, were instructed to identify
the 6, 8, and 10 worst levels based on the scans from both
time points and no other specific guidance was necessary
as to how the reader should select the most severely affected DVU. The same DVU were scored on pre- and posttreatment images in the assessment of the 6, 8, and 10 most
severely affected DVU.
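The limited scores described above can be sketched in Python as follows. In the study the readers chose the worst levels by visual judgment and no selection rule was prescribed, so ranking levels by the combined pre- and posttreatment score is purely an assumption of this example, as are the function names and the sample data.

```python
from typing import List, Sequence


def select_worst_dvu(pre: Sequence[int], post: Sequence[int], k: int) -> List[int]:
    """Pick the k levels with the highest combined score across both time points.

    Assumption: ties are broken in favour of the more cranial level (lower index).
    """
    combined = [a + b for a, b in zip(pre, post)]
    order = sorted(range(len(combined)), key=lambda i: (-combined[i], i))
    return sorted(order[:k])


def limited_score(scores: Sequence[int], levels: Sequence[int]) -> int:
    """Sum the DVU scores over the selected levels only."""
    return sum(scores[i] for i in levels)


# Hypothetical per-DVU scores for 23 levels (C2 to L5), before and after treatment.
pre = [0] * 23
post = [0] * 23
for idx, val in {5: 10, 6: 8, 10: 3, 11: 2, 12: 2, 20: 6, 21: 1, 22: 1}.items():
    pre[idx] = val
for idx, val in {2: 3, 5: 4, 20: 5}.items():
    post[idx] = val

levels = select_worst_dvu(pre, post, k=6)
pre_6dvu = limited_score(pre, levels)    # same levels are used at both time points
post_6dvu = limited_score(post, levels)
```

Using one set of levels for both images mirrors the requirement that the same DVU be scored before and after treatment.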
Statistical analysis. Descriptive statistics (mean, median, interquartile range, standard deviation, maximum
and minimum values) were used to describe the overall
distribution of scores. Distribution of affected levels and
DVU scores for the entire spine and according to spinal
segment was based on the mean scores of the 2 readers.
The interobserver reproducibility of status and change
scores was calculated using analysis of variance to provide
an intraclass correlation coefficient (ICC). A two-way
mixed effects model with observer as a fixed factor was
used. Values >0.6 represented good reproducibility, >0.8
represented very good reproducibility, and >0.9 represented excellent reproducibility. Reproducibility was also
examined using Bland-Altman plots and 95% limits of
agreement. Construct validity was assessed by comparing
changes in the index score with changes in disease activity
as quantified by the BASDAI (7), nocturnal back pain, and
C-reactive protein (CRP) levels. This was done using
Spearman’s correlation coefficient analysis. Two statistical
methods were used to assess responsiveness: the effect
size and the standardized response mean. Values of 0.20,
0.50, and ≥0.80 were considered to represent small, moderate, and large degrees of responsiveness, respectively.
Discrimination was not assessed because the open-label
phase of the clinical trial is still ongoing and treatment
codes remain unbroken at this time.
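For readers who wish to reproduce this type of reliability analysis, the following Python sketch computes an ICC from the two-way analysis of variance mean squares, together with Bland-Altman 95% limits of agreement. The text does not state whether a consistency or absolute-agreement form, or single or average measures, was used; the single-measure consistency form, often written ICC(3,1), is assumed here, and the data are invented.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def icc_3_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(3,1): two-way mixed model, consistency, single measures.

    ratings[i][j] is the score given to patient i by reader j.
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between readers
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


def limits_of_agreement(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Bland-Altman 95% limits: mean difference +/- 1.96 SD of the differences."""
    diffs: List[float] = [x - y for x, y in zip(a, b)]
    d, s = mean(diffs), stdev(diffs)
    return d - 1.96 * s, d + 1.96 * s


# Invented example: reader 2 always scores 2 points higher than reader 1.
scores = [[1, 3], [4, 6], [10, 12], [20, 22]]
icc = icc_3_1(scores)
low, high = limits_of_agreement([10, 20, 30], [12, 18, 33])
```

A constant offset between readers leaves the consistency ICC at 1, which is one reason the Bland-Altman limits are a useful complement to the ICC.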
RESULTS
Descriptive data. The mean number of affected levels
for the entire spine was 6.9 (median 6.0) and the majority
of affected levels were in the thoracic spine (mean 4.2,
median 4.0) (Table 1). The highest DVU scores were also
recorded in the thoracic spine and 65.2% of all affected
levels were located in this region. Only 15.0% and 19.8%
of affected levels were located in the cervical and lumbar
spines, respectively. The percentages of patients that were
assessed by both observers as having no affected level in
the cervical, thoracic, and lumbar spines were 33.8% (23
of 68), 13.2% (9 of 68), and 26.5% (18 of 68), respectively.
Percentages of patients assessed as having no affected level
by at least 1 observer in the cervical, thoracic, and lumbar
spines were 57.4% (39 of 68), 29.4% (20 of 68), and 50%
(34 of 68), respectively.
Table 1. Descriptive statistics for numbers of affected DVU and DVU scores per patient according to region of spine examined and by the number of affected DVU scored in 68 patients with ankylosing spondylitis*

Parameter               Mean ± SD      Median (IQR)   Range
No. of affected DVU
  Total                 6.9 ± 5.5      6.0 (2–11)     0–19
  Cervical spine        1.1 ± 1.3      1.0 (0–2)      0–5
  Thoracic spine        4.2 ± 3.7      4.0 (1–7)      0–12
  Lumbar spine          1.5 ± 1.6      1.0 (0–3)      0–5
DVU score
  Total (23 DVU)        29.3 ± 32.5    20 (3–43)      0–175
  Cervical spine        4.4 ± 7.7      1 (0–6)        0–57
  Thoracic spine        19.1 ± 21.9    11 (1–32)      0–94
  Lumbar spine          5.8 ± 8.4      2 (0–8)        0–41
  10-DVU score          26.6 ± 27.9    19 (3–41)      0–135
  8-DVU score           24.6 ± 24.8    18 (3–37)      0–112
  6-DVU score           21.6 ± 20.4    18 (3–32)      0–86
* DVU = discovertebral unit; IQR = interquartile range.

Median scores for the 6-, 8-, 10-, and 23-DVU scores
were similar. Approximately half of the patients (51.5%)
had ≤6 affected levels and the percentages of patients that
were assessed as having more than 6, 8, and 10 affected
levels were 48.5% (33 of 68), 41.2% (28 of 68), and 26.5%
(18 of 68), respectively. Of the 473 affected levels, 292
(61.7%) levels were scored when analysis was limited to
only the 6 most severely affected DVU per patient. When
scoring was limited to only the 8 and 10 most severely
affected levels, the number of analyzed DVU increased to
352 (74.4%) and 409 (86.5%), respectively (Figure 1). The
sum total DVU score for all 68 patients with AS was 1,992.
Analysis that was limited to only 6 of the most severely
affected levels captured 73.7% of the total DVU score,
whereas analysis that was limited to 8 and 10 levels captured 84% and 90.8% of the total DVU score, respectively.
Mean scores for controls were 4.4, 4.5, 4.5, and 4.5 for the
6-DVU, 8-DVU, 10-DVU, and 23-DVU scores, respectively
(data not shown).

Figure 1. Percentages of all affected discovertebral units (DVU;
n = 473) and total DVU score (n = 1,992) recorded in 68 patients
with ankylosing spondylitis when scoring was limited to a maximum of 6, 8, or 10 of the most severely affected DVU per patient.
Shaded bar = affected DVU; solid bar = DVU score.

Reliability. The mean percentage agreement for selection of the 6 most severely affected DVU was 67.6% (range
33.4–100%). Interobserver reliability for status scores was
good to very good for detection of affected levels and
excellent for scoring of affected levels in the thoracic and
lumbar spines (Table 2). Reliability was only moderate for
scoring affected levels in the cervical spine. Reliability of
both status and change scores was excellent regardless of
whether all or only a limited number of levels were analyzed. Bland-Altman plots showed that measurement differences between the 2 observers were evident across the
entire range of scores (Figure 2). This was similarly noted
for the 6-, 8-, and 10-DVU scores (data not shown).

Table 2. Interobserver reliability of status scores in 68 patients with ankylosing spondylitis and change scores in 29 patients who received anti–tumor necrosis factor therapy*

Parameter               Interobserver ICC     Interobserver ICC
                        status (n = 68)       change (n = 29)
Affected DVU
  Total                 0.89                  0.81
  Cervical spine        0.77                  0.54
  Thoracic spine        0.83                  0.82
  Lumbar spine          0.78                  0.63
DVU score
  Total (23 DVU)        0.93                  0.91
  Cervical spine        0.70                  0.62
  Thoracic spine        0.94                  0.92
  Lumbar spine          0.90                  0.89
  10-DVU score          0.95                  0.93
  8-DVU score           0.95                  0.93
  6-DVU score           0.95                  0.92
* ICC = intraclass correlation coefficient; DVU = discovertebral unit.
Construct validity. Significant and similar correlations
were noted between changes in 6-, 8-, 10-, and 23-DVU
scores and changes in CRP level in the 29 patients who
received anti-TNF therapies (Table 3). No significant correlations were observed between changes in either nocturnal pain or BASDAI score and any DVU score.
Figure 2. Bland-Altman plot illustrating the difference in 23-discovertebral unit scores between 2 observers (y-axis) in relation
to the mean scores (x-axis). Horizontal lines represent the 95%
limits of agreement.

Table 3. Spearman’s correlations between changes in clinical parameters and changes in Spondyloarthritis Research Consortium of Canada magnetic resonance imaging spinal DVU scores for 6, 8, 10, and all 23 spinal DVU in 29 patients with ankylosing spondylitis following treatment with anti–tumor necrosis factor therapy*

                        Δ23 DVU    Δ10 DVU    Δ8 DVU     Δ6 DVU
Δ Nocturnal pain        0.26       0.26       0.27       0.26
Δ BASDAI                0.36       0.36       0.33       0.34
Δ CRP level             0.68†      0.66†      0.66†      0.65†
* DVU = discovertebral unit; BASDAI = Bath Ankylosing Spondylitis Disease Activity Index; CRP = C-reactive protein.
† P < 0.0001.

Responsiveness. Analysis of changes in response to
anti-TNF therapy demonstrated that this was most readily
apparent in the thoracic spine (Table 4). Responsiveness
was minimal following assessment of the cervical spine. A
more limited scoring system was more responsive than
assessment of all 23 levels. Moreover, a scoring system that
was limited to a maximum of 6 most severely affected
levels demonstrated the greatest degree of responsiveness.
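The two responsiveness statistics can be sketched in Python as shown below. The formulas are not spelled out in the text, so the conventional definitions are assumed: effect size as the mean change divided by the SD of the pretreatment scores, and standardized response mean as the mean change divided by the SD of the change scores. The per-patient data are invented; only the summary values passed to `effect_size_from_summary` are taken from Table 4.

```python
from statistics import mean, stdev
from typing import Sequence


def effect_size(pre: Sequence[float], post: Sequence[float]) -> float:
    """Mean change divided by the SD of the pretreatment scores (assumed definition)."""
    changes = [a - b for a, b in zip(pre, post)]
    return mean(changes) / stdev(pre)


def standardized_response_mean(pre: Sequence[float], post: Sequence[float]) -> float:
    """Mean change divided by the SD of the change scores (assumed definition)."""
    changes = [a - b for a, b in zip(pre, post)]
    return mean(changes) / stdev(changes)


def effect_size_from_summary(mean_pre: float, mean_post: float, sd_pre: float) -> float:
    """Effect size from published group means and the pretreatment SD."""
    return (mean_pre - mean_post) / sd_pre


# Invented per-patient scores, for illustration only.
pre = [10, 20, 30, 40]
post = [5, 10, 20, 20]
es = effect_size(pre, post)
srm = standardized_response_mean(pre, post)

# Summary values from Table 4 (pretreatment mean, posttreatment mean, pretreatment SD).
es_23dvu = effect_size_from_summary(41.1, 20.9, 37.8)
es_6dvu = effect_size_from_summary(28.6, 13.4, 21.3)
```

Under these assumed definitions the Table 4 summary values give effect sizes of about 0.53 for the 23-DVU score and 0.71 for the 6-DVU score, in line with the tabulated figures. The standardized response mean cannot be recovered from the table because the SD of the change scores is not reported.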
DISCUSSION
Our analyses of the SPARCC scoring method for the assessment of spinal inflammation by MRI demonstrated
that limiting scoring to only the 6 most severely affected
levels captures 62% of all affected DVU and 74% of the
total DVU score. Furthermore, interobserver reliability was
excellent regardless of whether analysis was limited to
only the most severely affected levels or included the
entire spine whereas responsiveness was optimal when
scoring was limited to only the 6 most severely affected
levels. These observations, together with improved feasibility, support the notion that during assessment of the
entire spine in patients with AS, scoring all affected DVU
is unnecessary; limiting scoring in this way may therefore facilitate acceptance of
this approach for clinical research and in clinical trials.
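The coverage percentages quoted above can be checked with a few lines of Python. The counts of affected levels come from the Results; the share of the total DVU score is recomputed here from the rounded mean scores in Table 1 multiplied by 68 patients, which is an approximation adopted for this sketch and not the authors' stated calculation.

```python
def percent(part: float, whole: float) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


AFFECTED_TOTAL = 473    # affected DVU across all 68 patients
SCORE_TOTAL = 1992      # sum of DVU scores across all 68 patients
N_PATIENTS = 68

levels_captured = {6: 292, 8: 352, 10: 409}      # affected DVU scored per method
mean_scores = {6: 21.6, 8: 24.6, 10: 26.6}       # mean k-DVU score per patient (Table 1)

level_coverage = {k: percent(v, AFFECTED_TOTAL) for k, v in levels_captured.items()}
score_coverage = {k: percent(m * N_PATIENTS, SCORE_TOTAL) for k, m in mean_scores.items()}
```

Both sets of figures agree with the reported values to one decimal place, which suggests that the table and the text are internally consistent.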
These findings are not entirely surprising. Scoring of the
entire spine, as opposed to only the more severely affected
DVU, will include more subtle lesions that may be less
responsive to change and more difficult to assess. If readers are permitted to select levels for scoring, some error in
reading due to the presence of signal artifact may be eliminated because the reader has the choice of not selecting
those levels that are clearly subject to phase-encoding,
partial-volume, or other artifacts. In addition, reliability of
assessment is not as good in the cervical spine and responsiveness to change in this region is poor. This finding
likely reflects both a relative lack of involvement and the
large field of view that is required to image the entire spine
in 2 halves. In the lumbar spine, reader reliability in selection of affected DVU and reliability of change scores
were also only moderate. The majority of affected levels
and the greatest contribution to the total DVU score came
from the thoracic spine. Accordingly, interreader reliability for status and change scores was maximal in this spinal
segment. It is premature to conclude, however, that scoring should be confined to the thoracic spine because the
distribution of spinal inflammatory lesions may vary according to disease duration and other demographic variables such as sex. Although stratification of our data according to disease duration and sex did not significantly
influence the distribution of affected DVU and DVU scores
in our cohorts (data not shown), this issue will require
further study in larger data sets. This scoring method cannot be recommended for diagnostic evaluation at this time.
Its primary purpose is to record change in inflammatory
lesions for clinical and therapeutic trials research and no
method for scoring MRI scans in clinical practice has yet
shown consistent results.
Table 4. Changes in the number of affected DVU and DVU scores in 29 patients with ankylosing spondylitis following treatment with anti–tumor necrosis factor therapy*

                        Mean ± SD score                   Effect    Standardized
Parameter               Pretreatment     Posttreatment    size      response mean
Affected DVU
  Total                 8.8 ± 5.6        6.0 ± 4.4        0.50      0.66
  Cervical spine        1.5 ± 1.5        1.3 ± 1.6        0.13      0.16
  Thoracic spine        5.5 ± 3.6        3.5 ± 3.1        0.55      0.67
  Lumbar spine          1.8 ± 1.6        1.3 ± 1.4        0.35      0.49
DVU score
  Total (23 DVU)        41.1 ± 37.8      20.9 ± 27.6      0.53      0.80
  Cervical spine        6.8 ± 10.2       4.8 ± 10.9       0.19      0.30
  Thoracic spine        27.3 ± 24.7      12.0 ± 15.9      0.62      0.82
  Lumbar spine          7.1 ± 9.3        3.9 ± 6.0        0.34      0.43
  6-DVU score           28.6 ± 21.3      13.4 ± 15.3      0.71      0.86
  8-DVU score           33.4 ± 26.7      16.5 ± 19.3      0.64      0.84
  10-DVU score          36.6 ± 30.8      18.3 ± 22.6      0.60      0.86
* DVU = discovertebral unit.

A potential source of bias, which may primarily affect
the reliability of the 23-DVU score, is introduced if the
reader selects the 6 most severely affected DVU before the
remaining DVU are scored. This may potentially reduce
the variability in identification and scoring of the remaining less severely affected DVU, leading to higher ICC values for the 23-DVU score. In fact, readers were not provided with any instructions as to when the most severely
affected DVU should be selected and they may, for instance, have scored all 23 DVU first and then chosen the 6
worst DVU. Alternatively, the reader may have made the
selection first but could still have chosen to change the
selection of the most severely affected DVU after scoring
the entire spine. The impact of this study design on the
reliability of the 23-DVU score is therefore not readily
apparent. In contrast, the feasibility of a study design in
which readers are asked to score 6, 8, 10, and 23 DVU in
independent reads with the increasing likelihood of recall,
particularly for severely affected DVU, is an open question.
The selection of the most severely affected DVU when
assessing pre- and posttreatment images is based on a
simultaneous assessment of these images using a 3-monitor review station. This readily permits simultaneous assessment of all spinal segments at both time points. Although limiting the selection to only the most severely
affected DVU potentially adds to the measurement error,
our data demonstrate that reliability of change scores is no
different whether all 23 DVU or only the most severely
affected DVU are scored. We consider it very important
that viewing conditions are organized in a manner that
readily permits simultaneous visualization of both pre- and posttreatment images.
One other scoring method for assessment of spinal inflammation by MRI has been published (3). This approach
is also based on the assessment of a spinal DVU and scores
bone edema and erosion in a single dimension from a
sagittal image according to the proportion of the anteroposterior length of the DVU involved. Scores are weighted
towards the presence of erosion and range from 0 to 6 per
DVU. This approach uses both T2-weighted and gadolinium-enhanced MRI sequences. This index was shown to be
reliable and responsive to change in patients receiving
anti-TNFα therapy. Recently, the scoring approach has
been modified to include the evaluation of edema only and
the range of scores per DVU has accordingly been reduced
to 0–3 (8). There has been no further work to determine
whether a more focused approach to scoring the most
severely affected levels might perform equally well compared with scoring the entire spine. Systematic examination of spinal lesions by MRI using this latter scoring
method concurred with our observations that the majority
of affected levels were located in the thoracic spine, although that examination revealed somewhat more lesions
in the cervical spine than in the present study (8). Involvement of cervical DVU was evident in 16–26% of patients,
although the number of affected cervical DVU per patient
was not provided. There were no obvious differences in
disease duration or severity that might account for these
differences with our observations. Our analyses were
based entirely on the assessment of STIR MRI sequences
and it is recognized that gadolinium-enhanced MRI may
reveal distinct lesions that score differently, although the
likelihood of this affecting the total score for a patient is
low (9,10). Both approaches to scoring omit lesions in the
posterior segment of the spine, including the facet joints,
processes, and interspinous ligaments, which have not yet
been systematically evaluated by MRI. Whether inclusion
of these regions will improve the metrologic properties of
MRI-based scoring systems requires further study.
Assessment of construct validity demonstrated that
changes in spinal inflammation MRI scores primarily paralleled changes in CRP level regardless of the scoring
method used in our study. The lack of correlation with the
BASDAI may reflect the fact that the latter instrument is a
self-reported measure of patient symptoms such as pain,
stiffness, and fatigue and is therefore not necessarily specific for AS, but may equally reflect the symptomatology of
nonspecific causes of back pain. Additional sources of
back pain other than inflammation are possible in patients
with AS with long disease duration who may either develop secondary structural damage and/or concomitant
spinal disorders unrelated to AS.
Two reports have now shown that anti-TNF therapy for
>2 years reduces MRI scores for disease activity in the
spine as recorded by the Ankylosing Spondylitis Spinal
MRI score, although there is persisting disease that
amounts to 25–30% of the baseline score (11,12). This
could potentially raise concerns that a scoring system limited to only the most severely affected DVU might not
capture residual disease, limiting its ability to record more
effective treatment strategies. However, our data show that
posttreatment scores are 46.9%, 49.4%, and 50% of pretreatment 6-, 8-, and 10-DVU scores, respectively, and are
no different from an analysis of the entire spine (23-DVU
score), which shows a posttreatment score that is 50.9% of
the pretreatment score, allowing ample opportunity for
assessment of more effective treatment strategies.
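The residual percentages cited in this paragraph follow directly from the group means in Table 4, as the short Python sketch below shows. The dictionary layout and function name are illustrative choices only.

```python
def residual_percent(pre_mean: float, post_mean: float) -> float:
    """Posttreatment mean as a percentage of the pretreatment mean."""
    return round(100 * post_mean / pre_mean, 1)


# (pretreatment mean, posttreatment mean) for each scoring method, from Table 4.
table4_means = {
    "6-DVU": (28.6, 13.4),
    "8-DVU": (33.4, 16.5),
    "10-DVU": (36.6, 18.3),
    "23-DVU": (41.1, 20.9),
}

residuals = {name: residual_percent(pre, post) for name, (pre, post) in table4_means.items()}
```

The four values lie within about 4 percentage points of one another, which is the basis for the statement that the limited scores leave as much room to detect residual disease as the 23-DVU score.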
In conclusion, the SPARCC MRI spinal inflammation
index requires assessment of the entire spine but performs
better with respect to responsiveness when analysis is
limited to a maximum of 6 most severely affected levels as
compared with results derived from scoring the entire
spine. Interreader reliability is excellent for both status
and change scores with either scoring approach. The use of
the 6-DVU scoring method should improve the feasibility
of this tool in clinical trials and research.
AUTHOR CONTRIBUTIONS
Dr. Maksymowych had full access to all of the data in the study
and takes responsibility for the integrity of the data and the
accuracy of the data analysis.
Study design. Maksymowych, Dhillon, Lambert.
Acquisition of data. Maksymowych, Dhillon, Park, Salonen, Inman, Lambert.
Analysis and interpretation of data. Maksymowych, Inman, Lambert.
Manuscript preparation. Maksymowych, Inman, Lambert.
Statistical analysis. Maksymowych.
REFERENCES
1. Battafarano DF, West SG, Rak KM, Fortenbery EJ, Chantelois AE. Comparison of bone scan, computed tomography, and magnetic resonance imaging in the diagnosis of active sacroiliitis. Semin Arthritis Rheum 1993;23:161–76.
2. Hermann KG, Bollow M. Magnetic resonance imaging of the axial skeleton in rheumatoid disease [review]. Best Pract Res Clin Rheumatol 2004;18:881–907.
3. Braun J, Baraliakos X, Golder W, Brandt J, Rudwaleit M, Listing J, et al. Magnetic resonance imaging examinations of the spine in patients with ankylosing spondylitis, before and after successful therapy with infliximab: evaluation of a new scoring system. Arthritis Rheum 2003;48:1126–36.
4. Maksymowych WP, Inman RD, Salonen D, Dhillon SS, Krishnananthan R, Stone M, et al. Spondyloarthritis Research Consortium of Canada magnetic resonance imaging index for assessment of spinal inflammation in ankylosing spondylitis. Arthritis Rheum 2005;53:502–9.
5. Van der Heijde DM, Landewe RB, Hermann KG, Jurik AG, Maksymowych WP, Rudwaleit M, et al. Application of the OMERACT filter to scoring methods for magnetic resonance imaging of the sacroiliac joints and the spine: recommendations for a research agenda at OMERACT 7. J Rheumatol 2005;32:2042–7.
6. Van der Linden S, Valkenburg HA, Cats A. Evaluation of diagnostic criteria for ankylosing spondylitis: a proposal for modification of the New York criteria. Arthritis Rheum 1984;27:361–8.
7. Garrett S, Jenkinson T, Kennedy LG, Whitelock H, Gaisford P, Calin A. A new approach to defining disease status in ankylosing spondylitis: the Bath Ankylosing Spondylitis Disease Activity Index. J Rheumatol 1994;21:2286–91.
8. Baraliakos X, Rudwaleit M, Listing J, Hermann KG, Brandt J, Sieper J, et al. Magnetic resonance imaging in ankylosing spondylitis: a detailed analysis [abstract]. Ann Rheum Dis 2005;64 Suppl 3:324.
9. Hermann KG, Landewe RB, Braun J, van der Heijde DM. Magnetic resonance imaging of inflammatory lesions in the spine in ankylosing spondylitis clinical trials: is paramagnetic contrast medium necessary? J Rheumatol 2005;32:2056–60.
10. Baraliakos X, Hermann KG, Landewe R, Listing J, Golder W, Brandt J, et al. Assessment of acute spinal inflammation in patients with ankylosing spondylitis by magnetic resonance imaging: a comparison between contrast enhanced T1 and short tau inversion recovery (STIR) sequences. Ann Rheum Dis 2005;64:1141–4.
11. Sieper J, Baraliakos X, Listing J, Brandt J, Haibel H, Rudwaleit M, et al. Persistent reduction of spinal inflammation as assessed by magnetic resonance imaging in patients with ankylosing spondylitis after 2 yrs of treatment with the anti-tumour necrosis factor agent infliximab. Rheumatology (Oxford) 2005;44:1525–30.
12. Baraliakos X, Brandt J, Listing J, Haibel H, Sorensen H, Rudwaleit M, et al. Outcome of patients with active ankylosing spondylitis after two years of therapy with etanercept: clinical and magnetic resonance imaging data. Arthritis Rheum 2005;53:856–63.