Gastrointestinal Imaging • Original Research
Abramson et al.
CT of Breast Cancer Liver Metastases
Downloaded from www.ajronline.org by Vanderbilt Univ on 10/25/17 from IP address 129.59.95.115. Copyright ARRS. For personal use only; all rights reserved
Gastrointestinal Imaging
Original Research
The Attenuation Distribution
Across the Long Axis of Breast
Cancer Liver Metastases at CT:
A Quantitative Biomarker for
Predicting Overall Survival
Richard G. Abramson1
Nikita Lakomkin2
Allison Hainline 3
Hakmook Kang 3,4
M. Shane Hutson5,6
Carlos L. Arteaga7
Abramson RG, Lakomkin N, Hainline A, Kang H, Hutson MS, Arteaga CL
Keywords: biomarker, response assessment, texture
analysis
doi.org/10.2214/AJR.17.18249
Received March 17, 2017; accepted after revision
June 2017.
R.G. Abramson received support from the NIH (grants
2U01CA142565, 5P50CA093131, and 5P30CA068485) and
from the Association of University Radiologists-GE
Radiological Research Fellowship award. A. Hainline and
H. Kang received support from the NIH (grant
2U01CA142565). C.L. Arteaga received support from the
NIH (grants 5P50CA093131 and 5P30CA068485).
Based on a presentation at the Radiological Society of
North America 2015 annual meeting, Chicago, IL.
1 Department of Radiology and Radiological Science, Vanderbilt University School of Medicine, 1161 21st Ave S, CCC-1121 MCN, Nashville, TN 37232-2675. Address correspondence to R. G. Abramson (richard.abramson@vanderbilt.edu).
2 Icahn School of Medicine at Mount Sinai, New York, NY.
3 Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN.
4 Center of Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN.
5 Department of Physics and Astronomy, Vanderbilt University, Nashville, TN.
6 Department of Biological Sciences, Vanderbilt University, Nashville, TN.
7 Vanderbilt-Ingram Cancer Center, Nashville, TN.
WEB
This is a web exclusive article.
AJR 2018; 210:W1–W7
0361–803X/18/2101–W1
© American Roentgen Ray Society
OBJECTIVE. The objective of our study was to compare attenuation distribution across the long axis (ADLA) measurements, Response Evaluation Criteria in Solid Tumors (RECIST) version 1.1, and Choi criteria for predicting overall survival (OS) in patients with metastatic breast cancer treated with bevacizumab.
MATERIALS AND METHODS. We obtained HIPAA-compliant data from a prospective, multisite, phase 3 trial of bevacizumab for the treatment of metastatic breast cancer. For
patients with one or more liver metastases measuring 15 mm or larger at baseline, we evaluated up to two target liver lesions using RECIST, Choi criteria, and ADLA measurements, with
the latter defined as the SD of the CT attenuation values of each pixel along the tumor long-axis diameter. The optimal percentage change threshold for defining an ADLA response was
computed by cross-validation analysis in a Cox model. The log-rank test was applied to evaluate RECIST, Choi criteria, and ADLA for discriminating patients with superior OS. The predictive accuracies of all three techniques were compared using Brier scores and areas under
the ROC curve (AUC). All analyses were performed separately using best overall response
(BOR) and response at the first follow-up time point (FU1).
RESULTS. One hundred sixty-four patients met the inclusion criteria. A 25% decrease in
the ADLA measurement from baseline was the optimal ADLA response threshold for BOR
and FU1. RECIST, Choi criteria, and ADLA successfully identified patients with superior
OS when using BOR (RECIST, p = 0.02; Choi and ADLA, p < 0.001), but only Choi criteria and ADLA measurements were successful when using FU1 (RECIST, p = 0.43; Choi and
ADLA, p < 0.001). In a direct comparison, ADLA measurements outperformed both RECIST
and Choi criteria using BOR (95% CI for Brier score differences, ADLA-RECIST [-0.58
to -0.08] and ADLA-Choi [-0.55 to -0.06]; 95% CI for AUC differences, ADLA-RECIST
[0.16–0.33] and ADLA-Choi [0.17–0.36]) as well as using FU1 (95% CI for Brier score differences, ADLA-RECIST [-0.77 to -0.08] and ADLA-Choi [-0.58 to -0.03]; 95% CI for AUC
differences, ADLA-RECIST [0.22–0.39] and ADLA-Choi [0.01–0.22]).
CONCLUSION. ADLA measurements may be a useful noninvasive indicator of cancer
treatment response. Because ADLA measurements may be extracted relatively easily using
existing radiologist workflows, further investigation of the ADLA technique is warranted.
In recent years, several novel imaging-based techniques have been proposed for assessing and predicting tumor response to anticancer therapy. Most of these techniques have been suggested as adjuncts or alternatives to traditional lesion size-based response assessment methods, particularly the Response Evaluation Criteria in Solid Tumors (RECIST) [1]. Most new proposals seek to assess response earlier or more accurately than RECIST, with the goal of improving the evaluation of new drug treatments as well as the selection and optimization of therapy for patients in the clinic.
Novel response assessment proposals span multiple modalities and range from simple data extraction and processing [2] to complex computational modeling techniques [3, 4]. Although advanced quantitative techniques have shown promise for investigating tumor biology and facilitating preclinical drug development, these approaches are methodologically complex and tend to require significant time and resources, thus presenting a challenge to broad translation beyond dedicated imaging core laboratories. By contrast, simpler approaches (such as Choi criteria [5] and morphology, attenuation, size, and structure [MASS] criteria [6] for CT, and signal enhancement ratio methods for dynamic contrast-enhanced MRI [DCE-MRI] [7]) generate quantitative biomarkers from relatively easily extracted data elements and are presumably more easily incorporated into workflows. These simpler techniques may have greater potential for incorporation into multisite clinical trials and eventual broad clinical use.
In a previous effort, we introduced the attenuation distribution across the long axis
(ADLA) as a simple and easily extractable CT
biomarker for assessing treatment response in
solid malignancies [8]. We define ADLA as the
SD of the CT attenuation values of all the pixels across the long-axis diameter of a tumor.
The ADLA measurement is thus a measure
of intralesional heterogeneity that can be easily obtained as part of the typical radiologist's
workflow; indeed, the extraction of ADLA
data requires no further effort than that associated with extracting a RECIST measurement.
In our previous pilot study, using data from a
phase 2 trial of a tyrosine kinase inhibitor in
KRAS-positive metastatic colorectal cancer,
we showed that ADLA measurements were superior to RECIST for identifying patients with
better survival [8].
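As an illustration of the definition above, the following minimal sketch computes an ADLA value as the SD of the pixel values sampled along a line segment. Several details are assumptions rather than the authors' method: the function name `adla`, the nearest-neighbor sampling at roughly one-pixel steps, and the use of the sample SD (n - 1 denominator) are all hypothetical choices, and the sampling scheme of the image analysis software actually used is not described in this article.

```python
import math
import statistics


def adla(image, p0, p1):
    """Sample SD of pixel values along the segment p0 -> p1.

    image is a 2D list of attenuation values (rows of columns);
    p0 and p1 are (row, col) endpoints of the long-axis diameter.
    Sampling scheme and SD convention are assumptions (see lead-in).
    """
    (r0, c0), (r1, c1) = p0, p1
    # About one sample per pixel of length, endpoints included.
    n = int(math.ceil(math.hypot(r1 - r0, c1 - c0))) + 1
    if n < 2:
        raise ValueError("long axis must span at least two pixels")
    profile = []
    for k in range(n):
        t = k / (n - 1)
        r = round(r0 + t * (r1 - r0))  # nearest-neighbor row index
        c = round(c0 + t * (c1 - c0))  # nearest-neighbor column index
        profile.append(float(image[r][c]))
    return statistics.stdev(profile)


# A homogeneous lesion gives 0; a lesion with a gradient gives a larger value.
uniform = [[40] * 5 for _ in range(3)]
gradient = [[10 * c for c in range(5)] for _ in range(3)]
print(adla(uniform, (1, 0), (1, 4)))
print(adla(gradient, (1, 0), (1, 4)))
```

Because the only input beyond the image is the pair of long-axis endpoints, the same annotation that yields a RECIST diameter is sufficient to yield an ADLA value.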
The purpose of this study was to further
validate the ADLA technique by comparing
ADLA measurements with RECIST version
1.1 and Choi criteria for predicting overall
survival (OS) in a larger clinical trial. We performed this study using deidentified imaging
and clinical endpoint data from a prospective,
multisite, phase 3 trial of bevacizumab for the
treatment of metastatic breast cancer. In evaluating response criteria, we looked separately
at predictive performance using best overall
response (BOR) (i.e., the best response category achieved by a patient over the entire clinical trial) and response category at the first
follow-up time point (FU1). The latter was intended to evaluate the ability of each response
assessment method to predict survival early
after treatment initiation.
Materials and Methods
Study Population and Data Collection
Under a data-sharing agreement with Genentech (San Francisco, CA) and an institutional review board waiver from Vanderbilt University Medical Center, we obtained deidentified imaging and clinical data from the RIBBON-1 (Regimens in Bevacizumab for Breast Oncology) trial, which has been described in detail elsewhere [9]. In brief, the RIBBON-1 trial was an international, multicenter, phase 3, randomized, double-blind, placebo-controlled trial of chemotherapy with or without bevacizumab for first-line treatment of human epidermal growth factor receptor 2 (HER2)-negative metastatic breast cancer. Key inclusion criteria were age of 18 years or older, presence of locally recurrent or metastatic breast cancer with no prior chemotherapy treatment, Eastern Cooperative Oncology Group performance status of 0 or 1, and HER2-negative status.
Fig. 1. Attenuation distribution along long axis (ADLA) measurements on three example metastatic liver lesions in a 60-year-old woman (top row), 76-year-old woman (middle row), and 76-year-old woman (bottom row). ADLA measurements represent SD of CT attenuation values of each pixel along long-axis diameter. Higher ADLA values are associated with greater intralesional heterogeneity. CT images (left), histograms (middle), and ADLA measurements (right) are shown.
From December 2005 to August 2007, 1237 patients were enrolled at 232 sites in 22 countries and were evaluated in two cohorts: capecitabine-based chemotherapy with or without bevacizumab (median follow-up time, 15.6 months) and taxane- or anthracycline-based chemotherapy with or without bevacizumab (median follow-up time, 19.2 months). The original study showed that the addition of bevacizumab to chemotherapy was associated with improved median progression-free survival (PFS) in both cohorts (PFS: capecitabine cohort from 5.7 to 8.6 months, p < 0.001; taxane or anthracycline cohort from 8.0 to 9.2 months, p < 0.001), but no statistically significant differences in OS were observed in either cohort.
The current study was performed retrospectively using a subset of data from the RIBBON-1 trial.
We included imaging and clinical data for patients
meeting the following criteria: received bevacizumab, underwent CT (as opposed to another imaging modality) as the baseline imaging study, had
at least one liver metastasis measuring at least 15
mm at baseline (to minimize the effects of partial volume averaging through small lesions), and
had OS data available for clinical endpoint correlation. The dataset was assembled with images
from across the 232 sites in the trial; scanning acquisition parameters (e.g., hardware manufacturer, tube voltage, tube current-exposure time product, pitch, collimation) had been removed from the
DICOM headers before image transfer and were
not available for tabulation (see Discussion).
Fig. 2. Flowchart shows patient inclusion process for selecting study group.
Original clinical trial (n = 1237)
  Did not receive bevacizumab (n = 413)
  Received bevacizumab (n = 824)
    No liver metastasis measuring at least 15 mm at baseline (n = 659)
    At least one liver metastasis measuring at least 15 mm at baseline (n = 165)
      Survival data not available or corrupted (n = 1)
      Survival data available for analysis (n = 164)
For each included patient at every imaging time point while participating in the trial, up to two target liver lesions were evaluated using traditional RECIST (i.e., long-axis diameter), Choi criteria (long-axis diameter plus mean CT attenuation in an ROI drawn within the tumor), and ADLA measurements. For patients with more than two hepatic metastases, the two largest lesions were chosen for analysis. An ADLA measurement was obtained for each target lesion by calculating the SD of the CT attenuation values of each pixel along the long-axis diameter (Fig. 1). ADLA measurements were extracted using publicly available Java (Oracle)-based image processing and analysis software (ImageJ, version IJ 1.46r, National Institutes of Health) [10]. All measurements (RECIST, Choi, and ADLA) were obtained on contrast-enhanced CT scans in the portal venous phase. Data extraction was performed by a trained medical student under the supervision of a subspecialty abdominal imaging radiologist with 12 years of experience.
For examinations with two target lesions, composite measurements were generated for each time point. The composite measurement for RECIST was the sum of the long-axis diameters, as stipulated by the RECIST method [1]. The composite measurements for the Choi criteria were the sum of the long-axis diameters and the mean tumor density across both target lesions [5]. The composite measurement for ADLA was the weighted average of both ADLA measurements using the long-axis diameter of each target lesion as the weighting factor.
Response Thresholds
A treatment response by RECIST was defined as a decrease in the sum of the long-axis diameters of at least 30% from baseline, as stipulated by the RECIST method [1].
A treatment response by Choi criteria was defined as a decrease in tumor size of more than 10% or a decrease in tumor density of more than 15% from baseline [5].
A treatment response by ADLA was defined as a decrease in the weighted average ADLA measurement of a certain threshold percentage from baseline. We solved for the optimal ADLA response threshold using a cross-validation analysis, described in the next section.
Computation of Optimal ADLA Response Threshold
To find the optimal percentage change threshold for defining a response by ADLA measurements, we performed a 10-fold cross-validation analysis in which we computed Brier scores within a Cox proportional hazards model. When used in a Cox model, the Brier score [11, 12] compares a predicted survival status with the true survival status over all observed survival times and returns a numeric value. These values are 0 or greater, where 0 represents the best possible score (implying all correct predictions) and larger scores represent decreasing predictive ability. A lower Brier score thus characterizes superior predictive capacity.
Using a bootstrapping technique over 4000 replications, we generated Brier scores for ADLA measurements predicting OS using response thresholds ranging from -50% to 50% at intervals of 5%. At each bootstrap replication, the optimal threshold was recorded, where optimality was defined as the threshold that returned the lowest Brier score. We then plotted histograms showing the frequency of each possible threshold being selected as optimal. To account for the possibility of different optimal ADLA response thresholds when using BOR versus FU1, we computed optimal response thresholds separately for BOR and FU1.
Fig. 3. Calculation of optimal attenuation distribution along long axis (ADLA) response thresholds from 10-fold cross-validation analysis. A and B, Histograms show frequency of each ADLA response threshold being selected as optimal. Cutoff of -25% was chosen as optimal using both best overall response (BOR) (A) and first follow-up time point (FU1) (B), although -25% threshold was more clearly dominant for FU1 than for BOR. Axes: response threshold (%) versus frequency of each response threshold being selected as optimal.
Evaluation of Ability to Discriminate Superior Overall Survival
Kaplan-Meier curves were constructed for responders and nonresponders as defined by each response assessment method using the response thresholds discussed earlier. We applied the log-rank test to evaluate the ability of each method to discriminate patients with superior OS. This analysis was performed separately for BOR and FU1.
Fig. 4. Kaplan-Meier curves for three therapy response assessment techniques. A and B, Kaplan-Meier curves for Response Evaluation Criteria in Solid Tumors (RECIST), Choi criteria, and attenuation distribution along long axis (ADLA) measurements using best overall response (A) and first follow-up time point (B). Response threshold of -25% was used for ADLA measurements. Axes: time (days) versus proportion of patients who survived; curves are shown for responders and nonresponders by each method.
Comparison of Response Assessment Methods
We then compared the three response methods against one another for their ability to discriminate patients with superior OS. We did so using two different statistical tools: Brier scores and areas under the ROC curve (AUCs). For the Brier score analysis, we looked at the difference in Brier scores for each response method combination: RECIST versus ADLA, ADLA versus Choi, and RECIST versus Choi. Using a bootstrapping technique over 1000 replications, we computed the 95% CI around the difference in Brier scores for each combination. If the CI did not include zero, we inferred a statistically significant difference in predictive ability between the two methods. We performed these comparison analyses separately using BOR and FU1.
For the AUC analysis, we performed a similar bootstrapping technique to calculate differences in AUC for each model fit. In this procedure, an integrated AUC value was obtained for each bootstrap replication and a difference in AUC was calculated for each response method combination. The 95% CI for AUC was taken to extend from the 2.5 to the 97.5 percentiles of the bootstrapped distribution. Again, if the CI did not include zero, we inferred a statistically significant difference in the predictive ability between the two methods. We performed these comparison analyses separately using BOR and FU1.
Statistical Analysis
All statistical analyses were performed using
R, version 3.4.0 (R Foundation). Differences were
deemed to be statistically significant at p < 0.05.
We controlled the false discovery rate at 0.1 to adjust for multiple comparisons [13].
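For readers unfamiliar with false discovery rate control, the sketch below shows one common procedure, the Benjamini-Hochberg step-up method, applied at a rate of 0.1. That this is the procedure the authors used is an assumption: the text names only the false discovery rate level and a citation. The function name `benjamini_hochberg` and the example p values are hypothetical and are not the study's actual set of comparisons.

```python
def benjamini_hochberg(p_values, q=0.1):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(p_values)
    # Indices of the p values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= k * q / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k_max = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


# Hypothetical p values, for illustration only.
example = [0.02, 0.001, 0.001, 0.43]
print(benjamini_hochberg(example, q=0.1))
```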
Results
Patient Population
One hundred sixty-four patients met the
inclusion criteria (Fig. 2). The median survival for these patients was 461 days (range,
60–916 days).
Optimal ADLA Response Threshold
From our 10-fold cross-validation analysis, the optimal ADLA response threshold for
both BOR and FU1 was a decrease of 25%
from baseline (Fig. 3). The -25% cutoff was
clearly dominant for FU1 but was only marginally dominant for BOR (see Discussion).
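The response definitions used in this study can be sketched as simple percentage-change rules. The thresholds follow the text (a decrease of at least 30% for RECIST, a decrease of more than 10% in size or more than 15% in density for Choi criteria, and a 25% decrease for ADLA), and the composite ADLA value is the long-axis-weighted average described in Materials and Methods. The function names are hypothetical, and treating the ADLA cutoff as "at least 25%" is an assumption about the boundary case.

```python
def percent_change(baseline, follow_up):
    """Percentage change from baseline (negative values are decreases)."""
    return 100.0 * (follow_up - baseline) / baseline


def weighted_adla(adla_values, long_axes_mm):
    """Composite ADLA: lesion ADLA values weighted by long-axis diameter."""
    total = sum(long_axes_mm)
    return sum(a * d for a, d in zip(adla_values, long_axes_mm)) / total


def recist_response(sum_ld_base, sum_ld_fu):
    # Decrease of at least 30% in the sum of long-axis diameters.
    return percent_change(sum_ld_base, sum_ld_fu) <= -30.0


def choi_response(size_base, size_fu, density_base, density_fu):
    # Decrease of more than 10% in size OR more than 15% in density.
    return (percent_change(size_base, size_fu) < -10.0
            or percent_change(density_base, density_fu) < -15.0)


def adla_response(adla_base, adla_fu, threshold=-25.0):
    # Assumed boundary convention: a decrease of exactly 25% counts.
    return percent_change(adla_base, adla_fu) <= threshold


# Two hypothetical target lesions at baseline and follow-up.
base = weighted_adla([10.0, 20.0], [30.0, 10.0])
follow = weighted_adla([8.0, 12.0], [30.0, 10.0])
print(base, follow, adla_response(base, follow))
```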
Ability to Discriminate Superior Overall Survival
Figure 4 presents Kaplan-Meier curves
for all three response assessment techniques.
The -25% threshold was used for defining a
response by ADLA measurements.
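For orientation, a Kaplan-Meier curve of the kind shown in Figure 4 can be computed with the product-limit estimator sketched below. This is a generic textbook estimator written for illustration, not the authors' R code; the function names and the example data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times: follow-up time for each patient (e.g., days).
    events: True if death was observed, False if censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival is at or below 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical data: the patient followed to day 200 is censored.
curve = kaplan_meier([100, 200, 300, 400], [True, False, True, True])
print(curve, median_survival(curve))
```

Computing one such curve for responders and one for nonresponders under each method gives the paired curves that the log-rank test then compares.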
Tables 1 and 2 present details regarding each technique's ability to discriminate patients with superior OS. All three response assessment techniques successfully discriminated patients with superior OS when using BOR (RECIST, p = 0.02; ADLA and Choi, p < 0.001). The ADLA and Choi techniques also successfully discriminated patients with superior OS when using FU1 (both p < 0.001), but RECIST did not (p = 0.43).
TABLE 1: Ability of Response Assessment Methods to Predict Overall Survival Using Best Overall Response
                 Subjects Classified      Subjects Classified
                 as Responders            as Nonresponders
Method           No.   Median OS (d)      No.   Median OS (d)      Hazard Ratio (95% CI)   p(a)
RECIST            74   505.5               90   443                0.55 (0.33–0.92)        0.02
Choi criteria    136   480.5               28   338.5              0.26 (0.14–0.47)        < 0.001
ADLA              89   532                 75   326                0.12 (0.07–0.23)        < 0.001
Note: OS = overall survival, RECIST = Response Evaluation Criteria in Solid Tumors version 1.1, ADLA = attenuation distribution across the long axis.
(a) Log-rank test.
TABLE 2: Ability of Response Assessment Methods to Predict Overall Survival Using Response Status at First Follow-Up
                 Subjects Classified      Subjects Classified
                 as Responders            as Nonresponders
Method           No.   Median OS (d)      No.   Median OS (d)      Hazard Ratio (95% CI)   p(a)
RECIST            46   503.5              118   446.5              0.80 (0.46–1.39)        0.43
Choi criteria    122   499.5               42   341.5              0.25 (0.14–0.42)        < 0.001
ADLA              77   532                 87   351                0.15 (0.08–0.27)        < 0.001
Note: OS = overall survival, RECIST = Response Evaluation Criteria in Solid Tumors version 1.1, ADLA = attenuation distribution across the long axis.
(a) Log-rank test.
Comparison of Response Assessment Methods
Using the Brier score method, when BOR was used to predict OS, ADLA measurements outperformed both RECIST and Choi criteria, as evidenced by 95% CIs for Brier score differences that were less than zero and did not include zero (95% CI for Brier score differences: ADLA-RECIST, -0.58 to -0.08; ADLA-Choi, -0.55 to -0.06). The predictive abilities of Choi criteria and RECIST using BOR were not statistically different, given that the 95% CI for the Brier score difference did contain zero (95% CI for Brier score differences: Choi-RECIST, -0.17 to 0.02). When FU1 was used to predict OS, ADLA measurements outperformed both RECIST and Choi criteria (95% CI for Brier score differences: ADLA-RECIST, -0.77 to -0.08; ADLA-Choi, -0.58 to -0.03), and Choi criteria outperformed RECIST (Choi-RECIST, -0.42 to -0.04).
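To make the direction of these differences concrete, the sketch below computes a simplified Brier score: the mean squared difference between a predicted survival probability and the observed survival status at a single time horizon. This is a deliberate simplification and an assumption on our part; the study computed Brier scores within a Cox model over all observed survival times, and the handling of censored observations is not reproduced here. The function name and example values are hypothetical.

```python
def brier_score(pred_survival_prob, survived):
    """Mean squared error between predicted survival probabilities (0 to 1)
    and observed status (True = alive at the horizon). Ignores censoring."""
    n = len(survived)
    return sum((p - float(s)) ** 2
               for p, s in zip(pred_survival_prob, survived)) / n


# Hypothetical predictions from two methods for the same four patients.
observed = [True, True, False, False]
method_a = [0.9, 0.8, 0.2, 0.1]   # sharper, mostly correct predictions
method_b = [0.6, 0.5, 0.5, 0.4]   # less informative predictions

diff = brier_score(method_a, observed) - brier_score(method_b, observed)
print(diff)  # a negative difference favors method A
```

A difference below zero means the first method has the lower (better) score, which is why CIs lying entirely below zero for ADLA-RECIST and ADLA-Choi indicate better predictive ability for ADLA.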
Using the AUC method, when BOR was used to predict OS, ADLA measurements outperformed both RECIST and Choi criteria, as evidenced by 95% CIs for AUC differences that were greater than zero and did not include zero (95% CI for AUC differences: ADLA-RECIST, 0.16–0.33; ADLA-Choi, 0.17–0.36). The predictive abilities of Choi criteria and RECIST using BOR were not statistically different, because the 95% CI for the AUC difference did contain zero (95% CI for AUC difference: Choi-RECIST, -0.12 to 0.08). When FU1 was used to predict OS, ADLA measurements outperformed both RECIST and Choi criteria (95% CI for AUC differences: ADLA-RECIST, 0.22–0.39; ADLA-Choi, 0.01–0.22), and Choi criteria outperformed RECIST (Choi-RECIST, 0.10–0.28).
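The bootstrap comparison behind these intervals can be sketched as follows: patients are resampled with replacement, a performance metric is computed for each of two methods on the same resample, and the 95% CI for the difference is read from the 2.5 and 97.5 percentiles of the resulting distribution. The code is a generic percentile bootstrap under stated assumptions, not the authors' R implementation; the function names, the stand-in metric (a simplified Brier score that ignores censoring), and the data are hypothetical.

```python
import random


def brier(pred, outcome):
    """Stand-in metric: mean squared error of predictions vs. outcomes."""
    return sum((p - float(y)) ** 2 for p, y in zip(pred, outcome)) / len(outcome)


def percentile(sorted_vals, q):
    """Percentile by linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def bootstrap_diff_ci(metric, pred_a, pred_b, outcomes, n_boot=1000, seed=0):
    """95% percentile CI for metric(A) - metric(B) over paired resamples."""
    rng = random.Random(seed)
    n = len(outcomes)
    diffs = []
    for _ in range(n_boot):
        # Resample patients with replacement; both methods see the same patients.
        idx = [rng.randrange(n) for _ in range(n)]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        y = [outcomes[i] for i in idx]
        diffs.append(metric(a, y) - metric(b, y))
    diffs.sort()
    return percentile(diffs, 0.025), percentile(diffs, 0.975)


# Hypothetical data: method A tracks the outcome more closely than method B.
outcomes = [True, False, True, True, False, False, True, False]
pred_a = [0.9, 0.2, 0.8, 0.7, 0.1, 0.3, 0.9, 0.2]
pred_b = [0.6, 0.5, 0.4, 0.6, 0.5, 0.6, 0.5, 0.4]
low, high = bootstrap_diff_ci(brier, pred_a, pred_b, outcomes)
print(low, high)  # the difference is significant if the interval excludes zero
```

The same resampling loop applies to AUC differences by swapping the metric, with the sign convention reversed because a higher AUC is better.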
Discussion
Using retrospective data from a large
phase 3 clinical trial, this study systematically defined an optimal response threshold
for ADLA measurements and then compared
RECIST, Choi criteria, and ADLA measurements for identifying patients with longer survival. The optimal ADLA response
threshold was found to be -25% for both
BOR and FU1. Although all three response
assessment methods successfully discriminated patients with longer OS using BOR,
only Choi criteria and ADLA measurements
were successful using FU1. In a direct comparison using both Brier scores and AUCs to
evaluate ability to predict OS, ADLA measurements outperformed both RECIST and
Choi criteria using both BOR and FU1.
Imaging biomarkers for antitumor response
are generally evaluated on the strength of their
relationship with clinical survival endpoints,
primarily OS. Although PFS is becoming
more common as an endpoint in cancer clinical trials, it is in fact a hybrid endpoint combining both imaging and clinical response and is
therefore not typically used to evaluate the predictive performance of novel imaging response
biomarkers. Historically, the relationship between an imaging biomarker and OS has been
tested using BOR, which has been until recently the most common surrogate endpoint
in cancer clinical trials and the most common
surrogate endpoint for regulatory approval of
new oncology drugs [14]. However, there is increasing interest in the predictive performance
of imaging biomarkers early during therapy
because the use of early response biomarkers
may facilitate adjusting and optimizing drug
regimens and may help patients avoid debilitating side effects from ineffective therapies. For
this reason, we investigated the predictive performance of ADLA measurements using both
BOR and FU1.
The superior predictive ability of ADLA
measurements for OS using both BOR and FU1
implies that, at least in certain contexts, ADLA
measurements may better discriminate patients
with a survival benefit and may therefore be a
better noninvasive indicator of drug efficacy
than either RECIST or Choi criteria. Importantly, the failure of RECIST to predict OS at FU1
adds to mounting evidence in the literature suggesting that RECIST measurements may lag
behind a true biologic antitumor response, especially for newer targeted antitumor agents [15].
The notable performance of ADLA measurements at FU1 suggests it may be a candidate
for eventual incorporation into early response-adaptive clinical trials or treatment algorithms,
in which patients are triaged to different treatment strategies on the basis of imaging results
at an early follow-up time point [16].
The superior predictive ability of ADLA
measurements in this study is especially notable given its relative simplicity compared with
other competing methods. The data extraction
for an ADLA measurement requires measuring
only a tumor's long-axis diameter, that is, the
same workflow as for a RECIST measurement.
The limited workflow burden with both ADLA
and Choi methods contrasts sharply with that
associated with more complex and computationally intensive techniques such as contrast-enhanced perfusion CT and DCE-MRI.
Although this study relied on publicly available image-processing software for generating
ADLA values from tumor long-axis measurements, this functionality could easily be incorporated into standard PACS software. Indeed,
our team has already begun testing simultaneous RECIST and ADLA measurements in a
software plug-in for an annotation and image markup (AIM)-compatible quantitative imaging informatics platform [17].
This study also adds to emerging evidence in the literature on CT texture analysis as a technique for generating potentially useful biomarkers of tumor heterogeneity
[18]. Tumor heterogeneity as represented by
CT texture has been shown to predict survival in primary colorectal cancer [19], non-small cell lung cancer [20], primary esophageal cancer [21], primary [22] and metastatic
[23, 24] renal cell carcinoma, and locally advanced squamous cell carcinoma of the head
and neck [25]. Lubner et al. [26] showed that
CT texture features correlated with both tumor grade and OS in cases of colorectal carcinoma metastases to the liver. Smith et al.
[27] showed that a multivariate model including CT texture features predicted OS in metastatic melanoma treated with antiangiogenic
therapy. These studies used a variety of heterogeneity parameters including entropy, kurtosis, skewness, mean positive pixels, and SD
of pixel histograms. At least one commercial
software package (TexRAD, Sussex Innovation Centre) is available to facilitate CT texture analyses.
The fundamental premise underlying the
ADLA technique is that tumors responding
to therapy will become more homogeneous
in CT attenuation at follow-up time points.
We should emphasize that treatment response on CT does not follow this pattern in
all tumors; indeed, some treated malignancies may become more heterogeneous with
successful treatment on the basis of developing partial necrosis or dystrophic calcification. It is therefore likely that the ADLA
technique will be valid for only certain tumor types, organs, and therapies. Additional
studies will be required to understand where
ADLA measurements are most informative
and appropriate.
This study was performed in HER2-negative breast cancer metastases to the liver treated with chemotherapy plus bevacizumab. In
this setting, we did observe decreasing tumor
heterogeneity in lesions responding to therapy. The biologic basis for this phenomenon is
not entirely clear. On a microscopic level, the
increasing attenuation homogeneity may have
been because of decreasing tumor perfusion,
decreasing capillary permeability, increasing
cytotoxic edema, cell lysis, expanding extracellular water content, or other factors. Importantly, these mechanisms may be present in varying
combinations and with varying magnitudes depending on the tumor type, organ, and therapy.
Our cross-validation analysis yielded an
optimal ADLA response threshold of -25%.
This response threshold was clearly dominant for FU1 but was only marginally dominant for BOR, implying that other response
thresholds may also perform well for BOR.
For this reason, future study will be required
to reconfirm the -25% response threshold
and to test its stability across tumor types,
organs, and therapies.
This study has several limitations. First,
because our dataset was drawn retrospectively from a multisite clinical trial, there was
variability across the CT images in scanning
parameters, including tube voltage, tube current, pixel size, contrast bolus timing, and reconstruction kernel. This variability may be a
strength rather than a weakness of the study
because it shows that the ADLA technique is
robust to variations in scan acquisition, which
would be needed if the technique were translated into broad clinical use. Nevertheless, future study is warranted to establish the tolerance limits for various CT parameters beyond
which the performance of the ADLA method
may begin to deteriorate, as well as the relationship of ADLA measurements to potential
confounding variables such as body mass index (which may affect image noise).
Second, our study focused exclusively on
liver metastases measuring at least 15 mm
at baseline. Because of these narrow inclusion criteria, only 164 of 1237 subjects in the
original clinical trial qualified for analysis;
although the parent clinical trial was randomized and well stratified, we cannot exclude the possibility of selection bias having
occurred if our smaller sample acquired biologic characteristics that were not representative of the larger population. With regard to
our focus on liver metastases, future studies
may elucidate whether the ADLA method is
useful in other organ systems, with modifications as necessary (including possible use of
different response thresholds). With regard
to the 15-mm lesion size threshold, we ideally would have performed a subanalysis stratifying predictive performance across a range
of lesion sizes including smaller lesions; this
subanalysis was not possible with this study
because of statistical power limitations but
will be an important goal of future evaluations with larger datasets.
A third potential limitation is that by having performed a cross-validation analysis to
determine the optimal response threshold for
ADLA measurements and not having done
so for RECIST or Choi criteria, we may have
performed an unfair comparison between the
different techniques. Our intention in this
study was to test the ADLA method against
the standard formulations of RECIST and
Choi criteria, noting that both techniques
(especially RECIST) have been validated in
various settings and are currently being used
across a wide variety of clinical trials. However, we cannot discount the possibility that
different RECIST or Choi response thresholds may have performed better in this particular setting.
A final limitation of this study is that data
extractions were performed by only one observer. We therefore are unable to present data
on interobserver variability. In light of these
various study limitations, we prefer to characterize our results as promising preliminary
data on which to build future investigations.
In conclusion, this study systematically
defined an optimal response threshold for
ADLA measurements and showed, within
a dataset from a large phase 3 clinical trial,
that ADLA measurements had better predictive ability than either RECIST or Choi technique for identifying patients with longer OS
using both BOR and FU1. Given that ADLA
measurements can be extracted relatively
easily compared with other investigational
response biomarkers, our results justify further investigation of ADLA measurements in
different clinical settings, as well as exploration of incorporating ADLA measurements
into image annotation and markup software.
ADLA measurements may be well suited for
use in adaptive design clinical trials where
response is evaluated and acted on at an early
time point during therapy. With further validation, the ADLA technique may be positioned for broad clinical translation given its
possible usefulness beyond RECIST and its
potential for being incorporated into radiology workflows without significant additional
demands on time and resources.
References
1. Eisenhauer EA, Therasse P, Bogaerts J, et al. New
Response Evaluation Criteria in Solid Tumours:
revised RECIST guideline (version 1.1). Eur J
Cancer 2009; 45:228–247
2. Chun YS, Vauthey JN, Boonsirikamchai P, et al.
Association of computed tomography morphologic criteria with pathologic response and survival in patients treated with bevacizumab for
colorectal liver metastases. JAMA 2009;
302:2338–2344
3. Li X, Abramson RG, Arlinghaus LR, et al. Multiparametric magnetic resonance imaging for predicting pathological response after the first cycle
of neoadjuvant chemotherapy in breast cancer.
Invest Radiol 2015; 50:195–204
4. Weis JA, Miga MI, Arlinghaus LR, et al. A mechanically coupled reaction-diffusion model for
predicting the response of breast tumors to neoadjuvant chemotherapy. Phys Med Biol 2013;
58:5851–5866
5. Choi H, Charnsangavej C, Faria SC, et al. Correlation of computed tomography and positron emission tomography in patients with metastatic gastrointestinal stromal tumor treated at a single
institution with imatinib mesylate: proposal of
new computed tomography response criteria.
J Clin Oncol 2007; 25:1753–1759
6. Smith AD, Shah SN, Rini BI, Lieber ML, Remer
EM. Morphology, attenuation, size, and structure
(MASS) criteria: assessing response and predicting clinical outcome in metastatic renal cell carcinoma on antiangiogenic targeted therapy. AJR
2010; 194:1470–1478
7. Hylton NM, Blume JD, Bernreuter WK, et al.;
ACRIN 6657 Trial Team and I-SPY 1 TRIAL Investigators. Locally advanced breast cancer: MR
imaging for prediction of response to neoadjuvant
chemotherapy - results from ACRIN 6657/I-SPY
TRIAL. Radiology 2012; 263:663–672
8. Lakomkin N, Kang H, Landman B, Hutson MS,
Abramson RG. The attenuation distribution
across the long axis (ADLA): preliminary findings for assessing response to cancer treatment.
Acad Radiol 2016; 23:718–723
9. Robert NJ, Dieras V, Glaspy J, et al. RIBBON-1:
randomized, double-blind, placebo-controlled,
phase III trial of chemotherapy with or without
bevacizumab for first-line treatment of human
epidermal growth factor receptor 2-negative, locally recurrent or metastatic breast cancer. J Clin
Oncol 2011; 29:1252–1260
10. Schneider CA, Rasband WS, Eliceiri KW. NIH
Image to ImageJ: 25 years of image analysis. Nat
Methods 2012; 9:671–675
11. Arkes HR, Dawson NV, Speroff T, et al. The covariance decomposition of the probability score and
its use in evaluating prognostic estimates. SUPPORT Investigators. Med Decis Making 1995;
15:120–131
12. Brier GW. Verification of forecasts expressed in
terms of probability. Monthly Weather Rev 1950;
78:1–3
13. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to
multiple testing. J R Stat Soc Br 1995; 57:289–300
14. Johnson JR, Williams G, Pazdur R. End points
and United States Food and Drug Administration
approval of oncology drugs. J Clin Oncol 2003;
21:1404–1411
15. Ratain MJ, Eckhardt SG. Phase II studies of modern drugs directed against new targets: if you are
fazed, too, then resist RECIST. J Clin Oncol 2004;
22:4442–4445
16. von Minckwitz G, Blohmer JU, Costa SD, et al.
Response-guided neoadjuvant chemotherapy for
breast cancer. J Clin Oncol 2013; 31:3623–3630
17. Rubin DL, Willrett D, O'Connor MJ, Hage C,
Kurtz C, Moreira DA. Automated tracking of
quantitative assessments of tumor burden in clinical trials. Transl Oncol 2014; 7:23–35
18. Ganeshan B, Miles KA. Quantifying tumour heterogeneity with CT. Cancer Imaging 2013;
13:140–149
19. Ng F, Ganeshan B, Kozarski R, Miles KA, Goh V.
Assessment of primary colorectal cancer heterogeneity by using whole-tumor texture analysis:
contrast-enhanced CT texture as a biomarker of
5-year survival. Radiology 2013; 266:177–184
20. Ganeshan B, Panayiotou E, Burnand K, Dizdarevic S, Miles K. Tumour heterogeneity in
non-small cell lung carcinoma assessed by CT texture
analysis: a potential marker of survival. Eur Radiol 2012; 22:796–802
21. Yip C, Landau D, Kozarski R, et al. Primary
esophageal cancer: heterogeneity as potential
prognostic biomarker in patients treated with definitive chemotherapy and radiation therapy. Radiology 2014; 270:141–148
22. Lubner MG, Stabo N, Abel EJ, Del Rio AM, Pickhardt PJ. CT textural analysis of large primary
renal cell carcinomas: pretreatment tumor heterogeneity correlates with histologic findings and
clinical outcomes. AJR 2016; 207:96–105
23. Goh V, Ganeshan B, Nathan P, Juttla JK, Vinayan
A, Miles KA. Assessment of response to tyrosine
kinase inhibitors in metastatic renal cell cancer:
CT texture as a predictive biomarker. Radiology
2011; 261:165–171
24. Haider MA, Vosough A, Khalvati F, Kiss A, Ganeshan B, Bjarnason GA. CT texture analysis: a
potential tool for prediction of survival in patients
with metastatic clear cell carcinoma treated with
sunitinib. Cancer Imaging 2017; 17:4
25. Zhang H, Graham CM, Elci O, et al. Locally advanced squamous cell carcinoma of the head and
neck: CT texture and histogram analysis allow
independent prediction of overall survival in patients treated with induction chemotherapy. Radiology 2013; 269:801–809
26. Lubner MG, Stabo N, Lubner SJ, et al. CT textural
analysis of hepatic metastatic colorectal cancer:
pre-treatment tumor heterogeneity correlates with
pathology and clinical outcomes. Abdom Imaging
2015; 40:2331–2337
27. Smith AD, Gray MR, del Campo SM, et al. Predicting overall survival in patients with metastatic melanoma on antiangiogenic therapy and RECIST stable disease on initial posttherapy images using CT
texture analysis. AJR 2015; 205:[web]W283–W293