Evaluation of RECIST in chemotherapy-treated lung cancer: the Pharmacogenoscan Study

Background Response Evaluation Criteria in Solid Tumors (RECIST) are widely used to assess the effect of chemotherapy in patients with cancer. We hypothesised that the change in unidimensional tumour size handled as a continuous variable was more reliable than RECIST in predicting overall survival (OS). Methods The prospective Pharmacogenoscan study enrolled consecutive patients with non-small-cell lung cancer (NSCLC) at any stage who were seen between 2005 and 2010 at six hospitals in France and given chemotherapy. After exclusion of patients without RECIST or continuous-scale tumour size data and of those with early death, 464 patients were left for the survival analyses. Cox models were built to assess relationships between RECIST 1.1 categories or change in continuous-scale tumour size and OS. The best model was defined as the model minimising the Akaike Information Criterion (AIC). Results Median OS was 14.2 months (IQR, 7.3-28.9 months). According to RECIST 1.1, 146 (31%) patients had a partial or complete response, 245 (53%) stable disease, and 73 (16%) disease progression. RECIST 1.1 predicted OS better than change in continuous-scale tumour size in the early (<6 months) survival analyses (p = 0.03), but the accuracy of the two response evaluation methods was similar in the late (≥6 months) survival analyses (p = 0.15). Conclusion In this large observational study, change in continuous-scale tumour size did not perform better than RECIST 1.1 in predicting survival of patients given chemotherapy to treat NSCLC. Trial registration NCT00222404


Background
Response Evaluation Criteria in Solid Tumors (RECIST) was developed in 2000 [1] to assess changes in solid tumour size in patients given cancer chemotherapy. RECIST criteria are based on the sum of the maximum diameters of target lesions seen on imaging studies. This value is categorised as follows: complete/partial response (CR/PR), complete disappearance of all targets/greater than 30% decrease; stable disease (SD), change between −30% and +20%; and progressive disease (PD), greater than 20% increase. The initial RECIST guidelines (RECIST 1.0) were revised in 2009 (RECIST 1.1) [2] to improve the definitions of the target lesions. RECIST categories have gradually superseded the World Health Organisation (WHO) criteria for chemotherapy effects, which use bidimensional tumour measurements to define CR, PR, SD, and PD [3]. RECIST categories were found to be associated with survival [4,5].
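The categorisation described above can be illustrated with a short sketch. This is a simplified illustration, not the full RECIST 1.1 rule set: the function name and arguments are hypothetical, the boundary handling (a decrease of exactly 30% counted as PR, an increase of exactly 20% counted as PD) is an assumption, and rules concerning non-target lesions, the nadir, or absolute size changes are deliberately ignored.

```python
def recist_category(baseline_sum_mm, followup_sum_mm, new_lesion=False):
    """Classify response from the sum of maximum target-lesion diameters (mm).

    Simplified sketch of the thresholds quoted in the text; the name and
    signature are hypothetical and boundary handling is an assumption.
    """
    if baseline_sum_mm <= 0:
        raise ValueError("baseline sum of diameters must be positive")
    if new_lesion:
        return "PD"  # assumption: any new lesion counts as progression
    if followup_sum_mm == 0:
        return "CR"  # complete disappearance of all targets
    change = (followup_sum_mm - baseline_sum_mm) / baseline_sum_mm
    if change <= -0.30:
        return "PR"  # at least 30% decrease
    if change >= 0.20:
        return "PD"  # at least 20% increase
    return "SD"      # everything in between


if __name__ == "__main__":
    for followup in (0, 60, 85, 115, 130):
        print(followup, recist_category(100, followup))
```

With a baseline sum of 100 mm, follow-up sums of 85 mm and 115 mm both fall in the SD category, although they correspond to opposite directions of change.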
It has been suggested that a patient with 15% tumour shrinkage may have better survival than a patient with 15% tumour growth, although both patients fall in the stable-disease category according to RECIST criteria. Therefore, the change in tumour size from baseline handled as a continuous variable, which differentiates such patients, might help to assess antitumour activity and to predict survival [6]. Data from phase I [7], II [8], and III [9,10] clinical trials indicate that unidimensional continuous-scale tumour size (UCSTS) measurement is feasible and probably useful for assessing treatment effectiveness, particularly when the number of patients is small. However, UCSTS measurement may be difficult to perform. Furthermore, new lesions cannot be quantified numerically and are consequently either classified as PD or ignored.
Pharmacogenoscan is a large translational study aimed at associating the molecular profiles of tumour and blood samples with the response to first-line chemotherapy in patients with non-small-cell lung cancer (NSCLC). The primary objective of the study reported here was to compare the performance of UCSTS changes and RECIST 1.1 categories in predicting overall survival (OS) in the Pharmacogenoscan cohort.

Patients and study design
The Pharmacogenoscan study is a prospective study conducted in six hospitals in the Rhône-Alpes-Auvergne region of France to identify biological and histological factors associated with outcomes of patients with NSCLC. The study was approved by the ethics committee of the Grenoble University Hospital (NCT00222404), and all patients gave written informed consent before study inclusion.
Consecutive patients with chemotherapy-naive NSCLC at any stage [11,12] seen between July 2005 and August 2010 and having an ECOG performance status (PS) [13] of 0 to 2 were included if they received platinum-based doublet chemotherapy as either neo-adjuvant treatment or first-line treatment for metastatic or recurrent disease. All clinical data were recorded prospectively. Missing data were retrieved by one of us (ACT) before database lock.
As shown in the patient flow chart (Figure 1), we included 550 patients. We excluded 67 patients because of an inability to evaluate UCSTS changes or because of early death or disease progression. Landmark analysis [14,15] was performed in these patients with a time point taken at 6 weeks. Finally, RECIST categories were available and studied for 464 patients.

Tumour size evaluation
Baseline imaging was performed a median of 20 days (interquartile range (IQR), 12-31 days) before chemotherapy initiation, and the first follow-up evaluation occurred after two or three chemotherapy cycles (median, 42 days; IQR, 35-47 days) as decided by the investigator. Targets were measured on computed tomography (CT) images and reassessed for the purpose of the study by at least one physician specialised in thoracic oncology (ACT, DMS, PM, MP, PJS, PR, or PC). RECIST 1.1 response categories were determined [2]. For patients included before 2009, RECIST 1.0 categories were converted into RECIST 1.1 categories [16]. A systematic blind review of tumour response according to RECIST criteria was performed in a random sample of the included patients by independent investigators belonging to different centres. The change in UCSTS over time was computed as follows: (UCSTS at first follow-up evaluation − UCSTS at baseline) / UCSTS at baseline. For patients with at least one new lesion, we assigned a 100% increase in UCSTS.
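The UCSTS change defined above can be written out as a small function. This is a minimal sketch: the function and parameter names are hypothetical, the measurements in the usage lines are made up, and the only rules encoded are the relative-change formula and the 100% increase assigned when a new lesion is present.

```python
def ucsts_change(baseline_mm, followup_mm, new_lesion=False):
    """Relative change in unidimensional tumour size between two evaluations.

    Returns a fraction: -1.0 is a 100% decrease, +1.0 a 100% increase.
    Names are hypothetical; the rule for new lesions follows the text.
    """
    if baseline_mm <= 0:
        raise ValueError("baseline tumour size must be positive")
    if new_lesion:
        return 1.0  # a new lesion is assigned a 100% increase
    return (followup_mm - baseline_mm) / baseline_mm


if __name__ == "__main__":
    print(ucsts_change(80, 60))                   # shrinkage from 80 mm to 60 mm
    print(ucsts_change(50, 40, new_lesion=True))  # new lesion overrides the measurement
```

Because a new lesion overrides the measured diameters, a patient whose targets shrank but who developed a new lesion is still recorded with the maximal increase.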

Statistical analysis
Distributions of continuous variables were summarized by median (IQR) and of categorical variables by counts (percentages). Patients who were lost to follow-up by 1 December 2012 were considered censored. Follow-up duration was defined as the time from the first chemotherapy dose to last follow-up, and OS as the time from the first chemotherapy dose to death.
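Censoring of this kind is what a product-limit (Kaplan-Meier) estimate is designed to handle. The sketch below is a generic textbook implementation in plain Python, not the SAS procedure used in the study; the function name and the follow-up data in the usage lines are hypothetical. Patients censored at the same time as a death are assumed to be still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up durations (e.g. months from first chemotherapy dose)
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up of five patients (months); 0 marks censoring.
    for t, s in kaplan_meier([3, 5, 5, 8, 12], [1, 0, 1, 1, 0]):
        print(f"t = {t:>2} months, S(t) = {s:.2f}")
```

In this made-up example the estimate steps down at 3, 5, and 8 months, while the patient censored at 12 months contributes to the risk set without producing a step.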
Kaplan-Meier curves of OS were plotted and compared between groups using the log-rank test. Univariate analyses were used to identify factors associated with OS. Variables associated with p values lower than 0.20 by univariate analysis and those known to affect OS were proposed to multivariate Cox models using a stepwise procedure. Variables associated with p < 0.05 in the multivariate context were kept in the models. The proportional hazard assumption was checked using martingale residuals. For this assumption to be plausible, we separately analysed early (<6 months) and late (≥6 months) deaths, using Cox models. Analyses were stratified for the ECOG-PS and the hospital. Hazard ratios (HRs) with their 95% confidence intervals (95% CIs) and p values were computed. We compared the non-nested RECIST and UCSTS models based on the Akaike Information Criterion (AIC). The AIC is the maximised log-likelihood penalised by the number of variables included in the model (AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximised likelihood). It offers a relative estimate of the information lost when a given model is used to represent the process that generates the data. It defines the best model as the one with the lowest AIC value [17]. To evaluate the significance of the difference between the AICs, we calculated a chi-square statistic (the difference between the −2 log-likelihoods of the two models) with degrees of freedom equal to the difference between the degrees of freedom of the two models, and derived a p value from it. All statistical analyses were performed using SAS 9.3 (SAS Institute, Cary, NC, USA).

Results
Figure 1 is the patient flow chart. Table 1 lists the main patient characteristics. The majority of patients had inoperable cancer (82%) and/or an ECOG-PS of 0 or 1.

Tumour response
According to RECIST, 146 (31%) patients had a CR or PR, 245 (53%) SD, and 73 (16%) PD. The systematic review of the tumour response was performed for the first 64 (14%) patients and showed agreement with the initial evaluation in 60 (94%). The discrepancies were resolved by discussion.
Patients with measurable changes in tumour size had changes ranging from a 100% decrease to a 100% increase, with 347 (75%) showing at least some decrease according to RECIST (Figure 2). In this group, 1-year mortality was about 50% for UCSTS changes between −100% and +20% (CR + PR + SD RECIST categories) and greater than 80% for UCSTS increases greater than 20% (PD RECIST category). In non-metastatic tumours, predicted survival was linearly related to the logit of the percentage of response, without a clear cut-off.

Association between tumour response and overall survival (OS)
By univariate analysis (Table 1), ECOG-PS, histologic cancer type, and cancer spread (TNM classification) were significantly associated with OS. Response according to RECIST was associated with OS (p < 0.0001, Figure 4). Tables 2 and 3 show the results of the Cox models for the first 6 months and the subsequent period, respectively. The analysis was routinely adjusted for histology and cancer spread. Sex and chemotherapy doublet were proposed to the model but not kept at the final step. The analysis was stratified on ECOG-PS, as explained in the Methods, and on hospital. For early survival, accuracy as estimated by the AIC was better for RECIST 1.1 than for UCSTS (p = 0.03), even after adjustment for confounders. However, no difference in accuracy was found between RECIST 1.1 and UCSTS for late survival (p = 0.15).
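The AIC-based comparison behind these p values, as described under Statistical analysis, can be sketched as follows. This is an illustration only: the study used SAS, the function names are hypothetical, and the log-likelihoods and parameter counts in the usage lines are made-up numbers, not study results. The chi-square tail probability is computed with the standard library through the recurrence for the regularised incomplete gamma function, which holds for positive integer degrees of freedom.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values: df = 1 (odd case) or df = 2 (even case).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:
        a, q = 1.0, math.exp(-half)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while 2 * a < df:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


def compare_models(loglik_a, k_a, loglik_b, k_b):
    """Chi-square on the difference in -2 log-likelihood, df = difference in k."""
    df = abs(k_a - k_b)
    if df == 0:
        raise ValueError("models must differ in their number of parameters")
    stat = abs(2 * loglik_a - 2 * loglik_b)
    return {
        "aic_a": aic(loglik_a, k_a),
        "aic_b": aic(loglik_b, k_b),
        "chi2": stat,
        "df": df,
        "p": chi2_sf(stat, df),
    }


if __name__ == "__main__":
    # Hypothetical values: a categorical model with 2 parameters (A)
    # versus a continuous model with 1 parameter (B).
    result = compare_models(-1000.0, 2, -1002.5, 1)
    print(result)
```

In this made-up example model A has the lower AIC, and the chi-square statistic of 5.0 on one degree of freedom gives a p value of about 0.025.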

Discussion
In our study, tumour response to chemotherapy evaluated based on either RECIST or UCSTS was strongly associated with OS. UCSTS did not perform better than RECIST in predicting OS. We studied a large cohort of patients with NSCLC selected only based on ECOG-PS. Our population is representative of the NSCLC patients seen in daily practice. In particular, most patients had adenocarcinoma and 60% had metastatic disease. RECIST 1.1 was superior to UCSTS in predicting early survival, probably due to the weight of poor prognosis among patients with PD. For predicting late survival, the two methods were similarly accurate.
We are aware of a single previous study [19] of UCSTS versus WHO criteria for predicting survival. The patients had colorectal cancer and were given conventional chemotherapy. UCSTS did not perform better than WHO criteria with the three categories CR/PR, SD, and PD. One possible explanation is the variability in CT measurements of tumour size. In a study of 33 patients with NSCLC, the interobserver relative measurement change in unidimensional tumour size measurements varied from 0% to 194% [20]. In patients with a variety of thoracic and abdominal tumours, the results obtained by a single observer compared with multiple observers differed by more than 10% for 83% of lesions [21]. Finally, among patients with NSCLC, measurements on two CT scans obtained within less than 15 minutes often showed differences exceeding 1 or 2 mm [22]. Thus, small changes in tumour size should be interpreted with caution. When using the WHO criteria, tumour measurement errors can be expected to produce objective response rates ranging from 5% to 10% [23].
Tumour response evaluation using RECIST appeared more reproducible. In our study, the review of results by a panel of experts found errors in only 6% of patients. Andoh et al. [24] assessed the quality of radiology reports requiring RECIST and found that the combination of distributed educational materials and audit and feedback interventions improved radiology report quality by reducing the number of studies with errors from 30% to 22%.
Response rates (responders versus non-responders) do not include SD in the response category. Response is a common endpoint in phase III studies but performs poorly in predicting survival [25]. Patients can experience clinical benefits from treatment without a significant change in tumour size. Our results in the subgroup with metastatic disease suggest that a 20% increase in UCSTS (PD) may be a satisfactory cut-off for separating two prognostic groups. A study of patients with NSCLC [26] is consistent with this assumption: the 8-week rate of disease control, defined as no disease progression, performed better in predicting survival than did the 8-week rate of CR/PR. Recently, Mandrekar et al. [27] published the results from 13 trials including patients with metastatic cancer. They confirmed the utility of the RECIST-based response metrics. No alternative cut-offs or alternative categorical metrics appeared better than the RECIST standards.
One limitation of our study is the exclusion of 13% of the patients from the analysis. Most of these patients had early progression or death and were excluded to meet the conditions required for a landmark analysis [14,15], i.e., to avoid bias in favour of responders. In our study we assigned a 100% increase in the UCSTS measurement for new lesions. They were excluded from the survival analysis, but finally, only 3% of patients had PD. We acknowledge that this way of expressing data may be over-simplistic. Furthermore, as the time of response evaluation was decided by each investigator (after 2 or 3 cycles of chemotherapy), the assessments were not performed within a 2-week window. Various chemotherapy doublets were used, and most patients (69%) received at least one subsequent line of chemotherapy, which may have attenuated the association between tumour shrinkage achieved with first-line chemotherapy and OS.

Conclusion
In this large observational study of NSCLC patients given chemotherapy, UCSTS did not perform better than RECIST in predicting survival. In addition, our results suggest that distinguishing between PR and SD may be unhelpful for predicting survival of patients with metastatic disease. RECIST is easier to assess than UCSTS, which may deserve more specific assessment in trials of targeted therapies.

Competing interests
This study was funded by a 2004 grant from the national public funding agency Projet Hospitalier de Recherche Clinique, regional public funding agency Plateforme Régionale de Recherche Clinique, and scientific council of the non-profit organisation AGIR à dom. None of the authors declares any financial or moral conflicts of interest related to this study. The authors declare that they have no competing interests.

Authors' contributions
ACT: 1) has made substantial contributions to acquisition of data, analysis and interpretation of data; 2) has been involved in drafting the manuscript and revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
DMS: 1) has made substantial contributions to conception and design, acquisition of data, analysis and interpretation of data; 2) has been involved in drafting the manuscript and revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
SC: 1) has made substantial contributions to acquisition of data; 2) has been involved in revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
PM: 1) has made substantial contributions to conception and design, and acquisition of data; 2) has been involved in revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
MP: 1) has made substantial contributions to conception and design, and acquisition of data; 2) has been involved in revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
NG: 1) has made substantial contributions to acquisition of data; 2) has been involved in revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
PJS: 1) has made substantial contributions to conception and design, and acquisition of data; 2) has been involved in revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
BM: 1) has made substantial contributions to acquisition of data; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
GRF: 1) has made substantial contributions to analysis of data; 2) has been involved in revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
PR: 1) has made substantial contributions to acquisition of data; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
PC: 1) has made substantial contributions to acquisition of data; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
AV: 1) has made substantial contributions to analysis and interpretation of data; 2) has been involved in drafting the manuscript and revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
EB: 1) has made substantial contributions to conception and design; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
CB: 1) has made substantial contributions to conception and design, and acquisition of data; 2) has been involved in revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
JFT: 1) has made substantial contributions to analysis and interpretation of data; 2) has been involved in drafting the manuscript and revising it critically for important intellectual content; 3) has given final approval of the version to be published; and 4) agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
All authors read and approved the final manuscript.