Comparative survival benefit of currently licensed second or third line treatments for epidermal growth factor receptor (EGFR) and anaplastic lymphoma kinase (ALK) negative advanced or metastatic non-small cell lung cancer: a systematic review and secondary analysis of trials

Background: A review of therapies for advanced cancers licensed by the EMA between 2009 and 2013 concluded that for more than half of these drugs there was little evidence of overall survival or quality of life benefit. Recent years have witnessed a growing number of licensed second-line pharmacotherapies for advanced/metastatic non-small cell lung cancer (NSCLC). With the aim of gauging patient survival benefit, we conducted a systematic review of randomised controlled trials (RCTs) and compared survival outcomes from available licensed treatments for patients with advanced/metastatic NSCLC.
Methods: RCTs of second/third line treatments in participants with advanced/metastatic NSCLC and negative/low expression of Anaplastic Lymphoma Kinase (ALK) and of Epidermal Growth Factor Receptor (EGFR) were included. We searched electronic databases (MEDLINE; EMBASE; Web of Science) from January 2000 up to July 2017. Two or more independent reviewers screened bibliographic records, extracted data, and assessed risk of bias of studies. Published Kaplan-Meier plots for overall survival (OS) and progression-free survival (PFS), along with restricted-mean-survival methods and parametric modelling, were used to estimate the survival outcomes as mean number of months of survival. Network meta-analysis was undertaken to rank interventions and to make indirect comparisons.
Results: We included 11 RCTs with data for 7581 participants that compared nine different drugs. In studies of patients regardless of histology groups, targeted drugs (ramucirumab and nintedanib) yielded small overall survival gains of < 2.5 months over docetaxel, erlotinib provided no benefit, while immunotherapies (atezolizumab and pembrolizumab) delivered 5 to 6 months gain. Studies with patients stratified by histology confirmed the apparent superiority of immunotherapy (nivolumab and atezolizumab) over targeted treatments (ramucirumab, nintedanib, afatinib), providing between about 4 and 8 months OS gain over docetaxel. In network analysis immunotherapies consistently ranked higher than alternatives irrespective of population histology and outcome measure.
Conclusion: Our review indicates that nivolumab, pembrolizumab and atezolizumab provide superior survival benefits compared to other licensed drugs for late stage NSCLC. Patient gains from these immunotherapies are substantial compared to the expected average survival with chemotherapy (docetaxel) of < 1 year for people with squamous histology and about 1.25 years for those with non-squamous histology.
Electronic supplementary material: The online version of this article (10.1186/s12885-019-5507-6) contains supplementary material, which is available to authorized users.



Background
Lung cancer is the second most common cancer in both men and women. [1] It is the leading cause of cancer death in both men and women. Excluding mesothelioma, non-small cell lung cancer (NSCLC) accounts for about 85% of all lung cancers. [2] Many patients have a delayed diagnosis and are unsuitable for surgery, so that most receive some form of first-line pharmacotherapy. In the past, following failure of first-line therapies, most NSCLC patients received docetaxel [3]; in recent years, however, targeted therapies and immunotherapies have been developed, the latter acting as immune checkpoint inhibitors with the aim of boosting anti-tumour immunity rather than directly targeting cancer cells. About 12 agents now have a label indication for second- or further-line NSCLC treatment. A 2017 study [4] of cancer drugs approved by the EMA from 2009 to 2013 concluded that most of these drugs entered the market without evidence of benefit on survival or quality of life, and that after a median of 3.3 years post-market entry, there was little or no conclusive evidence of extended or better life for most cancer indications.
The effectiveness of the new second-line agents for treating NSCLC in absolute terms is unknown because previous trial analyses focused mostly on the relative benefit (versus standard chemotherapy, mainly docetaxel), usually expressed in terms of OS and PFS hazard ratios [5,6].
In this systematic review we estimated the survival benefit (i.e. mean number of months) from licensed therapies for NSCLC. It is hoped that the findings focused on new drugs may contribute to more informed discussion between patients and clinicians and will support the decision-making process.

Methods
We registered a protocol for this review in PROSPERO (CRD42017065928).

Inclusion/exclusion criteria
We included RCTs of adult patients with advanced or metastatic (stage IIIB and/or IV) NSCLC with non-squamous (adenocarcinoma, large cell) or squamous histology who had experienced failure of prior first-line chemotherapy (i.e., those receiving second-line treatment and beyond) and whose tumours had either predominantly negative or 100% negative expression of anaplastic lymphoma kinase (ALK) and of epidermal growth factor receptor (EGFR). Studies enrolling only patients with ALK+ and/or EGFR+ expression were excluded since, according to current practice, they would be offered targeted therapies (erlotinib or gefitinib for EGFR+; osimertinib for EGFR T790M; crizotinib or ceritinib for ALK+). [1] RCTs were included if interventions or comparators had an EMA (European Medicines Agency) label indication as of June 2017 for the population described above. The drugs meeting these criteria were Docetaxel (DOC), Pemetrexed (PEM), Ramucirumab plus docetaxel (RAM + DOC), Erlotinib (ERLO), Nintedanib plus docetaxel (NIN + DOC), Afatinib (AFA), Nivolumab (NIVO), and Pembrolizumab (PEMBRO). We also included Atezolizumab (ATEZO), which obtained an EMA licence following the Committee for Medicinal Products for Human Use (CHMP) positive opinion of 20 July 2017. Only studies in which drugs were used with a dose regimen as described in the summary of product characteristics were included. Crizotinib, Ceritinib, Gefitinib and Osimertinib, which are used in people with ALK+ and/or EGFR+ expression, were excluded.
Studies were included if either an overall survival or progression-free survival or both parameters were reported in published Kaplan-Meier plots.

Search strategy
Electronic databases (MEDLINE; EMBASE; Web of Science) were searched for relevant literature from January, 2000 up to present (see MEDLINE search strategy in Additional file 1).
The electronic searches were limited to English-language publications. The lower time limit for the search period was chosen in accordance with the emergence of docetaxel as the standard second-line treatment. Reference lists of relevant articles were hand-searched to identify additional potentially relevant citations. The search was first updated up to early July 2017, retrieving 274 additional records, but no further studies were included. A final update of the search was undertaken up to February 2019 to identify additional original articles relevant to the included studies. The latter retrieved 651 records, of which six were selected for further scrutiny.

Selection of studies
Three reviewers independently screened all titles/abstracts and then full texts of publications potentially relevant for inclusion. Disagreements were resolved by consensus. The study flow and reasons for exclusion at the full-text screening level are presented in the PRISMA study flow diagram [7] (Additional file 2).

Data extraction
The data extracted included study author, trial acronym, patient characteristics (age, sex, diagnosis, tumour stage/histology), and type, mode, dose and duration of treatments. Extracted data were cross-checked by a second reviewer.
Published Kaplan-Meier (KM) survival plots were used to make estimates of mean survival benefit. Two reviewers digitised the KM plots, extracted patient numbers at risk, numbers of events, and published hazard ratios.

Assessment of risk of bias
Two independent reviewers assessed the risk of bias (RoB) in the included studies using the Cochrane RoB tool for RCTs [8]; this categorises studies according to the following domains of potential bias: selection bias (random sequence generation, allocation concealment), performance bias (blinding of participants and personnel), detection bias (blinding of outcome assessment), attrition bias (incomplete outcome data), reporting bias (selective outcome reporting), and "other" bias (e.g. between-group baseline distribution of important prognostic factors). Summary ratings of high RoB were assigned if at least one of the domains of selection, attrition, and other bias was rated as high RoB. If information was insufficient to judge, then an unclear RoB rating was assigned. The two reviewers' assessments were cross-checked, and any disagreements were resolved through discussion with a third reviewer.

Data analysis and synthesis
We used the algorithm of Guyot et al. [9] to estimate underlying individual patient data, which was then used to reconstruct KM plots and to derive estimates of mean survival. The reliability of KM reconstructions was tested by inspection of reconstructions overlaid onto published plots, comparison of reconstructed and published risk table of patients at risk, and correspondence of reconstructed HRs with published HRs (Additional file 3).
Mean survival was estimated in several ways. Restricted mean survival (RMS) [10], and the mean difference in RMS between compared drugs in each trial, were estimated up to the longest time common across the compared studies of interest using the Stata module of Cronin et al. 2016 [11].
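As an illustration of the restricted mean survival idea, the following minimal Python sketch computes the area under a Kaplan-Meier step function up to a chosen truncation time. It is not the Stata module used in the review; the function name `restricted_mean_survival`, the variable names, and the example curves are hypothetical and chosen only to show the arithmetic.

```python
from typing import Sequence


def restricted_mean_survival(times: Sequence[float],
                             surv: Sequence[float],
                             tau: float) -> float:
    """Area under a KM step function from 0 to tau (in the units of `times`).

    times: increasing times at which the KM curve drops.
    surv:  survival probability immediately after each drop.
    The curve is assumed to equal 1.0 before the first drop.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(times, surv):
        if t >= tau:
            break
        # Rectangle under the flat segment that ends at this drop.
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Final flat segment, truncated at tau.
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical curves for two trial arms (illustrative values only).
treatment = restricted_mean_survival([2.0, 4.0, 6.0], [0.9, 0.7, 0.5], tau=5.0)
control = restricted_mean_survival([2.0, 4.0, 6.0], [0.8, 0.5, 0.2], tau=5.0)
rms_gain = treatment - control  # mean difference in RMS up to tau
```

Truncating both arms at the same `tau` mirrors the use of the longest time common to the compared studies, so that the difference is not driven by unequal follow-up.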
In order to account for any potential gains beyond the longest observation time common across trials, we undertook analysis of total mean survival using parametric survival modelling. Total mean survival was estimated in three ways: [a] with Weibull models (fitted separately by study arm) using the stgenreg package of Crowther and Lambert 2013 [12]; mean survival time and 95% confidence intervals (CIs) were estimated from the area under the curve (AUC) of the model and of its upper and lower 95% CIs, using 0.01-month increments over 96 months. The CIs around the central AUC estimate were somewhat asymmetric (as would be expected from the delta method for estimating CIs around parametric models). The SE for the AUC value was therefore estimated as the difference between the AUC values for the upper and lower 95% confidence limits divided by 2 × 1.96. In two instances Weibull models were inferior to generalised gamma models, in which case the latter were used; [b] using the equations for mean survival published by Davies et al. 2012 [13] for Weibull parametric survival models; [c] using the "stci, emean" command in Stata; this command uses an exponential extension from the tail of the KM plot to the time axis, and mean survival is then estimated from the area under the KM plot plus that under the extension. Similar methods were applied for progression free survival (PFS) (Additional file 4).
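The Weibull-based calculations can be sketched in Python as follows. This is a minimal illustration, not the stgenreg workflow used in the review. It assumes the parameterisation S(t) = exp(-(t/scale)^shape), which may differ from the one used by Stata or by Davies et al. [13]; the function names are hypothetical.

```python
import math


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Assumed parameterisation: S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_mean_closed_form(shape: float, scale: float) -> float:
    """Mean of a Weibull distribution under the parameterisation above."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def weibull_mean_auc(shape: float, scale: float,
                     horizon: float = 96.0, step: float = 0.01) -> float:
    """Trapezoidal area under S(t) from 0 to `horizon`, in `step` increments."""
    n_steps = round(horizon / step)
    total = 0.0
    for i in range(n_steps):
        left = weibull_survival(i * step, shape, scale)
        right = weibull_survival((i + 1) * step, shape, scale)
        total += 0.5 * (left + right) * step
    return total


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate SE from a 95% CI: (upper - lower) / (2 * 1.96)."""
    return (upper - lower) / (2.0 * z)
```

For example, applying `se_from_ci` to the interval 12.11 to 15.32 months gives an SE of about 0.82 months. With a 96-month horizon the numerical AUC and the closed-form mean agree closely unless the fitted curve still has appreciable survival at 96 months.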
We did exploratory analyses to investigate the relationships between PFS and RMS and modelled total survival, and between published hazard ratios and median survival values and RMS and modelled total survival.
The outcome estimates are presented in KM plots, model plots, forest plots, and tables. Where possible, the analyses were stratified by histologic subtypes (squamous and non-squamous).
For completeness, we undertook a network meta-analysis to estimate the mean differences in RMS and in OS. The corresponding methods are described in Additional file 7.
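To illustrate what an indirect comparison means in its simplest form, the sketch below applies an adjusted indirect comparison through a common comparator to two mean differences. This is an assumption made for illustration only: the network meta-analysis methods actually used are those in Additional file 7, and the function name and input values here are hypothetical.

```python
import math
from typing import Tuple


def indirect_mean_difference(d_ab: float, se_ab: float,
                             d_cb: float, se_cb: float,
                             z: float = 1.96) -> Tuple[float, float, float]:
    """Indirect estimate of A versus C via a common comparator B.

    d_ab, se_ab: mean difference (A minus B) and its standard error.
    d_cb, se_cb: mean difference (C minus B) and its standard error.
    Returns (estimate, lower 95% limit, upper 95% limit).
    """
    d_ac = d_ab - d_cb
    # Variances add because the two trials are independent.
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)
    return d_ac, d_ac - z * se_ac, d_ac + z * se_ac


# Hypothetical inputs: drug A gains 5.0 months over B (SE 1.5),
# drug C gains 1.0 month over B (SE 2.0).
estimate, lower, upper = indirect_mean_difference(5.0, 1.5, 1.0, 2.0)
```

The indirect estimate is less precise than either direct comparison, which is one reason rankings from a network should be read alongside the trial-level results.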

Study characteristics and quality
The 11 RCTs compared nine different drugs, with the majority of comparisons being against DOC. Two comparisons, ATEZO vs DOC [17,24] and NIVO vs DOC [14-16], were tested in more than one study. The NIVO studies employed histology-specific inclusion criteria. Table 1 summarises the main characteristics reported for the 11 studies. Study sample size ranged from 208 to 1314 patients; studies included predominantly people with stage IV NSCLC and performance status 1. The mean age at inclusion ranged from 57 to 66 years and the majority of patients were male. There was no evidence of substantial imbalance in potential effect modifiers.
Nine studies [15-18, 20-22, 24, 26] were considered as high-risk of bias due to the lack of blinding of participants and personnel. The five RCTs [15-17, 21, 24] evaluating checkpoint inhibitors versus DOC were open-label and were considered as high-risk due to performance bias. LUME-LUNG-1 [23] was rated at low risk of bias for all the key domains. Only HORG and TAILOR [18,22] had public funding, so the remaining studies were rated as high-risk due to "other source bias".

Overall survival analyses in mixed histology populations
These analyses were based on mixed populations of patients whose tumour histology was either squamous or non-squamous.

Overall survival from observed data
Reconstructed KM plots from studies reporting OS in populations unselected according to tumour histology are shown in Fig. 1 (for completeness of analysis corresponding plots for PFS are presented in Additional file 4). Only the plots for ATEZO (OAK [24] and POPLAR [17] trials) and for PEMBRO (KEYNOTE-010 [21]) imply appreciable survival gains over DOC. ERLO was not beneficial compared to DOC (TAILOR [18]) or PEM (HORG trial, [22]).

Overall survival from extrapolated data (survival modelling)
Exponential extrapolation from the tail of the KM plots (Stata command: stci, emean) suggests losses for ERLO relative to DOC, gains over DOC of less than 1 month for RAM + DOC and NIN + DOC (the latter licensed only for adenocarcinoma), and potentially impressive gains over DOC of 7.9 and 8.5 months for PEMBRO and ATEZO respectively (Table 2). However, the alternative procedure of modelling OS using Weibull fits to the whole of the KM plot suggests more modest gains for immunotherapies relative to DOC (Fig. 2 and Table 2). Across industry-sponsored studies of immuno- and targeted therapies, Weibull models of overall survival (Fig. 3) with DOC yielded between 11.10 months (95% CI: 9.98-12.88) (KEYNOTE-010 [21]) and 13.59 months (95% CI: 12.11-15.32) (OAK), and suggest mean survival gains over DOC of 5.74 months (95% CI: -0.14 to 11.61) and 5.34 months (95% CI: 2.25-8.43) for ATEZO (POPLAR [17] and OAK [24] respectively), 5.04 months (95% CI: 1.57-8.52) for PEMBRO (KEYNOTE-010), but of less than 2 months for the targeted therapies RAM + DOC (REVEL [19]) and NIN + DOC (LUME-LUNG-1 [23]), and no gain for ERLO. Weibull modelling of the publicly funded HORG trial indicated a possible modest gain from ERLO over PEM (1.16 months; 95% CI: -3.5 to 5.82), and modelling of the Hanna study indicated likely equivalence of the chemotherapies DOC and PEM.
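The sensitivity of tail extrapolation to the end of the KM plot can be seen in a small Python sketch. It assumes that the exponential extension uses a constant hazard chosen so that an exponential curve starting at 1.0 passes through the last KM point; whether this matches the internal choice made by Stata's stci should be checked against the Stata documentation. The function name and arguments are hypothetical.

```python
import math


def exponential_tail_mean(rms_to_last: float,
                          t_last: float,
                          s_last: float) -> float:
    """Mean survival = observed area under the KM plot + exponential tail area.

    rms_to_last: area under the KM plot from 0 to t_last.
    t_last:      last observed time on the KM plot.
    s_last:      KM survival probability at t_last (strictly between 0 and 1).
    """
    if t_last <= 0.0 or not (0.0 < s_last < 1.0):
        raise ValueError("need t_last > 0 and 0 < s_last < 1")
    # Assumed hazard: exponential curve through (0, 1) and (t_last, s_last).
    hazard = -math.log(s_last) / t_last
    # Area under s_last * exp(-hazard * (t - t_last)) from t_last to infinity.
    tail_area = s_last / hazard
    return rms_to_last + tail_area
```

Under this assumption the tail area depends entirely on the final point of the curve, where few patients usually remain at risk, so a late plateau can inflate the estimated mean. This may be one reason why tail-based estimates differ from models fitted to the whole KM plot.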

Overall survival analyses per histology (squamous or non-squamous)
These analyses were based on the studies where KM plots for trial participants stratified according to histology were presented.

Mean survival from observed data
Figure 4 summarises the reconstructed KM plots for licensed drugs for squamous histology and non-squamous histology. These suggest likely modest gains from RAM + DOC irrespective of histology and for NIN + DOC in the treatment of adenocarcinoma (the licensed indication), little or no gain from PEM over DOC irrespective of histology, but more substantial likely gains over DOC from the checkpoint inhibitors (NIVO and ATEZO) for both histology types. No KM plots per histology were available for PEMBRO. Over the observed periods of 24 and 27 months common to all squamous and non-squamous studies respectively, NIVO and ATEZO delivered between about 2 and 4 months RMS gain over DOC, while RAM + DOC and NIN + DOC delivered only between 1 and 2 months, results supporting the apparent superiority of the checkpoint inhibitors (Tables 3 and 4, Additional file 6).

Overall survival from extrapolated data (survival modelling)
Weibull models provided satisfactory fits for non-squamous histology but the shapes of the KM plots for squamous histology for the checkpoint agents were irregular and gamma models provided a better fit. Parametric models are summarised in Fig. 5. For the industry-sponsored studies of targeted and immunotherapies, Weibull model estimates of mean survival with DOC treatment in patients with squamous histology ranged between 9.41 (95% CI: 7.78-11.41) months (CHECKMATE-017) and 11.73 (95% CI: 10.131-13.38) months (LUME-LUNG-1), and in patients with non-squamous histology between 13.32 (95% CI: 11.73-15.8) months (CHECKMATE-057) and 15.02 (95% CI: 13.05-17.43) months (OAK) (Tables 3 and 4, and Fig. 3).
The gain in overall mean survival over DOC from targeted and immunotherapies
In network analysis immunotherapies consistently ranked higher than alternatives irrespective of population histology and outcome measure (Additional file 7).

Exploratory analyses on PFS and OS relationships
PFS is often specified as a primary or co-primary outcome in trials of cancer drugs. We conducted analyses to explore whether PFS in NSCLC might be an indicator for overall survival in second-line therapies. Weibull model estimates of gains in PFS over DOC for targeted therapies and immunotherapies were modest, ranging from +1.18 months (RAM + DOC) to -1.33 months (ERLO) in studies recruiting patients unrestricted by histology (Additional file 4). Available data for squamous and non-squamous histologies indicated similarly small gains except in the case of CHECKMATE-017 (squamous histology), in which the estimated gain was more substantial (3.11 months). Across the included studies there was a poor relationship between modelled estimates of PFS and of OS, and between modelled PFS gains and reported PFS hazard ratios, whereas strong associations were seen between modelled OS and reported median OS, and between modelled OS gains and reported OS hazard ratios (Additional file 8). These findings suggest that PFS is unlikely to be a good indicator for subsequent OS in this case.
Table abbreviations: OS, overall survival; RMS, restricted mean survival; R_mSext, restricted mean survival exponentially extended from the end of the KM plot; Mean total OS Weibull formula, mean OS estimated from Weibull model parameters using the formula published by Davies et al. [13].

Discussion
In this study we estimated the mean number of months of survival benefit from therapies licensed for the treatment of advanced NSCLC. An estimation of survival in the absence of treatment can be obtained from two early RCTs in NSCLC patients, previously treated with platinum chemotherapy, and who were randomised to receive placebo or best supportive care (BSC). The reported median survivals were 4.7 [27] and 4.6 months (95% CI: 3.7-6.0) [28] respectively. By applying the methods described above using Weibull models, we estimate BSC and placebo mean survival to be 7.34 months (95% CI: 5.92-9.14) and 7.77 months (95% CI: 6.71-9.03). If patients received DOC they might expect an extension in average life expectancy to about a year depending on histology, with slightly better prospects for those with non-squamous histology. Our results suggest that mean survival gains over DOC from RAM+DOC, NIN + DOC, and ERLO, are meagre but may be marginally superior for patients with non-squamous than for those with squamous tumour histology. The analysed results indicate that average survival gains over DOC from checkpoint inhibitors are greater than those from the targeted therapies, with estimates for the former reaching between 4 and 9 months depending on tumour histology and the method of modelling beyond the observed data.
The European Society for Medical Oncology (ESMO) has recognised that a comparison of treatments based solely on hazard ratios for OS provides only indirect information about treatment benefit; they have proposed [13] an estimator, the "Magnitude of Clinical Benefit Scale (ESMO-MCBS)", which they believe represents "a standardised, generic, validated approach to stratify the magnitude of clinical benefit that can be anticipated from anti-cancer therapies" [29]. Davis et al. 2017 [4] used this tool to examine pharmaceutical interventions for advanced cancers approved by the EMA from 2009 to 2013. The authors expressed concern that for many of these interventions, the available evidence failed to demonstrate survival benefit or improved patient quality of life.
Patients may have difficulty in interpreting measures of relative risk (e.g. HRs) and in understanding the basis of the ESMO-MCBS tool measure. Patients often prefer information about the likely lifetime gain (e.g. life years gained) from a new treatment being offered. It has been suggested that advanced cancer patients with short life expectancy are willing to accept considerable toxicity of treatments that offer a chance of durable survival [30]; however, evidence on this is conflicting [31]. It has been claimed that a proportion of patients who receive immunotherapies may experience a durable survival response (a so-called "tail of the curve response") so that mean survival estimates for the "whole population" may mask this possibility. However, the evidence base for such outcomes is far from clear cut.
Fig. 3. Time axis is months, vertical axis is proportion alive. All are Weibull models except where specified.
The British Thoracic Society has provided guidance for health care professionals about sharing information with patients with lung cancer. [32] Such information could include estimates of average survival benefit that might accrue with various treatment options. Furthermore decision makers such as NICE generally require estimates of the mean gain in survival from new treatments when taking reimbursement decisions. It is therefore of interest to gain an idea of the mean survival benefit yielded from new treatments for advanced NSCLC and to see how such benefit might vary according to tumour histology.
Equally importantly, mean survival gains offer an unambiguous, informative measure of outcome, which is far less exposed to the limitations and controversies surrounding the use of quality-adjusted life years (QALYs) in the evaluation of treatments for neoplasms. While QALYs facilitate decision making across disease areas, limitations in the way QALYs are constructed have led to criticism on various grounds, including insensitivity to changes in health states [33], especially those that are caused by adverse effects of cancer treatments [34]. Limitations in QALYs have led researchers to conclude that 'the measure shows important limitations in terms of its ability to accurately capture the value of the health gains deemed important by cancer patients' [33]. Reimbursement decisions become challenging when comparing cancer, with its generally short-term survival expectation, with chronic disabling diseases with relatively extended absolute survival. Given this, we expect that estimates of survival are key information which, at the very least, should be reported and considered alongside QALYs.
Our review has several strengths. To the best of our knowledge, this is the first attempt at comparing mean survival of all drugs with a licensed indication for second/third line treatment of advanced/metastatic wild-type NSCLC. It is justified, because the growing number of licensed therapies offers a new range of treatment options for which survival information is of interest to both oncologists and patients. Multi-arm RCTs could provide the best evidence, but these have not been undertaken and our work provides a pragmatic approach.
Our review has several limitations. Although we used rigorous methods to identify all relevant literature we could only include 11 primary research studies, so the inherent risk of publication bias may be of particular importance. Our survival curves and estimates have relied on reconstructing the underlying individual patient data rather than using the individual patient data itself. However, for all the included studies, there was a close correspondence between our derived curves and those published. A further potential limitation is the risk of uneven performance of the common comparator, DOC, between different studies; however, these differences were small relative to differences between targeted therapies and the checkpoint inhibitors. We noted some differences in baseline characteristics across studies regarding the number of prior lines of treatment and disease stage at inclusion. For these variables survival outcomes were not reported in sufficient detail to allow sensitivity analyses to test the robustness of our results.

[...] author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. X.A., A.C., M.C., L.A. and P.R. have been commissioned by the NIHR HTA Programme to undertake reviews and evidence synthesis on the clinical and cost-effectiveness of health care interventions for a range of research funders and policy makers, including the National Institute for Health and Care Excellence (NICE). The views expressed in this paper are those of the authors and not necessarily those of the NIHR HTA Programme. Any errors are the responsibility of the authors. We thank Professor Norman Waugh for stimulating our interest in this topic.

Funding
None.

Availability of data and materials
This paper reports secondary analyses based on previously published original data referenced in the paper's bibliography. The estimates of individual patient data derived from these published sources are available from the authors on reasonable request, as are the parametric models based on these estimations.

Authors' contributions
XA, MC and GJM-T conceived and designed the study. XA, MC, and AT reviewed studies and extracted and analysed data. PR devised and undertook the searches and wrote sections of the manuscript. LA revised the manuscript and contributed to substantial improvements in the discussion section. AC project-managed the investigation. All authors contributed to writing the submitted version of the manuscript.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.