Cost-effectiveness of response evaluation after chemoradiation in patients with advanced oropharyngeal cancer using 18F–FDG-PET-CT and/or diffusion-weighted MRI

Background Considerable variation exists in diagnostic tests used for local response evaluation after chemoradiation in patients with advanced oropharyngeal cancer. The yield of invasive examination under general anesthesia (EUA) with biopsies in all patients is low and it may induce substantial morbidity. We explored four response evaluation strategies to detect local residual disease in terms of diagnostic accuracy and cost-effectiveness. Methods We built a decision-analytic model using trial data of forty-six patients and scientific literature. We estimated for four strategies the proportion of correct diagnoses, costs concerning diagnostic instruments and the proportion of unnecessary EUA indications. Besides a reference strategy, i.e. EUA for all patients, we considered three imaging strategies consisting of 18FDG-PET-CT, diffusion-weighted MRI (DW-MRI), or both 18FDG-PET-CT and DW-MRI followed by EUA after a positive test. The impact of uncertainty was assessed in sensitivity analyses. Results The EUA strategy led to 96% correct diagnoses. Expected costs were €468 per patient whereas 89% of EUA indications were unnecessary. The DW-MRI strategy was the least costly strategy, but also led to the lowest proportion of correct diagnoses, i.e. 93%. The PET-CT strategy and combined imaging strategy were dominated by the EUA strategy due to respectively a smaller or equal proportion of correct diagnoses, at higher costs. However, the combination of PET-CT and DW-MRI had the highest sensitivity. All imaging strategies considerably reduced (unnecessary) EUA indications and its associated burden compared to the EUA strategy. Conclusions Because the combined PET-CT and DW-MRI strategy costs only an additional €927 per patient, it is preferred over immediate EUA since it reaches the same diagnostic accuracy in detecting local residual disease while leading to substantially less unnecessary EUA indications. However, if healthcare resources are limited, DW-MRI is the strategy of choice because of lower costs while still providing a large reduction in unnecessary EUA indications.


Background
Patients with resectable advanced staged oropharyngeal squamous cell carcinoma (OPSCC) are often treated with chemoradiation (CRT) in order to preserve organ function and quality of life. Low residual and recurrent tumour rates indicate that CRT is an adequate treatment option [1]. Still, thorough follow-up is warranted to detect residual tumour, which can be successfully treated with salvage surgery if detected early. Previous research has shown that early detection of residual tumour is associated with more favourable survival probabilities and better local control [1,2]. Thus, timely detection of residual tumour is essential.
A Dutch survey showed that there is considerable variation in response evaluation after CRT, especially in the diagnostic tests performed [3]. Tests that are used are examination under general anesthesia with taking of biopsies (EUA), computed tomography (CT), magnetic resonance imaging (MRI) and 18 F-fluorodeoxy-glucose positron emission tomography combined with CT ( 18 F-FDG-PET-CT). EUA is considered to be the most reliable procedure to detect residual disease but is an invasive procedure during which biopsies are taken in areas treated with CRT. Besides the risk of side-effects such as pain, inflammation and wound healing problems, there is also the possibility of sampling error leading to falsenegative results [1]. Furthermore, EUA has considerable impact on scarce resources because hospital stay and operating facilities are required.
In contrast to EUA, imaging tests are not invasive. However, conventional CT and MRI may not be suitable because postradiation effects, e.g. fibrosis and necrosis [4], may hamper accurate interpretation of the images. More advanced tests such as 18 F-FDG-PET-CT and diffusionweighted MRI (DW-MRI) may be more suitable options. These tests do not only assess if any anatomical residual mass is present, but also determine the metabolic activity and cell density, respectively, of the tumour.
PET-CT and DW-MRI cannot completely replace EUA because pathological confirmation is required before further treatment. Nevertheless, they could be used to select those patients with a high risk of residual disease for further diagnostic workup with EUA. This would reduce the number of patients that have to undergo futile invasive diagnostic procedures. Furthermore, the imaging results provide the opportunity to guide biopsy procedures, thereby possibly reducing sampling error. Besides these advantages for patients, imaging to select patients for EUA might also lead to cost reductions. Therefore, this study aimed to assess the effects and costs of four response evaluation strategies to detect local residual disease, namely EUA for all patients, PET-CT-based selection for EUA, DW-MRI-based selection for EUA and a combination of PET-CT and DW-MRI to select for EUA. All analyses were conducted using a decision-analytic model based on trial data of forty-six patients and scientific literature.

Strategies
We developed a decision-analytic model to evaluate the effects and costs of four strategies for local response evaluation in patients with OPSCC who underwent CRT. In the first strategy under consideration, i.e. the reference strategy, all patients undergo EUA 3 months after completion of CRT for response evaluation. EUA is considered positive if histopathological proof of viable cancer has been found in the biopsy. The indication for EUA is considered unnecessary if 6 months of follow-up after response evaluation did not show residual disease.
In the three other strategies, response evaluation is performed by either PET-CT (strategy 2), DW-MRI (strategy 3) or a combination of PET-CT and DW-MRI (strategy 4). Only patients with a positive imaging test are referred to EUA. In strategy 4, patients are referred to EUA based on a combined assessment by a radiologist and nuclear medicine physician. The four strategies are depicted by decision trees in Fig. 1. Each end node reflects the diagnostic status of a patient who followed that specific branch as well as the associated costs.

Data
Costs of the different diagnostic instruments were based on official tariff lists [5]. As shown in Table 1, combined PET-CT and DW-MRI was with 1313 Euros the most expensive test. Test characteristics were obtained from a prospective trial (2012-2014) in which patients with advanced but resectable OPSCC who underwent CRT were subjected to PET-CT, DW-MRI and EUA for response evaluation 3 months after end of CRT (Dutch Trial Register: NTR4111/NL39181.000.11). Patient and tumour characteristics are summarized in Table 2. PET-CT and DW-MRI images were interpreted by respectively two nuclear medicine physicians (OH, EC) and two radiologists (JC, PG). For PET-CT, we requested the observers individually to score the PET-CT using the Hopkins criteria (a 5-point scale in which scores 1, 2 and 3 are considered negative for tumour and scores 4 and 5 are considered positive) [6]. DW-MRI observers scored signal intensity on b1000 images and the corresponding ADC-map. A residual mass was considered 'suspicious for residual disease' if increased signal intensity on the b1000 images was present, in an area of contrast-enhancement on postcontrast T1WI and corresponding to low ADC-values. For DW-MRI, we requested the observers individually to classify the MRI-examinations using a dichotomous system, i.e. suspicious for a local residue or not, based on signal intensity of the DW-images. In case readings were discrepant, consensus was reached between the observers of either PET-CT or DW-MRI. For strategy 4, combined reading of PET-CT and DW-MRI with visual correlation by a radiologist and nuclear medicine physician was performed in all patients with discrepancies between the scoring of the two modalities. In total, 46 patients were included of whom five (11%) had residual tumour. Test characteristics of each diagnostic test are shown in Table 2. Sensitivities were 60% or higher, with the combined PET-CT and DW-MRI strategy having the highest sensitivity. Specificity was lowest for PET-CT with 83%.

Analyses
Using the decision trees, we calculated the expected health effects of each response evaluation strategy, i.e. the expected proportion of correctly diagnosed patients (true positives and true negatives). Based on the costs of each diagnostic instrument, the average costs accumulated by a patient when following a specific branch in the decision tree were estimated. Subsequently, for each branch costs were multiplied by the probability that a patient will follow this branch. Summing up the expected costs of each branch of the decision tree results in the overall expected costs for that strategy. In addition, the number of EUA indications as well as the number of unnecessary EUA indications per response evaluation strategy were calculated. It was not possible to correctly estimate quality-adjusted life-years in this study. As an alternative we estimated the costs per truepositive case for the different strategies.

Sensitivity analyses
Due to the small sample size of this trial, we repeated the base-case analysis using test characteristics based on the literature. Sensitivity and specificity for PET-CT and DW-MRI were derived from respectively a systematic review by Gupta et al. (2011) [7] and the study of Vaid et al. (2017) [8]. Test characteristics for EUA and combined PET-CT and DW-MRI are not reported in the literature.
We further explored the impact of uncertainty regarding the test characteristics of the diagnostic instruments in univariate sensitivity analyses in which we increased and decreased the sensitivity and specificity of each instrument with 10% (absolute change). In the strategies in which a positive imaging test was followed by EUA, only the test characteristics of the imaging test were varied.
In addition, in clinical practice a positive EUA is usually followed by MRI and PET-CT to assess if surgical intervention is feasible and if any distant metastases are present, respectively. Therefore, in a sensitivity analysis we included the costs of these additional tests in the EUA strategy. Similarly, we added costs of an MRI or whole body PET-CT (in case of a positive EUA) to those of the "head and neck" PET-CT and DW-MRI, respectively. Finally, we assessed the impact of uncertainty surrounding the prevalence of residual tumour by increasing and decreasing the prevalence to 15% and 5%, respectively [9].

Base-case analysis
The proportion of correctly diagnosed patients and the expected costs per patient for each strategy are shown in Table 3. The combined PET-CT and DW-MRI strategy and the EUA strategy led to the highest proportion of correctly diagnosed patients, i.e. 96%. However, the expected costs of the EUA strategy were 927 Euros per patient lower than those of the combined PET-CT and DW-MRI strategy. The strategy with DW-MRI led to the lowest expected costs per patient but also to the lowest proportion of correct diagnoses. The PET-CT strategy and the combined PET-CT and DW-MRI strategy were dominated by the EUA strategy because these strategies led to a smaller or equal proportion of correct diagnoses, at higher costs. Figure 2 shows the proportion of correct diagnoses as well as the proportion of unnecessary EUA indications per strategy. All imaging strategies led to a considerably lower proportion of unnecessary EUA indications compared to the EUA strategy. The combined PET-CT and DW-MRI strategy led to the lowest proportion of unnecessary EUA indications, i.e. 38%.

Number of EUA indications and missed residual disease
In Fig. 3, the total number of EUA indications, the number of unnecessary EUA indications and the expected costs per patient are shown for 1000 patients. In the EUA strategy, all patients underwent EUA, of which 891 indications were unnecessary. All imaging strategies substantially reduced the number of required EUAs as well as the number of unnecessary EUA indications. The DW-MRI required the lowest number of EUAs, i.e. 109 EUAs of which 43 indications were unnecessary. However, this strategy also had the highest number of missed residual disease. Please note that the combined PET-CT and DW-MRI required 65 additional EUAs compared to the DW-MRI strategy, but a slightly lower proportion of these EUA indications was unnecessary.
The costs per true-positive case were for EUA 7175 Euros, for PET-CT 24,058 Euros, for DW-MRI 7588 Euros and for combined PET-CT and DW-MRI 21,387 Euros. Although EUA and DW-MRI are much cheaper, they detect less recurrences than PET-CT leading to delayed salvage treatment with potentially poorer outcome. The addition of DW-MRI to PET-CT lowered the costs per true-positive case.

Univariate sensitivity analyses
When the test characteristics of DW-MRI were based on the literature, the proportion of correct diagnoses increased from 0.93 to 0.95. On the other hand, also the expected costs and proportion of unnecessary EUA indications increased. For the PET-CT strategy, the proportion of correct diagnoses remained 0.95 whereas the expected costs and proportion of unnecessary EUA indications decreased when we based the test characteristics on the literature.
In the majority of sensitivity analyses in which we varied test characteristics, additional testing after a positive EUA and the prevalence of residual disease, the ordering of the strategies based on the proportion of correct diagnoses, expected costs and proportion of unnecessary EUA indications did not change, as shown in Table 3. Only in the analysis in which the specificity of the diagnostic instruments was lowered with 10%, the ordering of the strategies changed; the EUA strategy led to the lowest proportion of correct diagnoses.

Discussion
This study explored the cost-effectiveness of four strategies used for response evaluation to detect local residual disease after CRT in patients with advanced staged OPSCC. In the EUA strategy, i.e. the reference When healthcare resources are limited, our study suggests that DW-MRI prior to EUA is the strategy of choice. This strategy is less costly than both the combined PET-CT and DW-MRI strategy and EUA strategy. Furthermore, it has a lower impact on scarce resources such as hospital stay compared to immediate EUA. Moreover, fewer patients are exposed to invasive EUA and of the EUA indications, a lower proportion is unnecessary. However, the proportion of correct diagnoses is lower than in the EUA strategy. This difference is caused by a higher proportion of false-negative test results meaning that residual tumour is missed.
Besides decreasing the number of unnecessary EUA indications, another possible advantage of imaging is that the results of the imaging test can be used to guide the biopsy taking. This may increase the sensitivity of EUA. As the surgeons in the trial were not blinded to the imaging results, it may be possible that the sensitivity of EUA without prior imaging is overestimated. We assessed the impact of this in sensitivity analyses by assuming a 10% lower sensitivity in the EUA strategy. This changed the ordering of the strategies; the combined PET-CT and DW-MRI strategy became the strategy with most correct diagnoses. Nevertheless, EUA detected only 60% of the residual tumours, questioning the degree of bias.
The test characteristics of PET-CT and DW-MRI were based on a trial in which all patients received PET-CT, DW-MRI and EUA 3 months after CRT. Timing is an important determinant of these test characteristics. If the test is conducted too soon after treatment, postradiation effects can lead to a false-positive test. Also false-negative test results are possible because tumour cells have not yet reached a detectable size. On the other hand, early diagnosis of residual disease increases treatment success of salvage surgery [11]. A study indicated that DW-MRI may be used for response evaluation 3  weeks after CRT [12]. The optimal timing of imaging has still to be determined, which might influence test characteristics. Furthermore, DW-MRI is a relative new imaging technique within oncological applications and therefore, little experience is gained with DW-MRI in the post-treatment evaluation so far. Besides, DW-imaging is susceptible to artefacts, particularly in the inhomogeneous head and neck area which contains a variety of tissues. Also, geometrical distortions due to interfaces between soft tissue and air or bone can occur. Although DW-MRI is not yet an established technique for response evaluation, this early assessment of the expected health effects and costs provides more insight in the potential of DW-MRI as a response evaluation strategy. We showed that DW-MRI is the strategy of choice when combined with PET-CT. However, a response evaluation strategy solely based on DW-MRI followed by EUA in individuals with a positive imaging test is only preferred in settings with limited health care resources. Nevertheless, radiologists are still in the learning curve concerning post-treatment evaluation of DW-MRI. With more training and feedback, DW-MRI is expected to obtain a higher accuracy. The impact of increased sensitivity was assessed in sensitivity analyses, but conclusions did not change.
Another emerging technique for response evaluation is PET-MRI. This technique combines the often complementary data from PET and MRI and could lead to improved anatomic localisation of focal uptake compared to PET-CT [13]. This could potentially decrease the number of false-positive test outcomes and as a consequence, reduce the number of unnecessary EUA indications.
The trial included only 46 patients of whom five were diagnosed with residual disease. Due to this small sample size, there was a fair amount of uncertainty regarding test characteristics and the prevalence of residual disease. Repeating the base-case analysis using test characteristics derived from the literature did not change our conclusion. Furthermore, our residual primary tumour rate of 11% was in agreement with the results of Moeller et al. (2009) [14]. On the other hand, Van den Broek et al. (2006) reported a residual primary tumour rate of 7% [1]. We have addressed this issue by varying the prevalence of residual disease in one-way sensitivity analyses. However, the ordering of the strategies based on correct diagnoses, costs and unnecessary EUA indications did not change.
In this cohort of patients, the residual primary tumour rate was only 11%. To improve the yield of routine response evaluation, only patients with high risk factors should undergo this diagnostic procedure. Risk factors which can be used to select patients include T-stage [15], HPV (human papilloma virus) status [16,17] and pre-treatment metabolic tumour volume [18].
We did not differentiate between HPV-related and HPVunrelated tumours. For HPV-related tumours, a lower residual disease rate is observed which is probably due to higher responsiveness to chemoradiation [19,20]. However, it is unclear whether HPV-related tumours have the same probability of being detected as HPV-unrelated tumours. In this study, 44% of the tumours were HPV positive whereas  all the patients with residual disease had a HPV negative tumour. Nevertheless, the small sample size precludes definite conclusions regarding differences in detection.
Outcomes of this study were the proportion of correctly diagnosed patients, costs concerning diagnostic instruments and the number of unnecessary EUA indications. This means that health benefits, i.e. the proportion of correct diagnoses, and treatment burden, i.e. unnecessary EUA indications, were evaluated separately. Moreover, not all health benefits and treatment burden were captured in these outcomes. For example, declined survival probabilities due to false-negative test results as well as side-effects of EUA were not taken into account. Outcomes that encompass all health benefits and treatment burden such as quality-adjusted life-years (QALYs) would therefore be preferable. Comparing costs per QALY would lead to a more comprehensive evaluation of the response evaluation strategies. However, it was not possible to calculate QALYs due to the trial design. Patients in the trial were subjected to all tests for response evaluation meaning that residual disease which may be missed by one test (false-negative) could be detected by one of the other tests. If patients would be subjected to only one test, as in the strategies evaluated in this study, patients with false-negative test results would have become symptomatically detected at a later point of time. Since earlier detection leads to improved survival probabilities [1,2], it was not possible to correctly estimate QALYs. As an alternative, we calculated costs per true-positive case for the different strategies.
We hypothesize that the ordering of strategies would not change when the evaluation was based on costs per QALY. A strategy using an instrument with a high sensitivity and few (unnecessary) EUA indications would be favoured because this strategy would lead to low rates of missed residual disease and low treatment burden. This means that the combined PET-CT and DW-MRI strategy would still be the strategy of choice. To assess this hypothesis, future studies should include costs per QALY.
To our knowledge, few studies have assessed the costeffectiveness of response evaluation in patients with head and neck cancer treated with CRT. Two studies compared a strategy of PET-CT scanning prior to neck dissection and up-front neck dissection for all patients. These studies showed that the imaging strategy was more cost-effective [9,21]. Although these studies did not compare imaging to EUA, they also indicate that imaging can be more costeffective than more invasive strategies.
In clinical practice, PET-CT can be used for response evaluation at the primary site and neck and to detect distant metastases simultaneously. Previous studies have shown that PET-CT is cost-effective for response evaluation after (chemo)radiation of the pretreatment advanced stage positive neck when only patients with a PET-CT positive neck underwent neck dissection compared to planned neck dissection in all patients [21,22]. Moreover, pretreatment screening for distant metastases using PET-CT appeared also to be cost-effective [23]. These evaluations by PET-CT add to the cost-effectiveness of response evaluation of the primary site as reported in the present study. For DW-MRI no data on costeffectiveness in response evaluation of neck disease is available. Also for response evaluation of advanced nodal neck disease the combination of PET-CT and DW-MRI seems to have the highest sensitivity and specificity [24].
Results of cost-effectiveness studies can be used for the development of a guideline for response evaluation in patients with OPSCC. The need for such a guideline is underlined by a previous study showing that there is substantial variation in the diagnostic tests used for response evaluation [3]. By including studies on costeffectiveness in the guideline development process, guideline recommendations will not only be based on the most effective strategy, but on the most cost-effective strategy. This will lead to more sensible use of scarce healthcare resources. This is the first cost-effectiveness study evaluating different response evaluation strategies. However, there was considerable uncertainty regarding important model parameters due to the small sample size of the trial. Additional studies, preferably based on trials with a larger sample size, are required to provide a comprehensive evidence base for guideline development.

Conclusions
This study suggests that combined PET-CT and DW-MRI is the strategy of choice for local response evaluation. It is preferred over immediate EUA since it costs only an additional €927 while reaching the same diagnostic accuracy and leading to substantially less unnecessary EUA indications and its associated burden. However, if healthcare resources are limited, DW-MRI is the preferred strategy because of lower costs while maintaining the number of unnecessary EUA indications low.