Minimally important differences for the EORTC QLQ-C30 in prostate cancer clinical trials
BMC Cancer volume 21, Article number: 1083 (2021)
The aim of the study was to estimate the minimally important difference (MID) for interpreting group-level change over time, both within a group and between groups, for the European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire Core 30 (EORTC QLQ-C30) scores in patients with prostate cancer.
We used data from two published EORTC trials. Clinical anchors were selected by strength of correlations with QLQ-C30 scales. In addition, clinicians’ input was obtained with regard to plausibility of the selected anchors. The mean change method was applied for interpreting change over time within a group of patients and linear regression models were fitted to estimate MIDs for between-group differences in change over time. Distribution-based estimates were also evaluated.
Two clinical anchors were eligible for MID estimation: performance status and the CTCAE diarrhoea domain. MIDs were developed for 7 scales (physical functioning, role functioning, social functioning, pain, fatigue, global quality of life, diarrhoea) and varied by scale and direction (improvement vs deterioration). Within-group MIDs ranged from 4 to 14 points for improvement and from −13 to −5 points for deterioration, and MIDs for between-group differences in change scores ranged from 3 to 13 for improvement and from −10 to −5 for deterioration.
Our findings aid the meaningful interpretation of changes on a set of EORTC QLQ-C30 scale scores over time, both within and between groups, and for performing more accurate sample size calculations for clinical trials in prostate cancer.
While the importance of assessment of patient-reported outcomes (PROs) to measure health-related quality of life (HRQOL) in cancer clinical trials is no longer an issue of debate, difficulties in understanding the meaningfulness of resulting scores [1,2,3] remain a barrier for using them to their full potential. Statistical significance of observed differences and changes does not necessarily equate to clinical relevance nor does it reflect the importance of that difference or change for a patient. The concept of minimal important difference (MID) as “the smallest difference in score in the outcome of interest that informed patients or informed proxies perceive as important, either beneficial or harmful, and which would lead the patient or clinician to consider a change in the management”  is but one important component in the provision of an interpretation framework which allows putting PRO results into perspective. The MID of an instrument transforms the metric of the score into a clinical experience which not merely makes score changes actionable on a patient level, but may provide decision thresholds in testing the relative efficiency of treatments and inform the calculation of required sample sizes and numbers needed to treat (NNT) [4, 5].
MIDs can be determined in different ways, which are broadly classified as anchor-based or distribution-based methods. Anchor-based methods link PRO scores to external criteria of clinically relevant change, such as patient or clinical ratings, whereas distribution-based methods only consider the statistical distribution of the scores, e.g. defining an MID as a change larger than a pre-defined variation of measurement error.
Both methods have their strengths and their weaknesses. The anchor-based approach for instance strongly relies on the selection of appropriate anchors which may vary between conditions and settings. The distribution-based approach lacks a patient or clinical perspective and clinically relevant changes might be much more sample dependent.
As King et al. (2011) highlight, there is no universal MID but rather a set of MIDs for instruments and scales, different conditions and clinical settings. Furthermore, a distinction needs to be made between guidelines for group-level and individual-patient-level interpretation of PRO scores. Therefore, it is recommended that MID selection should not rely solely on a rule of thumb, but must take into account the pre-specified research question at hand and knowledge of existing MIDs applicable to the study-specific instrument or scale.
A number of MID estimates have been provided for the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire Core 30 (QLQ-C30), the most frequently used HRQOL measure in cancer research. These include both anchor-based approaches using patient ratings and clinical variables as anchors as well as distribution-based methods using data pooled across studies and cancer sites. Acknowledging that MIDs might differ across scales, direction of change (improvement vs deterioration) and cancer sites, an ongoing EORTC project aims at expanding the portfolio of QLQ-C30 MIDs by adding MIDs for each scale for different cancer sites. Here we focus on prostate cancer, which currently accounts for 21% of cancers in men in the US. Localized prostate cancer may be cured with surgery or radiation therapy, while in advanced disease hormonal, chemotherapeutic, and radionuclide therapies target the delay of progression and the palliation of symptoms. The two disease situations entail different conglomerates of symptoms and HRQOL issues associated with either disease or treatment or both. Problems with urinary function are most frequently observed in patients who have undergone prostatectomy, bowel problems have been linked to radiation therapy, and problems with sexual function have been associated with surgical procedures, hormonal therapy, as well as with the disease itself. The extent of psychological distress imposed by symptoms and the impact of the disease on general HRQOL and functioning aspects differ across patient groups. Considering that there is a lack of clear consensus on the optimal treatment strategy in many curative and palliative clinical situations with regard to survival [15, 16], HRQOL parameters are essential in future treatment studies and in clinical decision making.
To support the use of HRQOL outcomes in prostate cancer research and to improve the interpretation of HRQOL scores in this population we here present the following QLQ-C30 MIDs for this patient group: (1) MIDs for within-group change in HRQOL scores over time and (2) MIDs for between-group differences in HRQOL change over time.
Data were derived retrospectively from two EORTC phase III trials in prostate cancer. Trial 1 (EORTC 22961) evaluated long-term or short-term androgen suppression combined with irradiation in locally advanced prostate cancer. Trial 2 (EORTC 22991) compared the effectiveness of radiation therapy with or without bicalutamide and goserelin in treating patients who have localized prostate cancer. Both trials collected HRQOL longitudinally using the EORTC QLQ-C30.
The EORTC QLQ-C30
The EORTC QLQ-C30 consists of 30 questions that form 15 scales: 5 functioning scales (physical, role, emotional, social, and cognitive), 9 symptom scales (fatigue, nausea and vomiting, pain, dyspnoea, insomnia, appetite loss, constipation, diarrhoea and financial difficulties) and one global health status/QoL scale. Trial 1 used version 2 of the EORTC QLQ-C30, whereas trial 2 used version 3. The two versions differ only in the response categories of questions 1–5, which are coded as yes/no in version 2. In version 3, responses are provided on a four-point Likert scale from ‘not at all’ to ‘very much’ for all questions, with the exception of the global health status and quality of life items, which are rated from 1 ‘very poor’ to 7 ‘excellent’. Scoring was done according to the scoring manual, with the means of the raw scores for each scale transformed to fall between 0 and 100. For consistency in signs, all scales were scored such that 0 represents the worst possible score and 100 the best possible score. The financial impact scale was omitted from the analysis.
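The scoring step described above can be sketched as follows. This is a minimal illustration rather than the scoring code used in the study: the function name and arguments are hypothetical, and the rule of scoring a scale only when at least half of its items are answered is assumed here as the missing-data convention. The block averages the answered items of one scale, rescales the mean linearly to 0–100, and reverses the direction where a higher raw response means more problems, so that 100 is always the best possible score.

```python
def score_scale(responses, item_range, higher_raw_is_better=False):
    """Transform the item responses of one scale to a 0-100 score (100 = best).

    responses: item responses for the scale, with None for a missing item.
    item_range: maximum minus minimum response (3 for four-point items,
        6 for the seven-point global items, 1 for yes/no items).
    higher_raw_is_better: True for the global health status/QoL items,
        False for items where a higher response means more problems.
    """
    answered = [r for r in responses if r is not None]
    # Assumed convention: score only if at least half of the items are answered.
    if not answered or 2 * len(answered) < len(responses):
        return None
    raw_score = sum(answered) / len(answered)
    fraction = (raw_score - 1) / item_range
    if not higher_raw_is_better:
        # Reverse so that fewer problems give a higher score.
        fraction = 1 - fraction
    return 100 * fraction


# Hypothetical responses, not study data.
fatigue = score_scale([2, 2, 3], item_range=3)
global_qol = score_scale([6, 5], item_range=6, higher_raw_is_better=True)
print(round(fatigue, 1), round(global_qol, 1))
```

With this convention, a patient answering ‘not at all’ to every item of a symptom scale scores 100 on that scale, which matches the sign convention used for the MIDs reported here (negative change means deterioration).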
For each EORTC QLQ-C30 scale, we selected several anchors from the clinical variables available in the data sets (e.g. WHO performance status (PS)). Selection was based on cross-sectional correlations (either polyserial or polychoric) between the scales and the anchors, requiring an acceptable correlation of at least 0.3 in absolute value. The aim was to use several anchors for each EORTC QLQ-C30 scale to provide some assurance about the plausibility of the estimated MIDs. Clinical input was provided by a panel of four prostate cancer / HRQOL experts to assure the clinical plausibility of the statistically selected anchors. Please refer to Musoro et al. for details on the anchor selection methodology.
Definition of clinical change groups
As described in earlier publications on the project [12, 20,21,22,23], the three clinical change groups (CCGs) defined by an expert panel were: (i) deterioration (worsened by 1 anchor category), (ii) stable (no change in anchor category) and (iii) improvement (improved by 1 anchor category). Patients changing by ≥2 anchor categories were considered to have changed more than just “minimally” and hence were excluded from MID estimation.
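The grouping rule above can be written as a small function. This is a sketch under the assumption that a higher anchor category is worse, as is the case for both PS (0 to 4) and CTCAE grades (0 to 4). The function name is hypothetical.

```python
def clinical_change_group(anchor_before, anchor_after):
    """Classify a change in anchor category between two time points.

    Assumes a higher anchor category indicates a worse clinical state.
    Returns None when the anchor changed by two or more categories,
    because such changes are excluded from MID estimation.
    """
    shift = anchor_after - anchor_before
    if shift == 0:
        return "stable"
    if shift == 1:
        return "deterioration"
    if shift == -1:
        return "improvement"
    return None  # more than a minimal change: excluded


# Hypothetical PS values at two assessments.
print(clinical_change_group(1, 2))  # deterioration
print(clinical_change_group(0, 2))  # None (excluded)
```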
The analyses have been described in more detail in previous publications [12, 20,21,22,23]. Overall, two approaches to MID estimation were applied: the anchor-based and the distribution-based approach.
For the anchor-based approach, change scores for each scale and anchor pair were computed across all pairwise time points, and MIDs for improvement and deterioration were estimated by calculating the mean HRQOL change score of patients classified as improved and deteriorated, respectively (within-group MIDs). To estimate between-group MIDs (i.e. the differences in change over time between two groups of patients), linear regression models were fitted, one for each scale. Generalized estimating equations (GEE) were used to correct for the effect of patients contributing change scores to several CCGs (and more than one to a specific CCG). Furthermore, we checked whether MIDs varied by trial in a regression model. To account for multiple testing (EORTC QLQ-C30 scales), statistical significance was set at 1%.
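The two anchor-based estimates can be illustrated with a simplified sketch. The within-group estimate is the mean change score in a change group. For the between-group estimate, the sketch assumes that the regression covariate is a binary indicator comparing a change group with the stable group, in which case the ordinary least squares slope equals the difference between the two group means. The GEE correction for repeated contributions by the same patient is deliberately omitted, so this is not the model fitted in the study. All names and numbers are hypothetical.

```python
from statistics import mean


def within_group_mid(change_scores, group):
    """Mean HRQOL change score of one clinical change group."""
    return mean(change_scores[group])


def between_group_mid(change_scores, group, reference="stable"):
    """Difference in mean change between a change group and the stable group.

    Equivalent to the slope of an ordinary least squares regression of the
    change score on a 0/1 indicator (reference group = 0, change group = 1).
    Correlation between repeated observations is ignored in this sketch.
    """
    return mean(change_scores[group]) - mean(change_scores[reference])


# Hypothetical change scores for one scale, grouped by anchor change.
change_scores = {
    "improvement": [10, 0, 8, 6],
    "stable": [0, 2, -2, 4],
    "deterioration": [-10, -6, -8],
}

print(within_group_mid(change_scores, "improvement"))     # 6
print(between_group_mid(change_scores, "improvement"))    # 5
print(between_group_mid(change_scores, "deterioration"))  # -9
```

The between-group value is smaller in magnitude than the within-group value whenever the stable group itself drifts in the same direction, which is why the two kinds of MID are reported separately.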
For the distribution-based approach, 0.3 SD, 0.5 SD and the standard error of measurement (SEM) were estimated at t1 (i.e. before or on the first day of treatment). As an effect size (ES) measure within CCGs, the means of the HRQOL change scores were divided by the standard deviations (SD) of the HRQOL change scores over all time points. An ES of 0.2 was considered small, 0.5 moderate and ≥ 0.8 large, and only anchor-based MIDs with mean changes with an ES between 0.2 and 0.8 were considered appropriate for inclusion as MIDs.
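A minimal sketch of these quantities follows. Two points are assumptions rather than details taken from the text: the SEM is computed with the common formula SD × √(1 − reliability), where the reliability coefficient must be supplied by the user, and the ES denominator is taken as the SD of the change scores within the change group being evaluated. Function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def distribution_based_estimates(baseline_scores, reliability):
    """Return 0.3 SD, 0.5 SD and SEM for one scale at baseline (t1).

    reliability: assumed reliability coefficient of the scale (0 to 1);
    the SEM formula SD * sqrt(1 - reliability) is a common convention.
    """
    sd = stdev(baseline_scores)
    return {
        "0.3 SD": 0.3 * sd,
        "0.5 SD": 0.5 * sd,
        "SEM": sd * sqrt(1 - reliability),
    }


def effect_size(change_scores):
    """Mean change divided by the SD of the change scores."""
    return mean(change_scores) / stdev(change_scores)


def es_supports_mid(es):
    """True if the absolute ES is at least small (0.2) but not large (0.8)."""
    return 0.2 <= abs(es) < 0.8


# Hypothetical data, not study data.
estimates = distribution_based_estimates([50, 60, 70, 80, 90], reliability=0.75)
es = effect_size([0, 10, -5, 15, 5])
print({k: round(v, 2) for k, v in estimates.items()}, round(es, 2), es_supports_mid(es))
```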
A total of 1937 patients were enrolled across the two trials. Patient characteristics at baseline are summarised in Table 1. The median follow-up time (in months) for HRQOL was 36.2 (SD = 23.4) and 38.5 (SD = 34.6) for trials 1 and 2, respectively. An overview of patient inclusion in the various analysis steps is summarised in Fig. A.1.
Fourteen potential clinical anchors were initially evaluated for the EORTC QLQ-C30 scales. After restricting to anchors with an absolute cross-sectional correlation ≥0.3 and seeking clinical input to confirm their clinical relevance, PS and CTCAE diarrhoea were retained. PS was scored from 0 (no symptoms of cancer) to 4 (bedbound) and CTCAE diarrhoea was graded from 0 (no toxicity) to 4 (life-threatening). As shown in Table 2, a clinical anchor was found for 7 of the 14 scales considered, with cross-sectional correlations ranging from 0.3 to 0.55 in absolute value, and the correlations between their change scores ranging from 0.2 to 0.4.
According to the anchor change scores, the majority of patients remained stable over time compared to patients who either improved or deteriorated (Table A.1). Anchor-based MIDs that are derived from anchor CCGs with a clinically important ES (≥ 0.2 and < 0.8) are summarised in Table 3. The full results across all CCGs are presented in Table A.2. Anchor-based MIDs were determined for deterioration in 7 EORTC QLQ-C30 scales, and in 3 scales for improvement. The MID estimates varied by scale and direction of change (improvement versus deterioration) and were always in the expected direction, i.e. positive versus negative mean change scores within the improvement versus deterioration CCGs, respectively. Within-group MIDs (from the mean-change method) ranged from 4 to 14 points for improvement and from −13 to −5 points for deterioration, while MIDs for between-group change (from the linear regression) ranged from 3 to 13 for improvement and from −10 to −5 for deterioration. The interaction effects between the binary anchor variable and the trial indicator showed no statistically significant differences for either improving or deteriorating scores (results not shown). This implies that the estimated MIDs did not depend on the trial. In comparison to the distribution-based estimates presented in Table 3, apart from the diarrhoea scale, anchor-based MIDs for improvement were closer to 0.3 SD. For deterioration, anchor-based MIDs for the diarrhoea, physical functioning and role functioning scales were closer to 0.5 SD, while estimates for the remaining scales ranged between 0.3 SD and 0.5 SD. Distribution-based estimates for all 14 EORTC QLQ-C30 scales that were considered in this study are presented in Table A.3.
Our analyses were part of an EORTC project on MID development for the QLQ-C30 scales in various cancer entities and add prostate-specific MIDs to the EORTC MID portfolio.
The main results of the study are anchor-based MIDs for deterioration for seven QLQ-C30 scales (physical functioning, role functioning, social functioning, pain, fatigue, global quality of life, diarrhoea), and for improvement for three QLQ-C30 scales (role functioning, social functioning, diarrhoea), both for within-group and between-group differences. MIDs varied by scale and direction (between 5 and 13 points for deterioration and 4 and 10 points for improvement), whereby the direction was always in accordance with the anchor change category (i.e. anchor scores indicating a low health status were associated with lower HRQOL scores). This compares well to MIDs already developed in this EORTC project for head and neck cancer, advanced breast cancer, malignant melanoma, colorectal and ovarian cancer, as well as to other similar research [26,27,28]. With two exceptions (global quality of life, diarrhoea), these MIDs were larger for deterioration than for improvement. This aligns with existing findings even beyond the QLQ-C30 [17, 20, 21], suggesting that patients may have a higher sensitivity to favourable differences [26, 29, 30]. However, this effect is not universal, as other studies have reported no systematic differences in the magnitude of change between deteriorating and improving scores [15, 19, 22].
Overall, our MID estimates, with few exceptions, lie between 5 and 10 points, which corresponds to the thresholds suggested by Osoba et al. in 1998, where patients’ reports on subjective change were used as clinical anchors. While these thresholds had been developed in breast and small-cell lung cancer patients, they have also been observed in various other cancer sites [21,22,23, 26,27,28, 31].
There seems to be a certain universality of an MID of 5–10 points on QLQ-C30 scales, but smaller and larger MIDs have been repeatedly found, especially for role functioning [20, 23], including in the present study. This highlights that scales and cancer sites should not all be treated alike.
There are, however, some limitations to be considered when interpreting the presented results.
Most importantly, after careful evaluation of 14 potential clinical anchors, only the CTCAE diarrhoea scale and the WHO PS were suitable for MID estimation, as the others showed low correlations with HRQOL scales. A reason may be found in a certain insensitivity of these rating systems to HRQOL differences, due to a low interrater reliability in toxicity identification with CTCAE or somewhat wide WHO PS categories (e.g. between 0, fully active, and 1, able to carry out light work). Ideally, multiple anchors would be considered, including patient self-reports, which might be able to shed some light on the issue of subjectively perceived change on different scales. Furthermore, it has to be noted that in the present statistical approach ordinal scales are treated as interval scales, disregarding the fact that a difference between “not at all” and “a little” might be different from the difference between “quite a bit” and “very much”. This is where item-response-theory based methods can provide valuable information in future research. Finally, only two trials could be included, neither of which covered metastasised disease. Hence, the anchor-based MIDs developed here need to be applied with caution to a prostate cancer population with stage IV disease. A further limitation is that, based on the available data, no anchor-based MIDs for improvement could be developed for some scales. This needs to be covered by future research along with the investigation of additional anchors to further approach the concept of minimal change. Meanwhile, the presented distribution-based MIDs may provide some guidance.
It is a strength of the present study, though, that MIDs did not vary across the different data sources, i.e. a trial in locally advanced prostate cancer on the effect of androgen suppression and a trial on the effectiveness of radiation therapy with or without bicalutamide and goserelin in localized prostate cancer, indicating a certain stability of the estimated values. Our results may therefore support sound hypotheses for HRQOL in clinical trials targeting similar patient groups.
In general, it is acknowledged that MIDs are dynamic and that we should not be expecting one single MID for each scale of an instrument, nor should we expect them to be the same across different conditions. Therefore, the proper application of MIDs always includes the careful selection of the most appropriate estimate, considering the specific condition and decision context. Note that the current findings are part of a larger project that aims to develop an evidence-based MID catalogue that is more refined than the commonly used single value rule-of-thumb. We aim to further perform a comprehensive synthesis of MID estimates to identify plausible ranges based on patterns across multiple cancer sites, and to expand the estimation methodology beyond retrospective clinical anchors.
In conclusion, the MIDs presented here contribute to the meaningful interpretation of group-level changes (mostly deterioration) on a set of QLQ-C30 scales in prostate cancer patients undergoing treatment and may facilitate more accurate sample size estimation in trials with HRQOL endpoints. They may also be useful benchmarks in clinical practice, where they can help the early detection of patients with relevant changes in health status. Further research is needed to confirm our findings and to extend the MID set for improvements, which may be important to detect relevant changes in early-stage prostate cancer and in survivors.
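As an illustration of how an MID might feed into sample size estimation, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 × ((z₁₋α/₂ + z₁₋β) × SD / MID)². This is a generic textbook calculation and not a method described in this study. The function name, the SD of 20 points, the two-sided 5% significance level and the 80% power are all hypothetical choices.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(mid, sd, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect a between-group difference
    equal to the MID, using the two-sample normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / mid) ** 2)


# Hypothetical inputs: a 10-point MID and a 20-point SD of change scores.
print(patients_per_group(mid=10, sd=20))  # 63
```

Because the required sample size scales with the inverse square of the MID, halving the target difference roughly quadruples the number of patients, which is why scale-specific MIDs can matter for trial planning.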
Availability of data and materials
The data that support the findings of this study are available from the corresponding author upon reasonable request (see https://www.eortc.org/data-sharing/). Code is available upon request.
EORTC: European Organisation for Research and Treatment of Cancer
QLQ-C30: Quality of Life Questionnaire Core 30
CCGs: clinical change groups
CTCAE: common terminology criteria for adverse events
HRQOL: health-related quality of life
MID: minimal important difference
NNT: numbers needed to treat
SEM: standard error of measurement
Yost KJ, et al. Minimally important differences were estimated for the functional assessment of cancer therapy-colorectal (FACT-C) instrument using a combination of distribution- and anchor-based approaches. J Clin Epidemiol. 2005;58(12):1241–51.
Lydick E, Epstein RS. Interpretation of quality of life changes. Qual Life Res. 1993;2(3):221–6.
Guyatt GH, et al. Methods to explain the clinical significance of health status measures. Mayo Clin Proc. 2002;77(4):371–83.
Schünemann HJ, Guyatt GH. Commentary–goodbye M(C)ID! Hello MID, Where do you come from? Health Serv Res. 2005;40(2):593–7.
Schunemann HJ, Akl EA, Guyatt GH. Interpreting the results of patient reported outcome measures in clinical trials: the clinician's perspective. Health Qual Life Outcomes. 2006;4(1):62. https://doi.org/10.1186/1477-7525-4-62.
Revicki D, et al. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. J Clin Epidemiol. 2008;61(2):102–9.
King MT. A point of minimal important difference (MID): a critique of terminology and methods. Expert Rev Pharmacoecon Outcomes Res. 2011;11(2):171–84.
Giesinger JM, Efficace F, Aaronson N, Calvert M, Kyte D, Cottone F, Cella D, Gamper EM. Past and current practice of patient-reported outcome measurement in randomized cancer clinical trials: a systematic review. Value Health. 2021 Apr;24(4):585–91.
Osoba D, Rodrigues G, Myles J, Zee B, Pater J. Interpreting the significance of changes in health-related quality-of-life scores. J Clin Oncol. 1998;16(1):139–44.
Cocks K, et al. Evidence-based guidelines for interpreting change scores for the European organisation for the research and treatment of Cancer quality of life questionnaire Core 30. Eur J Cancer. 2012 Jul;48(11):1713-21.
King MT. The interpretation of scores from the EORTC quality of life questionnaire QLQ-C30. Qual Life Res. 1996;5(6):555–67. https://doi.org/10.1007/BF00439229.
Musoro JZ, et al. Establishing anchor-based minimally important differences (MID) with the EORTC quality-of-life measures: a meta-analysis protocol. BMJ Open. 2018 Jan 10;8(1):e019117.
Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J Clin. 2020 Jan;70(1):7–30.
Eton DT, Lepore SJ. Prostate cancer and health-related quality of life: a review of the literature. Psycho-Oncology. 2002;11(4):307–26. https://doi.org/10.1002/pon.572.
Luz MA, et al. Consensus on prostate Cancer treatment of localized disease with very low, low, and intermediate risk: a report from the first prostate Cancer consensus conference for developing countries (PCCCDC). JCO Glob Oncol. 2021 Apr;7:523–9.
Pratsinis M, Halabi S, Güsewell S, Gillessen S, Omlin A. In-depth analysis of the 2019 advanced prostate Cancer consensus conference: the importance of representation of medical specialty and geographic regions. Eur Urol Open Sci. 2021;26:14–7. https://doi.org/10.1016/j.euros.2021.01.010.
Bolla M, et al. Duration of androgen suppression in the treatment of prostate cancer. N Engl J Med. 2009;360(24):2516–27.
Bolla M, et al. Short Androgen Suppression and Radiation Dose Escalation for Intermediate- and High-Risk Localized Prostate Cancer: Results of EORTC Trial 22991. J Clin Oncol. 2016;34(15):1748–56.
Fayers P, Aaronson NK, Bjordal K, Groenvold M, Curran D, Bottomley A. EORTC QLQ-C30 Scoring Manual. 3rd ed. European Organisation for Research and Treatment of Cancer; 2001.
Musoro JZ, et al. Interpreting European Organisation for Research and Treatment for Cancer Quality of life Questionnaire core 30 scores as minimally importantly different for patients with malignant melanoma. Eur J Cancer. 2018;104:169–81.
Musoro JZ, et al. Minimally important differences for interpreting EORTC QLQ-C30 scores in patients with advanced breast cancer. JNCI Cancer Spectr. 2019;3(3):pkz037.
Musoro JZ, Coens C, Singer S, Tribius S, Oosting SF, Groenvold M, et al. Minimally important differences for interpreting European Organisation for Research and Treatment of Cancer quality of life questionnaire Core 30 scores in patients with head and neck cancer. Head & neck. 2020;42(11):3141–52. https://doi.org/10.1002/hed.26363.
Musoro JZ, Sodergren SC, Coens C, Pochesci A, Terada M, King MT, et al. Minimally important differences for interpreting the EORTC QLQ-C30 in patients with advanced colorectal cancer treated with chemotherapy. Colorectal disease : the official journal of the Association of Coloproctology of Great Britain and Ireland. 2020;22(12):2278–87. https://doi.org/10.1111/codi.15295.
Liang KY, Zeger SL. Regression analysis for correlated data. Annu Rev Public Health. 1993;14(1):43–68. https://doi.org/10.1146/annurev.pu.14.050193.000355.
Cohen J. Statistical Power Analysis for the Behavioural Sciences. 2nd ed. Hillsdale: Lawrence Erlbaum Associates; 1988.
Cocks K, et al. Evidence-based guidelines for interpreting change scores for the European Organisation for the Research and Treatment of Cancer Quality of Life Questionnaire Core 30. Eur J Cancer. 2012;48(11):1713–21.
Cocks K, et al. Evidence-based guidelines for determination of sample size and interpretation of the European Organisation for the Research and Treatment of Cancer Quality of Life Questionnaire Core 30. J Clin Oncol. 2011;29(1):89–96.
Maringwa J, et al. Minimal clinically meaningful differences for the EORTC QLQ-C30 and EORTC QLQ-BN20 scales in brain cancer patients. Ann Oncol. 2011;22(9):2107–12.
Ringash J, et al. Interpreting clinically significant changes in patient-reported outcomes. Cancer. 2007;110(1):196–202.
Cella D, et al. Clinical consensus meeting group. Group vs individual approaches to understanding the clinical significance of differences or changes in quality of life. Mayo Clin Proc. 2002;77:384–92.
Maringwa JT, et al. Minimal important differences for interpreting health-related quality of life scores from the EORTC QLQ-C30 in lung cancer patients participating in randomized controlled trials. Support Care Cancer. 2011;19(11):1753–60.
Fairchild AT, et al. Interrater reliability in toxicity identification: limitations of current standards. Int J Rad Oncol Biol Phys. 2020;107(5):996–1000.
We thank the EORTC Genito-Urinary Tract Cancer Group members and their clinical investigators, and all the patients who participated in the trials that we used for this analysis. The present project was supported by an EORTC research grant (EORTC MID Grant).
This study was funded by the EORTC Quality of Life Group. The sponsor took part in the design of the study and approved the final version of the manuscript.
Ethics approval and consent to participate
This is a retrospective analysis of two EORTC trials (EORTC 22961 and EORTC 22991). Approval for the study was waived by EORTC headquarters. The use of the patient data from the various studies fell under their original informed consent wording and no additional consent was needed from patients or local ethical committees. Data sharing and analysis were carried out under the General Data Protection Regulation of the European Union.
Consent for publication
No author reports any competing interests. Jammbe Z Musoro, Corneel Coens, Claudette Falato and Andrew Bottomley are employed by the European Organisation for Research and Treatment of Cancer (EORTC), Brussels, Belgium.
Dr. Velikova reports personal fees from Roche, personal fees from Eisai, personal fees from Novartis, grants from Breast Cancer Now, grants from EORTC, grants from Yorkshire Cancer Research, grants from Pfizer, outside the submitted work.
Dr. Gamper reports grants from EORTC, outside the submitted work.
Cite this article
Gamper, E.M., Musoro, J.Z., Coens, C. et al. Minimally important differences for the EORTC QLQ-C30 in prostate cancer clinical trials. BMC Cancer 21, 1083 (2021). https://doi.org/10.1186/s12885-021-08609-7
- Health-related quality of life
- Interpretation of scores
- Patient-reported outcomes
- European Organisation for Research and Treatment of Cancer