Table 5 Psychometric Qualities of Identified Instruments

From: Health-related quality of life in women with breast cancer: a review of measures

Instrument Reference

Reliability

Validity

EORTC QLQ-BR23

 Alawadhi et al. [10]

▪ The intraclass correlation for the test-retest statistic and the internal consistency values for the multi-item scales were > 0.7

▪ With the exception of the pain subscale, all items met the item internal consistency criterion of > 0.4 correlation with the corresponding scale.

▪ The QLQ-BR23 performed better than the QLQ-C30 for item discriminant validity

▪ The scale scores discriminated between patients at different disease stages and between sick and well populations.

 Bener et al. [11]

▪ 6 of the 9 subscales met the standards of reliability, with coefficients ranging from 0.55 to 0.89

▪ Patients with advanced breast cancer (stages III-IV) had significantly higher symptomatic scores than those in early stages for physical function, cognitive function, fatigue, insomnia, appetite loss, constipation, and financial difficulties.

▪ Correlation coefficients between each item ranged from − 0.113 to 0.960; item 21 (tense) and item 23 (irritable) had the strongest negative correlations with their corresponding emotional functioning subscale, whereas items 29 (physical condition) and 30 (overall QoL) had the strongest positive correlations with the Global Health/QoL subscale.

▪ Item 6 (limited work) showed a higher correlation with fatigue (r = 0.749).

▪ Item 19 (pain interfered with daily activities) of the pain subscale had higher correlations with physical functioning, role functioning, and fatigue subscales.

 Bjelic-Radisic et al. [12]

▪ NR

▪ Phases 1 and 2 results indicated the need to supplement the original QLQ-BR23, with additional items related to newer therapeutic options.

▪ The phase 3 study recruited a total of 250 patients from 12 countries. After the qualitative and quantitative analyses, the final updated phase 3 module contained a total of 45 items: 23 items from the QLQ-BR23 and 22 new items. The new items contain two multi-item scales: target symptom scale (20 items) and satisfaction scale (2 items). The target symptom scale can be further divided into 3 subscales: endocrine therapy scale, endocrine sexual scale, and skin/mucosa scale.

 Cerezo et al. [13]

▪ Cronbach alpha of all multi-item scales showed values ≥0.7, except for Cognitive and Breast symptoms scales (0.52 and 0.65, respectively)

▪ Convergent and divergent validity was adequate

▪ Patients with early stages (n = 77) showed better functional scores and lower symptoms scores than patients with locally advanced breast cancer (n = 157)

▪ Score means variation after responsiveness analysis demonstrated high sensitivity to change after breast cancer surgery

 El Fakir et al. [14]

▪ Cronbach alpha coefficients were all > 0.7, except for breast symptoms and arm symptoms

▪ All items exceeded the 0.4 criterion for convergent validity, except items 20 and 23 related to pain and skin problems in the affected breast, respectively

 Keilmann et al. [15]

▪ In the test of parallel forms reliability, no statistically significant differences were seen in the majority of the single items (27/30) nor in one of the scales; the test of consistency showed statistically significant correlations in 29 of 30 single items and 12 of 15 scales

▪ NR

 Michels et al. [16]

▪ Cronbach alpha for the EORTC QLQ-C30 ranged from 0.72 to 0.86 and from 0.78 to 0.83 for the EORTC QLQ-BR23

▪ Most questions were confirmed in the confirmatory factorial analysis

▪ In the construct validity analysis, the questionnaires were capable of differentiating patients with or without lymphedema, apart from the symptom scales of both questionnaires

▪ Both questionnaires presented a significant correlation in most domains of the SF-36 in the convergent validity analysis

▪ Only a few criticisms were reported concerning questions, and the mean grade of understanding was high (QLQ-C30 = 4.91 and QLQ-BR23 = 4.89)

 Shuleta-Qehaja et al. [17]

▪ Cronbach alpha ranged from 0.54 for the cognitive functioning scale to 0.96 for the global health quality of life (GH/QoL) scale

▪ In multitrait scaling analysis, the strength of Spearman correlations between an item and its own subscale was ≥0.40, with the exception of item 5 (ρ = 0.22); results for item discriminant validity were satisfactory, with the exception of item 5, which showed higher correlation with other subscales than with its own physical functioning subscale.

▪ The Spearman interscale coefficients generally were correlated with each other. Results of known-group comparisons did not show significant differences in terms of disease stage. Regarding education level, patients with high school/university education had better functional scale scores only in certain subscales compared with other subgroups; furthermore, patients with secondary school education had better GH/QoL compared with other subgroups of patients.

 Simons [18]

▪ NR

▪ Nausea and vomiting were positively correlated with the reported incidence of nausea as an adverse event (0.126 [P = 0.03]) and vomiting (P < 0.01); false-positives were negatively correlated (− 0.329 [P < 0.01] and − 0.352 [P < 0.01], respectively).

▪ Constipation also satisfied criteria for content validity for correspondence (0.189 [P < 0.01]) and for lack of false-positives (− 0.394 [P < 0.01]).

▪ Diarrhea had a correspondence of 0.292 (P < 0.01) and specificity of − 0.260 (P < 0.01).

▪ Dyspnea had a correspondence of 0.226 (P < 0.01) and a specificity of − 0.349 (P < 0.01).

▪ Insomnia failed the criteria.

▪ Upset by hair loss was weakly correlated with alopecia but very specific (− 0.479; P < 0.01).

 Snyder et al. [19]

▪ The same six QLQ-C30 domains with area under the curve (AUC) values ≥0.70 in the original analysis had AUC values ≥0.70 in the replication sample

▪ Cutoff scores were identified with sensitivity ≥0.84 and specificity ≥0.54

 Tan et al. [20]

▪ Cronbach alpha coefficient results for EORTC QLQ-C30 and QLQ-BR23 were 0.846 and 0.873, respectively

▪ The correlation between EORTC QLQ-C30 and EQ-5D QoL instruments demonstrated a modest linear relationship (r = 0.597; P < 0.001) that indicated a moderately strong correlation between the two measures

 Wallwiener et al. [21]

▪ No differences in terms of acceptance between paper and electronic patient-reported outcome

▪ No significant difference in response behavior between paper and electronic patient-reported outcome

▪ NR

 Wallwiener et al. [22]

▪ High correlations were shown for both dimensions of reliability (parallel forms reliability and internal consistency) in the patient’s response behavior between paper- and electronic-based questionnaires

▪ Regarding the test of parallel forms reliability, no significant differences were found in 27 of 30 single items and in 14 of 15 scales, whereas a statistically significant correlation in the test of consistency was found in all 30 single items and all 15 scales

▪ NR

 Zhang et al. [23]

▪ Cronbach alpha coefficients were close to or greater than 0.7, except for breast symptoms (0.615)

▪ Multitrait scaling analysis demonstrated a good convergent and divergent validity of EORTC QLQ-BR23 and EORTC QLQ-C30

▪ Using SF-36 as a reference standard to evaluate the dimensions of EORTC QLQ-BR23, most items in EORTC QLQ-BR23 possessed a favorable correlation with its own dimension (r > 0.4)

▪ A statistically significant difference was discovered in dimension scores between patients grouped by ECOG scores except for individual dimensions

FACT-B

 Algamdi and Hanneman [24]

▪ Cronbach alpha was 0.91 for the FACT-BA, and 0.43–0.89 for the FACT-BA subscales

▪ NR

 Cheung et al. [25]

▪ NR

▪ In a cross-sectional setting, the differences in the effect size favored EQ-5D-5L and the 90% CIs totally fell within the zone that indicated the noninferiority of the EQ-5D-5L (e.g., oncologist-assessed performance status: − 0.26 to 0.04; patient-assessed performance status: − 0.48 to − 0.16; current evidence of disease: − 0.28 to 0.08). In a longitudinal setting, the FACT-B showed larger effect sizes and ICCs than the EQ-5D-5L. The 90% CIs, however, overlapped the noninferiority margin, thus noninferiority in these two aspects could not be confirmed

 Jarkovsky et al. [26]

▪ Similar to other validations of FACT-B translations; good reliability, sensitivity, and reliable internal structure after translation

▪ NR

 Kobeissi et al. [27]

▪ NR

▪ The following questions were perceived to be most important: ability to meet the needs of my family, pain, emotional support, worry that my condition will get worse, sleep, worry that other family members will get the disease, change in weight, and pain in different areas of the body.

▪ Instrument was perceived to be adequate, appropriate for use, culturally sensitive, simple, and exhaustive.

 Lee et al. [28]

▪ For test-retest reliability, the confidence intervals of the differences in ICC overlapped the noninferiority margin

▪ Using performance status, evidence of disease, and treatment status as criteria, the differences (FACT-B minus EQ-5D-3L) in the effect size for discriminative ability were negative or close to 0, and the 90% confidence intervals (CIs) fell within the zone that indicated noninferiority of EQ-5D-5L

▪ For responsiveness, the CIs of the differences in effect size overlapped the noninferiority margin (difference in effect size [90% CI], FACT-B vs. EQ-5D-5L: 0.04 [− 0.79, 0.95])

 Matthies et al. [29]

▪ High correlations were shown for both dimensions of reliability (parallel forms reliability and internal consistency) in the patients’ response behavior between paper-based and electronically based questionnaires; regarding the reliability test of parallel forms, no significant differences were found in 35 of 37 single items, while significant correlations in the test for consistency were found in all 37 single items, in all 5 sum individual item subscale scores, and in total FACT-B score

▪ NR

 Ng et al. [30]

▪ Cronbach alpha for the FACT-B total score and Trial Outcome Index were 0.91 and 0.87 for the English-speaking sample, respectively; for the Chinese-speaking sample, they were both 0.88

▪ The ICCs for the FACT-B total score and Trial Outcome Index were 0.82 (95% CI, 0.74–0.87) and 0.84 (95% CI, 0.77–0.89), respectively, for the English-speaking sample; they were 0.88 (95% CI, 0.80–0.93) and 0.89 (95% CI, 0.81–0.94), respectively, for the Chinese-speaking sample

▪ The FACT-B total score and Trial Outcome Index demonstrated known-group validity in differentiating patients with different clinical status.

▪ The English version was responsive to the change in performance status. The Chinese version was shown to be responsive to decline in performance status, but the sample size of Chinese-speaking patients who improved in performance status was too small (N = 6) for conclusive analysis about responsiveness to improvement.

▪ Two items concerning sexuality had a high item nonresponse rate (50.2 and 14.4%).

▪ No practically significant difference was found in the total score and in the Trial Outcome Index between the two language versions despite minor differences in 2 of the 37 items.

▪ The English and Chinese versions of the FACT-B are valid, responsive, and reliable instruments in assessing health-related quality of life in Singaporean patients with breast cancer.

 Patoo et al. [31]

▪ Internal consistency using Cronbach alpha was 0.63 to 0.93 for the subscales and 0.92 for the total scale

▪ Significant correlations between FACT-B and other measures indicate that this scale had concurrent and discriminant validity. The values of fit indices were satisfactory

FBSI

 Lee et al. [32]

▪ For both language versions, the FBSI demonstrated sufficient test-retest reliability (ICC = 0.75–0.77)

▪ For both language versions, the FBSI demonstrated known-group validity and convergent and divergent validity

▪ The English version was responsive to changes in performance status

▪ The Chinese version was responsive to decline in performance status, but there was no conclusive evidence about its responsiveness to improvement in performance status

▪ No practically significant difference was found in the outcomes between the two language versions despite a minor difference in one item

▪ The FBSI performed comparably with the FACT-B

NFBSI-16

 Garcia et al. [33]

▪ Results provide preliminary support for internal consistency reliability (0.87) of the NFBSI-16

▪ Breast cancer–related symptoms and concerns endorsed as high priority by both oncology patients and clinicians were selected for inclusion in the new NFBSI-16, which includes all 8 items from the original FBSI and 8 additional items from FACT-B measures

▪ The NFBSI-16 is formatted by subscale: Disease-Related Symptom, Treatment Side Effect, and General Function and Well-Being

▪ Validity was evidenced by moderate-to-strong relationships with expected criteria

 Krohe et al. [34]

▪ NR

▪ All patients for whom data were available demonstrated understanding of the instructions and the recall period of the NFBSI-16 (n = 14/14, 100.0%) and the PROMIS (n = 14/14, 100.0%).

▪ > 90% of patients demonstrated understanding of each of the items in the NFBSI-16 and the PROMIS.

▪ > 70% of patients demonstrated understanding of the response options of the NFBSI-16, > 90% understood response options of PROMIS items 1–6, and ≥ 50% understood response options of PROMIS items 7–10.

▪ Conceptual relevance was supported for most items in both questionnaires based on patients’ reports of experiencing the concepts as part of their breast cancer experience.

YW-BCI36

 Christophe et al. [35]

▪ Internal consistency (Cronbach alpha values ranging from 0.76 to 0.91)

▪ Temporal reliability (Bravais-Pearson correlations ranging from 0.66 to 0.85)

▪ As expected, there were quite strong correlations between the Young Women With Breast Cancer Inventory and the QLQ-C30 and QLQ-BR23 scores (r ranging from 0.20 to − 0.66), indicating adequate concurrent validity

Breast Cancer Symptom Scale

 Horigan et al. [36]

▪ NR

▪ The 9 highest ranked items include: good QoL, maintaining independence, able to sleep, able to concentrate, perform normal activities, being fatigued, having depression, being anxious, and having pain.

▪ The 5 lowest ranked items include: appetite, breast-specific issues, hot flashes, and sexuality.

▪ Ratings by breast cancer subset (newly diagnosed, on treatment, no evidence of disease, hormonal or nonhormonal treatment, metastatic disease, survivors) showed some differences compared with the whole group.

QLICP-BR

 Wan et al. [37]

▪ Test-retest reliability coefficients for the overall scale and 5 domains are all > 0.75 (overall scale: 0.88)

▪ Internal consistency alpha for each domain is > 0.65, except social domain (0.58)

▪ Most correlation coefficients between each item and its domain are > 0.60.

▪ Overall the correlations between the same and similar domains (between QLICP-BR and QLQ-C30 and QLQ-BR23) are higher than those between different and nonsimilar domains.

▪ The score differences between pretreatment and posttreatment for overall scale, general module, physical domain, psychological domain, and social domain have statistical significance.

QuEST-Br

 Harley et al. [38]

▪ Internal consistency was high for all subscales (Cronbach alpha: range, 0.81–0.93)

- Strenuous activities: 0.87

- Everyday tasks: 0.83

- Pain: 0.89

- Fatigue: 0.93

- Impact on activities: 0.88

- MHI-5: 0.81

- EORTC emotion function: 0.88

­ Body image: 0.92

▪ Item-convergent validity (item to own scale; correlation corrected for overlap)

- Strenuous activities: 0.66–0.76

- Everyday tasks: 0.55–0.72

- Pain: 0.80

- Fatigue: 0.81–0.86

- Impact on activities: 0.67–0.80

- MHI-5: 0.43–0.67

- EORTC emotion function: 0.69–0.78

- Body image: 0.81–0.86

INA-BCHRQoL

 Saptaningsih et al. [39]

▪ Cronbach alpha values for the physical, cognitive, social, and spiritual domains were higher than 0.8, and the corrected item-total correlation was also higher than 0.3

▪ Each domain of the questionnaire was not influenced by the treatment options.

▪ 24 patients with early-stage breast cancer (10 receiving FAC-based chemotherapy and 14 receiving taxane-based chemotherapy) were enrolled in the main study, and the score of HRQoL obtained from INA-BCHRQoL was considerably high.

Unnamed

 Deshpande et al. [7]

▪ Cronbach alpha value for the questionnaire was 0.93

▪ Patients understood the questionnaire and found the items to be relevant, indicating content validity.

▪ The statistical assessment of the scores did not show an association between scores and age or stage of breast cancer, as the sample size was small.

 Vanlemmens et al. [9]

▪ Participants reported on 8 dimensions of their quality of life during treatment and follow-up: psychological, physical, family, social, couple, sexuality, domestic, professional, economic

▪ Very few differences were found between the 4 groups (chemotherapy, Herceptin, hormonotherapy, or follow-up) except that patients receiving chemotherapy and patients receiving Herceptin referred more to physical dimension than the group under follow-up

 Vanlemmens et al. [8]

▪ Internal consistency (Cronbach alpha) ranged from 0.76 to 0.91

▪ Test-retest ICC ranged from 0.662 to 0.855

▪ As expected, convergent validity showed strong correlations with quality of life measures (EORTC QLQ-C30).

Abbreviations: ECOG, Eastern Cooperative Oncology Group; EORTC, European Organization for Research and Treatment of Cancer; EQ-5D-3L, EuroQoL 3-level 5-dimension; EQ-5D-5L, EuroQoL 5-level 5-dimension; FAC, fluorouracil, doxorubicin, and cyclophosphamide; FACT-B, Functional Assessment of Cancer Therapy-Breast; FBSI, Functional Assessment of Cancer Therapy-Breast Symptom Index; HRQoL, health-related quality of life; ICC, intraclass correlation coefficient; INA-BCHRQoL, Indonesian Breast Cancer Health-Related Quality of Life; MHI-5, Mental Health Inventory-5; NFBSI-16, National Comprehensive Cancer Network-Functional Assessment of Cancer Therapy-Breast Cancer Symptom Index-16; NR, not reported; PROMIS, Patient-Reported Outcomes Measurement Information System; QLICP-BR, Quality of Life Instruments for Cancer Patients-Breast Cancer; QLQ-BR23, Breast Cancer–Specific Quality of Life Questionnaire-23 item; QLQ-C30, Quality of Life Questionnaire, Version 3.0; QoL, quality of life; QuEST-Br, QuEST Breast Cancer Questionnaire; SF-36, Short Form Health Survey
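
Many entries in the table above report Cronbach alpha against a 0.7 threshold and item-to-own-scale correlations (corrected for overlap) against a 0.4 threshold. The following is a minimal sketch of how these two statistics are commonly computed; it is not taken from any of the cited studies. The function names and the 3-item, 5-respondent score matrix are hypothetical and invented purely for illustration.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach alpha for one scale.

    items: list of item columns, each a list with one score per respondent.
    """
    k = len(items)
    # Total scale score per respondent (sum across items).
    totals = [sum(row) for row in zip(*items)]
    sum_item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_scale_correlations(items):
    """Correlate each item with the sum of the other items in its scale.

    Leaving the item out of the scale total is the "correction for overlap".
    """
    result = []
    for i, col in enumerate(items):
        rest = [sum(s for j, s in enumerate(row) if j != i)
                for row in zip(*items)]
        result.append(pearson(col, rest))
    return result


# Hypothetical scores: 3 items (rows) answered by 5 respondents (columns).
scale = [
    [1, 2, 3, 4, 4],
    [2, 2, 3, 4, 3],
    [1, 3, 3, 3, 4],
]

alpha = cronbach_alpha(scale)
item_r = corrected_item_scale_correlations(scale)

print(f"Cronbach alpha = {alpha:.2f}; meets > 0.7 criterion: {alpha > 0.7}")
for idx, r in enumerate(item_r, start=1):
    print(f"item {idx}: corrected r = {r:.2f}; meets > 0.4 criterion: {r > 0.4}")
```

The sketch uses Pearson correlations for simplicity; some of the studies in the table report Spearman correlations instead, which would require ranking the scores first.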